haskell / ghcup-hs

https://www.haskell.org/ghcup/
GNU Lesser General Public License v3.0
267 stars 80 forks source link

Use Dhall for configuration #60

Open hasufell opened 3 years ago

hasufell commented 3 years ago

In GitLab by @hekkaidekapus on Sep 1, 2020, 09:07

cc @amesgen

[Edit, 19 Oct. 20] The original ticket referenced a snippet which is now bundled in a tarball in a comment down in the discussion thread.

hasufell commented 3 years ago

In GitLab by @amesgen on Sep 3, 2020, 08:40

Maybe you already discussed this, but it is not clear to me if you suggest to

  1. use the dhall Haskell bindings to parse the .dhall file at runtime, or
  2. use dhall-to-{json,yaml} to create the JSON/YAML file, with no code changes to ghcup.

Some remarks:

I am in favour of option 2. WDYT?


I don't really understand the second point in your outline:

There should be minor differences between the Haskell data types and the Dhall expression. Preferably, on one hand the Dhall expression’s type should feature in haddocks on the Haskell side and, on the other hand, the Haskell data types should be written down in the main .dhall file. That way, both sides would be reminded to never let the types evolve in opposing directions.

Why not just check (e.g. in CI) that the ghcup-0.0.x.{dhall,json,yaml} file can be parsed by ghcup? Copy-pasting the types everywhere seems very convoluted to me. Maybe I am misunderstanding something.

hasufell commented 3 years ago

In GitLab by @maerwald on Sep 3, 2020, 14:02

http-client-tls is a no-go for this codebase:

Wrt generating yaml... I can just use plain haskell to do so and in fact that's what I did before. I removed that step, because it didn't add much value, but more maintenance burden and ppl were complaining how it's overcomplicated.

hasufell commented 3 years ago

In GitLab by @amesgen on Sep 3, 2020, 20:55

http-client-tls is a no-go for this codebase:

http-client-tls is only used for remote import resolution (which I think nobody is arguing to use for this use case), and it is only used by default. My point was mostly about compile time and binary size, as I am not sure if e.g. -split-sections is able to remove the transitive http-client-tls ecosystem. See this comment.

I can just use plain haskell to do so and in fact that's what I did before. I removed that step, because it didn't add much value, but more maintenance burden and ppl were complaining how it's overcomplicated.

Maybe I misunderstood something, but I thought that you were in favour of switching to Dhall in the form outlined above by @hekkaidekapus (I am not really interested in discussing this). Whether we parse the Dhall directly or first convert it into YAML appears to be an unrelated implementation decision to me.

hasufell commented 3 years ago

In GitLab by @amesgen on Sep 3, 2020, 21:27

Sorry, my point about compile time/binary size issues with http-client-tls does not apply, as one can disable the http-client-tls dependency with the with-http flag.

hasufell commented 3 years ago

In GitLab by @hekkaidekapus on Sep 4, 2020, 02:05

@amesgen, thank you for joining us.

Under option (1), ghcup-hs/lib/GHCup/Types.hs will change so that the types are defined strictly from the Dhall expression. This will induce a refactor elsewhere in the codebase because there would be no need to deal with JSON and YAML anymore. See (i) for an example of what that means tangibly. Under option (2), @amesgen is proposing to keep the internals unchanged; the Dhall expression will be converted to YAML on the command line and the generated YAML would be compatible with the current codebase.

I think @maerwald views option (2) as moot because there is already code somewhere to automatically get YAML and introducing Dhall to do exactly the same would not present much upside. (1) surely requires more effort than (2) but it seems to me we should give it a chance if we really think it is worth using Dhall. All in all, I suspect that getting rid of the JSON & YAML in the codebase will be a huge simplification.

@amesgen says:

I don't really understand the second point in your outline: [my comments about writing down types in haddocks]

That was an attempt to accommodate @maerwald concerns regarding impedance mismatch between the Haskell and Dhall sides. Since we might toss out (2), those comments are moot. With (1), the types are guaranteed to be the same and as soon as something changes in the Dhall expression, the Haskell code will keep failing at compilation time until the modifications are accounted for on the Haskell side.

@maerwald, could you please try the sample-1 package and tell us what you think. You will see there is no TemplateHaskell involved, and the http-client-tls concerns have been addressed by @amesgen.

(i) Extracting the configuration from Dhall: sample-1.tar.gz

hasufell commented 3 years ago

In GitLab by @amesgen on Sep 4, 2020, 03:41

I think @maerwald views option (2) as moot because there is already code somewhere to automatically get YAML and introducing Dhall to do exactly the same would not present much upside.

Parsing Dhall directly vs. using JSON/YAML as an intermediary format is completely orthogonal to whether we write the metadata file in Dhall or in JSON/YAML. The entire point of using Dhall is that it would enable use to do things that YAML cannot do (e.g. functions for less repetition and easier extensibility).

The entire point of e.g. dhall-kubernetes is to use Dhall to generate a JSON/YAML file as Dhall is more pleasant to use.

(1) surely requires more effort than (2) but it seems to me we should give it a chance if we really think it is worth using Dhall.

Why? I don't understand why the choice to use Dhall (which I support) results in a bias towards 1.

All in all, I suspect that getting rid of the JSON & YAML in the codebase will be a huge simplification.

Why? What concrete simplifications do you have in mind? The serialization boilerplate will still have to exist in some way, just with FromDhall/ToDhall (or using Decoder directly, which is besides the point) instead of FromJSON/ToJSON.


@hekkaidekapus Your code example makes a big mistake IMHO:

data Config = MkConf
    { cabalInstall2410      :: Bundle
    , cabalInstall3000      :: Bundle
    , cabalInstall3200      :: Bundle
    , cabalInstall3400_rc1  :: Bundle
    , ghc7103               :: Bundle
    , ghc802                :: Bundle
-- omitted for brevity
    , ghc884                :: Bundle
    , ghc8101               :: Bundle
    , ghc8102               :: Bundle
    , ghc901_alpha_1        :: Bundle
    , ghcupLatest           :: Bundle
    , ghcRequirements       :: Map Text Requirement
    } deriving stock Show

cfgDecoder :: Decoder Config
cfgDecoder = record ( MkConf
    <$> field "cabalInstall2410" toolReleaseDecoder
    <*> field "cabalInstall3000" toolReleaseDecoder
    <*> field "cabalInstall3200" toolReleaseDecoder
    <*> field "cabalInstall3400_rc1" toolReleaseDecoder
    <*> field "ghc7103" toolReleaseDecoder
    <*> field "ghc802" toolReleaseDecoder
-- omitted for brevity
    <*> field "ghc884" toolReleaseDecoder
    <*> field "ghc8101" toolReleaseDecoder
    <*> field "ghc8102" toolReleaseDecoder
    <*> field "ghc901_alpha_1" toolReleaseDecoder
    <*> field "ghcupLatest" toolReleaseDecoder
    <*> field "toolRequirements" (map strictText toolRequirementDecoder) )

Apart from being boilerplate par excellence, with this code, the Haskell parsing code has to be updated every time a new GHC/Cabal version is added! I think it is better to parse the config into several Map Versioning Stuff.

hasufell commented 3 years ago

In GitLab by @hekkaidekapus on Sep 4, 2020, 16:01

I get your point about Dhall as a metadata. But as far as I can tell, once ghcup has parsed the configuration file, it does not really need the YAML/JSON any more. Surely, there are things like parsing arguments related to a custom bindist—the current syntax for that is JSON—and that case seems to have no relation with the configuration. Put it differently, once the YAML/JSON/Dhall data have been represented in Haskell, when would ghcup need to deal with those formats again?

As of the boilerplate, keep in mind that @maerwald is a Dhall sceptic 😉 Both my Dhall and Haskell samples constitute rather a proof-of-concept than anything else. You have certainly noticed that I mixed various programming styles in the Dhall expression; there is even a mix of indentation styles! And as can be observed in the excerpt above, I mixed plain records with maps. All of that variety is intended to get the ghcup’s maintainer accustomed to facilities offered by Dhall. In the long run, maerwald’s opinion matters the most, and it is better to lay available possibilities out if opinions are to be well informed.

So, you would rather use only maps; great! So would I. As I wrote elsewhere, feel free to make any changes in the design, I might have got the ball rolling on this (an inaccurate statement given your earlier attempt), but I really hope you will see to it that the whole thing results in a proper MR ready to be merged.

@maerwald, what do you think so far? A quick summary of where we are for now:

i) You wanted to avoid TemplateHaskell. As evidenced by the POC, that is feasible.

ii) http-client-tls is to be avoided. Fine.

iii) Types in Dhall and in Haskell should be kept in sync. Doable, as seen in the POC.

iv) In @amesgen’s latest post, some design preferences were given, along with the rationale behind them. How would they fair with your future plan for this project?

hasufell commented 3 years ago

In GitLab by @maerwald on Sep 13, 2020, 05:58

So, you would rather use only maps; great! So would I. As I wrote elsewhere, feel free to make any changes in the design, I might have got the ball rolling on this (an inaccurate statement given your earlier attempt), but I really hope you will see to it that the whole thing results in a proper MR ready to be merged.

I don't intend to work on dhall any time soon. The advantages of using Dhall to generate yaml are negligible to me. I can use haskell for that and did that for quite some time in ghcup.

Consuming Dhall directly seems more interesting. The problem with that is this though: I'll have to drop the yaml file (this would require ghcup versions still using yaml to be updated before they receive new GHC versions... this already happened for the json->yaml conversion and it has confused users before) or I have to maintain the yaml file for the foreseeable future as well (this isn't much work, but now it's an extra step again, which I just tried to avoid).

I'm still not opposed to trying that out... but only if someone really puts all the work into it (like... a PR that actually works).

hasufell commented 3 years ago

In GitLab by @amesgen on Sep 13, 2020, 06:47

I don't intend to work on dhall any time soon. The advantages of using Dhall to generate yaml are negligible to me. I can use haskell for that and did that for quite some time in ghcup.

I am repeating myself here, but I sincerely don't understand why one can be interested in using Dhall "directly", but not in using Dhall to generate an intermediary format, as I tried to substantiate above. I don't see any advantages of parsing the Dhall directly in contrast to consuming the intermediary format generated by Dhall, only downsides (like a big new dependency).

I am personally not interested in implementing direct Dhall support, as I consider it unnecessary, as using Dhall to generate JSON/YAML has all the benefits (in particular it is perfectly backwards compatible in the sense you mentioned) with none of the work IMHO.


Addendum to the http-client-tls issue: I added a flag to Dhall 1.35 to disable http-client-tls while still retaining HTTP import support.

hasufell commented 3 years ago

In GitLab by @maerwald on Sep 13, 2020, 19:39

I am repeating myself here, but I sincerely don't understand why one can be interested in using Dhall "directly"

Because I can just restore the haskell code that generated the json/yaml and tweak it. That means one tool less to learn for a contributor (I can safely assume they know haskell) and I have all what dhall offers and even more expressivity. What's the point?

hasufell commented 3 years ago

In GitLab by @amesgen on Sep 14, 2020, 02:46

Well, I actually brought up tweaking the Haskell code. I just don't understand why you are interested in parsing Dhall directly, but not in using it to generate an intermediary format. In both cases, the Dhall file looks exactly the same. I can just repeat myself (also see this comment above):

I don't see any advantages of parsing the Dhall directly in contrast to consuming the intermediary format generated by Dhall, only downsides (like a big new dependency).

Note that I don't have horse in this race, I only participate here as @hekkaidekapus mentioned me. Personally, I find the current YAML situation quite acceptable (though I still like Dhall more), as I mentioned previously.

hasufell commented 3 years ago

In GitLab by @hekkaidekapus on Sep 14, 2020, 15:39

I think we now have a clear path forward. The fact that there are seemingly incompatible views for the future does not mean we have to ignore the common ground we reached for the present. The way forward I see is letting @amesgen open an MR that will get thorough review. The MR will be a first iteration upon which a compatibility layer could be designed for previous ghcup releases, should later release introduce breaking changes.

At the time being, the MR will perhaps seem redundant but it will be a stepping stone for future parties interested in going full Dhall. Furthermore, it will give @maerwald a feel of Dhall with an easy way to fall back to the current implementation if the new code is deemed going awry.

In short, let adopt an incremental approach. There is no need to produce a polished product when we are experimenting.

hasufell commented 3 years ago

In GitLab by @maerwald on Sep 16, 2020, 03:18

I don't have a very strong opinion, because I think all of this is really minor and can by changed around as we please. But someone else will have to do the work and it doesn't appear that someone is particularly committed to that right now.

hasufell commented 3 years ago

In GitLab by @hekkaidekapus on Sep 16, 2020, 21:58

@amesgen, the latest reply is the greenest light you will get, I think.

hasufell commented 3 years ago

In GitLab by @hekkaidekapus on Oct 19, 2020, 17:38

I’m closing this since I don’t plan implementing the consensual design any time soon.

Thank you both for the discussion.

hasufell commented 3 years ago

In GitLab by @hekkaidekapus on Oct 19, 2020, 17:38

closed

hasufell commented 3 years ago

In GitLab by @maerwald on Jun 13, 2021, 05:37

I've re-opened this... maybe someone will take a hack at this during ZuriHac?

hasufell commented 1 year ago

Link to the patch: https://gitlab.haskell.org/haskell/ghcup-hs/uploads/f117d796a1f8f0e95f545ca454ba8de1/sample-1.tar.gz