eayus / sirdi

Package manager for Idris
GNU General Public License v3.0
35 stars 11 forks source link

Configuration File Format #3

Closed eayus closed 2 years ago

eayus commented 2 years ago

Currently we are using JSON for config files. However I don't think this is ideal as it is not the nicest format for humans (having to quote keys, no comments, etc.).

There are a few alternative options:

I'd say the biggest requirement on the format is that it should be easy to use for an inexperienced user. This means high quality error messages, and a simple to write format. The more complex the format, the better error messages needed to compensate.

Interested to hear others' thoughts on this.

alexhumphreys commented 2 years ago

Hi! As the author of I'm guessing the idris dhall implementation you're talking about, needless to say I'd vote dhall. But I do agree with your points about dhall.

It is pretty verbose, especially in cases like:

So that's tough to change, but error messages are easier to improve. I've did some work over the summer to get source file positioning for errors, so you should get squiggly ^^ markers under the term that's faulty. But the text of the errors is still pretty shoddy. The dhall-Haskell implementation actually has some pretty good error messages. So it could be a matter of copying them from there to the idris implementation here, though there would be a bit of work aligning the two error types.

Parse errors still ain't great unfortunately, I'm not a pro at using Text.Parser.

So yeah, would need some work before it's ready for prime time, and worth noting it's also not standards compliant. But it's a fun little language, a fun compiler to work on. It also uses many similar techniques to Idris2 itself, so shouldn't be too tough for others to help out on should they so wish, as I've been a bit slow recently on working on it.

i ain't much of a toml fan, but there is another implementation here that should be relatively close to compiling if your looking for one to play with: https://github.com/idris-community/inigo/tree/main/Base/Toml. I hear it also ain't standards compliant, but then the toml standard is a lot less involved than the dhall one.

eayus commented 2 years ago

It's good to know that the error message can be improved for Dhall. I actually really like the idea of Dhall, and I think some of its features could be very useful for package files. For example,

However, since this package manager is becoming "official", I'm very worried about usability for new users, and I feel like type annotations might confuse people (especially those who are new to Idris). Probably the best way to offset this kind of problem is by making sure that the "new project" command generates a complete config file with every field and it's type annotation.

It'd be interesting to hear others' thoughts on this too. With a decision like this, it's very difficult to go back and change to a non-backwards compatible format, and so it's nice to make the right decision in the first place.

JonathanLorimer commented 2 years ago

I personally like dhall. Can you not generate a type definition for the entire config schema when some runs sirdi new? Also I think its fair to assume some familiarity or interest in learning type systems, even from someone brand new to idris.

eayus commented 2 years ago

Can you not generate a type definition for the entire config schema when some runs sirdi new?

Yep this is definitely possible, and is probably the most robust solution if we were to choose Dhall.

It's looking like lots of people are enthusiastic about Dhall, so I'm swaying in that direction. We would definitely need to put in effort on improving parser error messages, but as long as people are willing to work on fixing that then hopefully that won't be an issue for long.

Kiiyya commented 2 years ago

Simplicity is key, I think TOML (or some other common markup language, YAML?) would be a better choice. But I may be biased, having been spoiled by Rust's cargo.

Either way, that's just the markup language. I think we should also consider the semantics of it. And again we should value simplicity. I would like to see the modules and main fields gone and using a "sensible defaults" and good conventions approach, similar to how Cargo.toml is structured.

eayus commented 2 years ago

cargo is definitely the gold standard for package managers. I keep flip-flopping back and forth on config formats since there are advantages for both approaches - it might just come down to whichever has the better library.

@Kiiyya for the modules and main fields, I'm not quite sure its a good idea to remove them. It'd be reasonable to make the main field optional and give it a default of a module called Main, but it would have to be overridable so existing projects can easily port to sirdi. For the modules field, I don't see a way to avoid it if you are writing a library - Idris doesn't provide any mechanism in the language to specify what modules should be exposed outside that package.

claymager commented 2 years ago

Sensible defaults is a good idea; removing them entirely is a harder sell. There are currently a number of projects that have multiple ipkgs for the same source, such as Idris itself.

I like dhall because it (in theory) allows us to be precise about the interface, but very clean parser code and good documentation probably accomplish the same goal.

I've tried using TOML for describing packages like idris2-jupyter-vega, which requires libraries which require a nonstandard executable and command, and it basically turns into Nix with a different syntax. I don't know if we'd even want to support something like that, but at least we can do some reasoning with dhall.

dagit commented 2 years ago

I think there's some different dimensions to consider in weighing options there. Let me throw out a few that come to mind.

I'm not personally a huge fan of toml but I do think it sits at a pragmatic happy medium for a lot of these dimensions in terms of trade-offs. I'm a bit concerned about the choice of dhall for the following reasons:

I'm partial to a bespoke notation, but when I try to factor out my personal biases, I think toml is probably the right choice for sirdi. It's certainly not a risky choice. I think bespoke and dhall have the highest potential to be interesting choices but also come with the most risk. As for json and yaml, I would say please use toml instead.

eayus commented 2 years ago

The TOML library (https://github.com/cuddlefishie/toml-idr) has been updated to work with the latest Idris2 version, so I'm going to have a play with that

eayus commented 2 years ago

Ok I think I've decided on TOML. I don't like to make "executive decisions" when the agreement is not unanimous, but I think this issue has been open long enough, and not having a concrete configuration file format is the main barrier to more serious adoption of sirdi.

My reasons are:

sullyj3 commented 2 years ago

I like dhall, but I agree that it's important to use something ubiquitous and simple both for ease of beginners and machine processing. I agree with the choice of TOML.

It seems to me like people who want to use dhall should be able to use it as an extra layer to generate the TOML?

mattpolzin commented 2 years ago

Is this being actively worked on by anyone? Not sure I'll have time but if no one else is working on it anyway then it won't hurt if I tentatively plan to.

eayus commented 2 years ago

@mattpolzin I've been messing around with re-structuring sirdi, and as part of that I have written a TOML configuration file parser. I'll push the changes soon

eayus commented 2 years ago

I've pushed the WIP changes, including the TOML config parser, to therefactor branch. There is lots of intentional behavioural changes to sirdi, as well as unintentional regressed behaviour, but I felt the project could do with a decent rewrite as 90% of the code base was just hacked together in the developer meeting.

eayus commented 2 years ago

Closing this issue since once the refactor branch is merged we will be using TOML.