Closed eayus closed 2 years ago
Hi! As the author of I'm guessing the idris dhall implementation you're talking about, needless to say I'd vote dhall. But I do agree with your points about dhall.
It is pretty verbose, especially in cases like:
So that's tough to change, but error messages are easier to improve. I've did some work over the summer to get source file positioning for errors, so you should get squiggly ^^
markers under the term that's faulty. But the text of the errors is still pretty shoddy. The dhall-Haskell implementation actually has some pretty good error messages. So it could be a matter of copying them from there to the idris implementation here, though there would be a bit of work aligning the two error types.
Parse errors still ain't great unfortunately, I'm not a pro at using Text.Parser
.
So yeah, would need some work before it's ready for prime time, and worth noting it's also not standards compliant. But it's a fun little language, a fun compiler to work on. It also uses many similar techniques to Idris2 itself, so shouldn't be too tough for others to help out on should they so wish, as I've been a bit slow recently on working on it.
i ain't much of a toml fan, but there is another implementation here that should be relatively close to compiling if your looking for one to play with: https://github.com/idris-community/inigo/tree/main/Base/Toml. I hear it also ain't standards compliant, but then the toml standard is a lot less involved than the dhall one.
It's good to know that the error message can be improved for Dhall. I actually really like the idea of Dhall, and I think some of its features could be very useful for package files. For example,
However, since this package manager is becoming "official", I'm very worried about usability for new users, and I feel like type annotations might confuse people (especially those who are new to Idris). Probably the best way to offset this kind of problem is by making sure that the "new project" command generates a complete config file with every field and it's type annotation.
It'd be interesting to hear others' thoughts on this too. With a decision like this, it's very difficult to go back and change to a non-backwards compatible format, and so it's nice to make the right decision in the first place.
I personally like dhall. Can you not generate a type definition for the entire config schema when some runs sirdi new
? Also I think its fair to assume some familiarity or interest in learning type systems, even from someone brand new to idris.
Can you not generate a type definition for the entire config schema when some runs
sirdi new
?
Yep this is definitely possible, and is probably the most robust solution if we were to choose Dhall.
It's looking like lots of people are enthusiastic about Dhall, so I'm swaying in that direction. We would definitely need to put in effort on improving parser error messages, but as long as people are willing to work on fixing that then hopefully that won't be an issue for long.
Simplicity is key, I think TOML (or some other common markup language, YAML?) would be a better choice. But I may be biased, having been spoiled by Rust's cargo
.
Either way, that's just the markup language. I think we should also consider the semantics of it. And again we should value simplicity. I would like to see the modules
and main
fields gone and using a "sensible defaults" and good conventions approach, similar to how Cargo.toml
is structured.
cargo
is definitely the gold standard for package managers. I keep flip-flopping back and forth on config formats since there are advantages for both approaches - it might just come down to whichever has the better library.
@Kiiyya for the modules
and main
fields, I'm not quite sure its a good idea to remove them. It'd be reasonable to make the main
field optional and give it a default of a module called Main
, but it would have to be overridable so existing projects can easily port to sirdi
. For the modules
field, I don't see a way to avoid it if you are writing a library - Idris doesn't provide any mechanism in the language to specify what modules should be exposed outside that package.
Sensible defaults is a good idea; removing them entirely is a harder sell. There are currently a number of projects that have multiple ipkgs for the same source, such as Idris itself.
I like dhall because it (in theory) allows us to be precise about the interface, but very clean parser code and good documentation probably accomplish the same goal.
I've tried using TOML for describing packages like idris2-jupyter-vega, which requires libraries which require a nonstandard executable and command, and it basically turns into Nix with a different syntax. I don't know if we'd even want to support something like that, but at least we can do some reasoning with dhall.
I think there's some different dimensions to consider in weighing options there. Let me throw out a few that come to mind.
I'm not personally a huge fan of toml but I do think it sits at a pragmatic happy medium for a lot of these dimensions in terms of trade-offs. I'm a bit concerned about the choice of dhall for the following reasons:
I'm partial to a bespoke notation, but when I try to factor out my personal biases, I think toml is probably the right choice for sirdi. It's certainly not a risky choice. I think bespoke and dhall have the highest potential to be interesting choices but also come with the most risk. As for json and yaml, I would say please use toml instead.
The TOML library (https://github.com/cuddlefishie/toml-idr) has been updated to work with the latest Idris2 version, so I'm going to have a play with that
Ok I think I've decided on TOML
. I don't like to make "executive decisions" when the agreement is not unanimous, but I think this issue has been open long enough, and not having a concrete configuration file format is the main barrier to more serious adoption of sirdi
.
My reasons are:
TOML
seems to be the least controversial option. It might not be everyone's favourite, but it at least seems most people are fine with it. It seems to be a happy medium.dhall
and bespoke formats can be exciting, but they are inherently more complex.TOML
is very accessible for new users. It is already used for the same purpose in rust
's cargo
, and uses a common .ini
-like syntax that many people will already be familiar with. It also has better editor support by default (for example, my default vim
installation already supports TOML
syntax highlighting).I like dhall
, but I agree that it's important to use something ubiquitous and simple both for ease of beginners and machine processing. I agree with the choice of TOML
.
It seems to me like people who want to use dhall
should be able to use it as an extra layer to generate the TOML
?
Is this being actively worked on by anyone? Not sure I'll have time but if no one else is working on it anyway then it won't hurt if I tentatively plan to.
@mattpolzin I've been messing around with re-structuring sirdi, and as part of that I have written a TOML configuration file parser. I'll push the changes soon
I've pushed the WIP changes, including the TOML config parser, to therefactor
branch. There is lots of intentional behavioural changes to sirdi, as well as unintentional regressed behaviour, but I felt the project could do with a decent rewrite as 90% of the code base was just hacked together in the developer meeting.
Closing this issue since once the refactor
branch is merged we will be using TOML.
Currently we are using JSON for config files. However I don't think this is ideal as it is not the nicest format for humans (having to quote keys, no comments, etc.).
There are a few alternative options:
dhall
- I like the idea, however in practice it seems to be quite verbose. For example, you need to specify a type annotation if you use an empty list. Furthermore, when I tried the Idris implementation, the error messages were very confusing.toml
- Probably my favourite language design-wise. However, the Idris implemetnation doesn't seem to be compiling atm (https://github.com/cuddlefishie/toml-idr). Also I have no idea how robust the implementation is, as I've been unable to test it.I'd say the biggest requirement on the format is that it should be easy to use for an inexperienced user. This means high quality error messages, and a simple to write format. The more complex the format, the better error messages needed to compensate.
Interested to hear others' thoughts on this.