disorderedmaterials / dissolve

Structure refinement software for total scattering data
GNU General Public License v3.0

Standard file format for input file #899

Closed rprospero closed 2 years ago

rprospero commented 2 years ago

The input file is currently a bespoke file format that requires a custom parser. Using an existing file format (e.g. JSON, TOML) would remove a large amount of boilerplate code and reduce our maintenance burden. Additionally, it would allow users to tap into an existing tooling ecosystem for creating, inspecting, and modifying the input files.

Whichever format is chosen for the input file should also be the format used for the restart file in #832.
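
To illustrate the tooling point, here is a minimal sketch of what inspecting and modifying an input file could look like once it is in a standard format. It uses JSON only because Python ships a parser for it; the same idea applies to TOML. The block and key names (`species`, `atoms`, `pairPotentials`, `range`) are made up for the example and are not Dissolve's actual schema.

```python
import json

# Hypothetical input: the block and key names below are illustrative only
# and are not Dissolve's actual schema.
INPUT_TEXT = """
{
  "species": {
    "Water": {
      "atoms": [
        {"element": "O", "charge": -0.82},
        {"element": "H", "charge": 0.41},
        {"element": "H", "charge": 0.41}
      ]
    }
  },
  "pairPotentials": {"range": 15.0}
}
"""

# Parsing needs no custom code: the standard library already provides it.
config = json.loads(INPUT_TEXT)

# Inspect: total charge on each species.
total_charges = {
    name: sum(atom["charge"] for atom in species["atoms"])
    for name, species in config["species"].items()
}

# Modify a value and serialise the result back to text.
config["pairPotentials"]["range"] = 12.0
updated_text = json.dumps(config, indent=2)
```

A user could do the same kind of check or bulk edit from a script, an editor plugin, or a command-line tool without touching our parser.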

IVlaD17 commented 2 years ago

List of file formats to look at:

After looking at all of those options, we have decided to go with TOML as the file format and toml11 as the library we'll be using.

rprospero commented 2 years ago

Just as a lark, we could look at Dhall. It does have some advantages compared to the other formats and it can be trivially converted down to any of the other formats. On the other hand, I can't imagine that it has good library support.

IVlaD17 commented 2 years ago

Ideally, we'd like to not have to add another dependency in the shape of a parser, but if we have to, we should aim to get something widely used, well-supported and robust.

rprospero commented 2 years ago

A good set of criteria to look at when evaluating the formats:

Library Support

Hand Editing

Tooling Support

Future Proofing

Performance

IVlaD17 commented 2 years ago

XML

Library Support

There are multiple libraries for XML, but since we're already using pugixml there is not much point in considering others: integrating a new one would take more time, which would negate the advantage of already having a library in place.

There is one other library that should be mentioned: RapidXML.

IVlaD17 commented 2 years ago

JSON

JSON doesn't look friendly to read or edit by hand, so we'll be focusing on XML, YAML and TOML.

IVlaD17 commented 2 years ago

YAML

Library Support

There seem to be only four YAML libraries available for parsing, and again these can be found at yaml.org. There's no clear-cut answer as to which is best, so some experimentation might be in order.

IVlaD17 commented 2 years ago

TOML

Library Support

For TOML, the only fully featured library I could find while investigating is toml++.

Hand Editing

I'm going to link this opinion piece on why its author thinks TOML is not a solid choice for what we're doing, because he also discusses readability and maintainability from the perspective of both a developer and a user. He is trying to promote his own StrictYAML format instead, so take everything in it with a grain of salt.

IVlaD17 commented 2 years ago

Dhall

It appears that the links at How to integrate Dhall are broken. I personally consider this a bad sign for anyone trying to integrate it, so I won't investigate it any further and will focus on the other formats instead, unless there's a specific desire to look into it.

rprospero commented 2 years ago

The following system test files can be run while only parsing the Species, Master, and PairPotential blocks: