Open eimrek opened 8 months ago
Would you still want people to provide an optimade.yaml to trigger the scraper at your end? Otherwise you have to rely on file extensions and the file header, which might give many false positives (we could also extend the yaml file to support/require the new database description field to be served in the OPTIMADE metadata).
Could be that we accept either optimade.yaml or optimade.jsonl, but I agree that a single file would be preferable. But I think it's good to design optimake in a way that makes the most sense by itself.
E.g. the example
config_version: 0.1.0
database_description: >-
  This database contains some example CIFs.
entries:
jsonl_path: example.jsonl
I think currently it doesn't do anything, right? And in the future, would it just validate the file? But maybe this makes sense, open to discuss further.
Coming back to this (perhaps it can be closed in the original context), I would quite like optimake serve . (or optimake serve optimade.jsonl) to work without needing a config file, even if it throws errors about validation of the file.
Yep, I agree that optimade.jsonl doesn't need to have the config file, and we could make it optional in this case.
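To make the "config is optional" idea concrete, here is a minimal sketch of how a serve command might decide what to load. It is not optimake's actual code: the function name `resolve_serve_target` and the fallback order are assumptions for illustration; only the file names optimade.jsonl and optimade.yaml come from this thread.

```python
from pathlib import Path


def resolve_serve_target(target):
    """Hypothetical helper: return (jsonl_path, config_path) for a serve command.

    Either element may be None. A JSONL file is accepted without a config.
    """
    target = Path(target)

    # Case 1: the user pointed directly at a JSONL file, so no config is needed.
    if target.is_file():
        if target.suffix == ".jsonl":
            return target, None
        raise ValueError(f"Expected a .jsonl file or a directory, got {target}")

    # Case 2: the user pointed at a directory; look for the conventional names.
    jsonl = target / "optimade.jsonl"
    config = target / "optimade.yaml"
    jsonl_path = jsonl if jsonl.is_file() else None
    config_path = config if config.is_file() else None

    if jsonl_path is None and config_path is None:
        raise FileNotFoundError(
            f"Neither optimade.jsonl nor optimade.yaml found in {target}"
        )
    return jsonl_path, config_path
```

With this shape, a directory containing only optimade.jsonl resolves to `(jsonl_path, None)`, and the caller can serve it directly; a directory containing only optimade.yaml resolves to `(None, config_path)`, signalling that conversion has to run first.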
I am wondering if we really need to support "direct" jsonl files in the optimade.yaml format.

Conceptually to me it seems that the current purpose of optimake is to generate a jsonl file from other structural data formats, and optimade.yaml is something that helps to achieve this. If we already have a jsonl file, then the only purpose I see is validation, and the optimade.yaml file does not seem strictly necessary.

But perhaps generating jsonl files and validating them is different enough to separate them? E.g. we could have a different optimake subcommand for validation (optimade validate <jsonl-file>?)

Regarding the Materials Cloud Archive service, this change would affect it, as then we should add support for a "direct" jsonl file without any optimade.yaml file.
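As a rough illustration of what a standalone validation step could look like, the sketch below checks only that each line of a JSONL file is a well-formed JSON object. It does not check anything against the OPTIMADE schema, and the function name `validate_jsonl_lines` is made up for this example rather than taken from optimake.

```python
import json


def validate_jsonl_lines(lines):
    """Return a list of (line_number, message) problems; empty means well-formed.

    Only checks JSON Lines well-formedness, not OPTIMADE-specific content.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            problems.append((number, "empty line"))
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        # Each line is expected to hold one JSON object, not a list or scalar.
        if not isinstance(record, dict):
            problems.append((number, "expected a JSON object"))
    return problems
```

A validation subcommand could open the file, pass the file handle to this function (iterating a text file yields its lines), and exit non-zero if the returned list is non-empty. Schema-level checks would sit on top of this.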