lcdb / lcdb-workflows

DEPRECATED. Please see https://github.com/lcdb/lcdb-wf
MIT License

schema validation #12

Closed daler closed 8 years ago

daler commented 8 years ago

This PR adds a first-draft schema for validating configs (in the resources dir) and a script that generates an empty config.yaml from the schema; the generated config validates against that schema.

The generation script takes the "description" field of each schema entry and writes it as a comment in the generated config. The idea is that the config format is defined and documented in a single location: the definition can be used to validate any customized config, and the documentation is carried through to the generated configs. Defaults specified in the schema are also filled in where appropriate.
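To illustrate the description-to-comment idea, here is a minimal sketch, not the script from this PR. It assumes a JSON-Schema-like layout with "properties", "description", and "default" keys; the example schema and the `render` function name are hypothetical, and the schema is written as a Python dict so the sketch needs no YAML library.

```python
import json


def render(schema, indent=0):
    """Render a schema as YAML-style lines, with descriptions as comments.

    Assumes (hypothetically) that each entry may carry "description",
    "default", "type", and nested "properties" keys.
    """
    lines = []
    pad = "  " * indent
    for key, entry in schema.get("properties", {}).items():
        description = entry.get("description")
        if description:
            lines.append(f"{pad}# {description}")
        if entry.get("type") == "object":
            # Nested mapping: emit the key, then recurse one level deeper.
            lines.append(f"{pad}{key}:")
            lines.extend(render(entry, indent + 1))
        else:
            default = entry.get("default")
            # JSON scalars are valid YAML scalars; no default leaves the value empty.
            value = "" if default is None else json.dumps(default)
            lines.append(f"{pad}{key}: {value}".rstrip())
    return lines


# Hypothetical schema, for illustration only.
EXAMPLE_SCHEMA = {
    "type": "object",
    "properties": {
        "sampletable": {"type": "string", "description": "Path to the sample table"},
        "threads": {"type": "integer", "description": "Number of threads", "default": 1},
        "references": {
            "type": "object",
            "description": "Reference files for the genome",
            "properties": {
                "fasta": {"type": "string", "description": "Genome FASTA"},
            },
        },
    },
}

config_lines = render(EXAMPLE_SCHEMA)
print("\n".join(config_lines))
```

Entries without a default are left empty so the user can see what still needs to be filled in, while the comment above each key carries the documentation from the schema.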

There is also support for filling in entries from site.yaml (this file is included, but currently empty). The idea is that site.yaml will eventually be populated with paths to the indexes, references, and annotations for all genomes configured at the "site" (e.g., biowulf, travis-ci, or a user's laptop). When generating a config from the schema, these paths will be filled in automatically where appropriate.
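Since site.yaml is currently empty, its layout is not yet fixed. The sketch below assumes one possible layout, a mapping from genome name to a nested set of paths that mirrors the config structure; the `fill_from_site` function, the keys, and the paths are all hypothetical.

```python
def fill_from_site(config, site_entry):
    """Return a copy of config with empty (None) entries filled from site_entry.

    site_entry is assumed to mirror the nesting of config; values already
    set in config are never overwritten.
    """
    filled = {}
    for key, value in config.items():
        site_value = site_entry.get(key) if isinstance(site_entry, dict) else None
        if isinstance(value, dict):
            filled[key] = fill_from_site(value, site_value or {})
        elif value is None and site_value is not None:
            filled[key] = site_value
        else:
            filled[key] = value
    return filled


def site_entry_for(site, genome):
    """Look up one genome; an empty site file typically loads as None."""
    return (site or {}).get(genome, {})


# Hypothetical site contents and config, for illustration only.
site = {"dm6": {"references": {"fasta": "/data/dm6/dm6.fa"}}}
config = {"threads": 1, "references": {"fasta": None, "gtf": None}}

filled = fill_from_site(config, site_entry_for(site, "dm6"))
unchanged = fill_from_site(config, site_entry_for(None, "dm6"))
```

With an empty site file the config comes back unchanged, which matches the current state of the included site.yaml.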

This way, work on the config file format only requires modifying the schema file, and configs generated from it when running tests are always in the up-to-date format.
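As a rough illustration of how a schema could be used to check a customized config in a test, here is a small stdlib-only validator sketch. It is not the validation used in this PR; it assumes JSON-Schema-style "type", "required", and "properties" keywords, and it reports unknown keys so that configs written for an older format are flagged.

```python
TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def validate(config, schema, path="config"):
    """Return a list of error strings; an empty list means the config is valid."""
    expected = schema.get("type")
    if expected:
        ok = isinstance(config, TYPES[expected])
        # bool is a subclass of int in Python, so reject it for numeric types.
        if expected in ("integer", "number") and isinstance(config, bool):
            ok = False
        if not ok:
            return [f"{path}: expected {expected}"]
    errors = []
    if isinstance(config, dict):
        props = schema.get("properties", {})
        for key in schema.get("required", []):
            if key not in config:
                errors.append(f"{path}.{key}: missing required key")
        for key, value in config.items():
            if key in props:
                errors.extend(validate(value, props[key], f"{path}.{key}"))
            else:
                errors.append(f"{path}.{key}: unknown key")
    return errors


# Hypothetical schema, for illustration only.
schema = {
    "type": "object",
    "required": ["threads"],
    "properties": {"threads": {"type": "integer"}},
}

ok_errors = validate({"threads": 2}, schema)
bad_errors = validate({"threads": "two", "extra": 1}, schema)
```

A test could then assert that the config generated from the schema yields an empty error list.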

Example usage:

python lcdb/generate_config_from_schema.py --schema resources/config.schema.yaml --genome dm6 --site resources/site.yaml --out example_config.yaml