chanzuckerberg / cryoet-data-portal-backend

CryoET Data Portal API server & ingestion scripts
MIT License

Update schema in preparation for dataset config validation #128

Closed daniel-ji closed 1 month ago

daniel-ji commented 1 month ago

This is the second of four PRs for dataset configuration file validation, and it should be merged second. It updates the schema files with:

Updates are based on template.yaml. Also improves and fixes schema generation in schema.py. Metadocs/ folder has been regenerated with make.

Files that were primarily changed (and are not code-gen'd files):

Note that any_of does not currently work in LinkML's Pydantic code generation, so some schema updates are commented out until it does. Specifically, the newly created types (like FloatFormattedString) cannot yet be used to let fields accept both formatted strings and floats/ints, so as a temporary workaround we plan to parse the formatted string before running validation.
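As a rough illustration of that workaround, here is a minimal sketch of a pre-validation step that normalizes a field to a plain float. The formatted-string syntax (`float {key}`), the function name `resolve_float_field`, and the `variables` lookup are all assumptions made for this example; the PR does not specify them, and the real ingestion code may parse these strings differently.

```python
import re
from typing import Mapping, Union

# Hypothetical syntax: a formatted string is assumed to look like "float {some_key}".
_FLOAT_TEMPLATE = re.compile(r"^float\s*\{(?P<key>[A-Za-z0-9_-]+)\}\s*$")


def resolve_float_field(value: Union[str, int, float], variables: Mapping[str, object]) -> float:
    """Return a plain float so the field can be validated against a float-only schema.

    Hypothetical helper: numbers pass through, formatted strings are looked up
    in `variables` and converted.
    """
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(value, bool):
        raise TypeError("expected a number or formatted string, got bool")
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value, str):
        match = _FLOAT_TEMPLATE.match(value.strip())
        if match is None:
            raise ValueError(f"not a float-formatted string: {value!r}")
        key = match.group("key")
        if key not in variables:
            raise KeyError(f"no value available for placeholder {key!r}")
        return float(variables[key])
    raise TypeError(f"unsupported type: {type(value).__name__}")


# Both spellings end up as floats before validation runs.
print(resolve_float_field(10, {}))
print(resolve_float_field("float {voxel_spacing}", {"voxel_spacing": "13.48"}))
```

Once any_of is supported in the generated Pydantic models, a step like this could presumably be dropped in favor of the commented-out schema types.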

Note that the Pydantic models were also generated with a temporary hotfix that LinkML has not yet implemented (see https://github.com/linkml/linkml/pull/2201). If the docs are regenerated without this hotfix, validation will be incorrect, but only more lax: required multivalued fields that are missing from config files will not be reported as errors. In the near future, I can look into creating a Python wrapper class or using jinja2 templates to extend the current LinkML release and provide a consistent fix.
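To make the gap concrete, a stopgap check along these lines could flag required multivalued fields that a lax model would let through. This is only a sketch: the function name, the field names, and the choice to treat an empty list as missing are assumptions for illustration, not part of this PR or of LinkML.

```python
from typing import Any, Iterable, List, Mapping


def missing_required_multivalued(
    config: Mapping[str, Any], required_fields: Iterable[str]
) -> List[str]:
    """Return the names of required list-valued fields that are absent or empty.

    Hypothetical helper; assumes an empty list counts as "not provided".
    """
    missing = []
    for name in required_fields:
        value = config.get(name)
        if value is None or (isinstance(value, (list, tuple)) and len(value) == 0):
            missing.append(name)
    return missing


# Hypothetical config fragment and field names.
config = {"dataset_title": "example", "authors": []}
print(missing_required_multivalued(config, ["authors", "funding"]))
```

Running a check like this alongside the generated models would surface the missing fields explicitly, whichever generator version produced the models.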