Closed joyceyan closed 1 year ago
I think we're talking about the same thing @brianraymor. The CXG conversion occurs as part of the ingestion pipeline. So when we say "reused between the CXG schema CLI and CXG converter", we are referring to the validation done by the ingestion pipeline. Apologies for the confusion.
Apologies. Still confused.
2
is successful, then CXG conversion is started and doesn't need to worry about validating those cases again. So I do not understand "reused".
:facepalm:
You're absolutely correct. I now realize that the processing pipeline directly imports cellxgene_schema. This ticket should be re-titled to:
Move all implicit validation steps in the CXG converter into the CXG schema CLI
Closing this out: after some investigation, @seve didn't find any implicit validations in the CXG conversion code that aren't already in the CLI validator.
Can you say a bit more about why validation for CXG requirements would occur twice?
The CLI is available for curators to use prior to submission, but the ingestion pipeline runs validation again (trust, but verify). So in theory, any dataset that would fail during CXG conversion should already fail validation before the conversion step starts.
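To make the flow above concrete, here is a rough sketch of one validator shared by the curator-facing CLI and the ingestion pipeline. Every name in it (`validate_dataset`, `cli_validate`, `convert_to_cxg`, `ingest`) is hypothetical and the checks are placeholders; it is not the actual cellxgene_schema API or the real pipeline code.

```python
from typing import Dict, List


def validate_dataset(dataset: Dict) -> List[str]:
    """Hypothetical shared validator: returns a list of error messages."""
    errors = []
    # Placeholder checks, standing in for the real schema rules.
    if "X" not in dataset:
        errors.append("missing expression matrix 'X'")
    if not dataset.get("obs"):
        errors.append("missing cell metadata 'obs'")
    return errors


def cli_validate(dataset: Dict) -> bool:
    """What a curator would run before submitting (hypothetical)."""
    return not validate_dataset(dataset)


def convert_to_cxg(dataset: Dict) -> str:
    """Stand-in for CXG conversion; assumes the dataset is already valid."""
    return f"cxg:{dataset['name']}"


def ingest(dataset: Dict) -> str:
    """Pipeline entry point: re-validate (trust, but verify), then convert."""
    errors = validate_dataset(dataset)
    if errors:
        # Fail before conversion starts, so the converter needs no checks of its own.
        raise ValueError("validation failed: " + "; ".join(errors))
    return convert_to_cxg(dataset)
```

Because both entry points call the same function, any check that conversion implicitly relies on only has to live in the validator.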