chanzuckerberg / single-cell-curation

Code and documentation for the curation of cellxgene datasets
MIT License

cellxgene-schema CLI must update validation for obs['self_reported_ethnicity_ontology_term_id'] #811

Closed (brianraymor closed this issue 6 months ago)

brianraymor commented 7 months ago

Context

self_reported_ethnicity_ontology_term_id must not allow duplicate HANCESTRO terms when multiple terms are present:

... the value MUST be formatted as one or more comma-separated (with no leading or trailing spaces) HANCESTRO terms in ascending lexical order with no duplication of terms or "unknown" if unavailable.

See self_reported_ethnicity_ontology_term_id for the full requirements.
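
As a rough illustration of the rule quoted above, the check could be sketched in Python as follows. This is not the cellxgene-schema implementation: the function name `check_ethnicity_term_ids` and the `HANCESTRO:<digits>` pattern are assumptions made for this example, and the sketch only tests the shape of each term, not whether it exists in the HANCESTRO ontology.

```python
from __future__ import annotations

import re

# Assumption: HANCESTRO term IDs have the form "HANCESTRO:" followed by digits.
# The real validator checks terms against the ontology itself.
HANCESTRO_ID = re.compile(r"HANCESTRO:\d+")


def check_ethnicity_term_ids(value: str) -> list[str]:
    """Hypothetical helper: return error messages; an empty list means valid."""
    if value == "unknown":
        return []

    errors = []
    terms = value.split(",")

    # fullmatch rejects leading/trailing spaces and empty terms.
    for term in terms:
        if not HANCESTRO_ID.fullmatch(term):
            errors.append(f"'{term}' is not a well-formed HANCESTRO term ID")

    # The requirement this issue asks for: no duplicated terms.
    if len(set(terms)) != len(terms):
        errors.append("terms must not be duplicated")

    # Terms must appear in ascending lexical order.
    if terms != sorted(terms):
        errors.append("terms must be in ascending lexical order")

    return errors
```

With this sketch, `"HANCESTRO:0005,HANCESTRO:0014"` passes, while `"HANCESTRO:0005,HANCESTRO:0005"` is rejected only for duplication, because a list with repeated terms can still be in sorted order. That is why the duplicate check has to be separate from the ordering check.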

nayib-jose-gloria commented 6 months ago

This behavior is already captured by the validator CLI as of the schema 4.0.0 release.