chanzuckerberg / single-cell-curation

Code and documentation for the curation of cellxgene datasets
MIT License

cellxgene-schema must update validation for disease_ontology_term_id #719

Closed brianraymor closed 8 months ago

brianraymor commented 10 months ago

Context

See disease_ontology_term_id.

4.0.0 was:

This MUST be a MONDO term or "PATO:0000461" for normal or healthy

4.1.0 adds further requirements:

This MUST be one of:

joyceyan commented 9 months ago

@brianraymor @danieljhegeman Is there any way for us to know if the MONDO term provided is the most accurate term for the disease / injury? Or would the work on the validation side just be to verify that a MONDO term provided is a child of either MONDO:0000001 or MONDO:0021178?

brianraymor commented 9 months ago

"Is there any way for us to know if the MONDO term provided is the most accurate term for the disease / injury?"

No. This wording is a common pattern in schema fields: it lets curators encourage submitters not to use a general term (which often happens in the first iteration). Curators will often review the research and ask the submitter to use a more precise term.

danieljhegeman commented 8 months ago

"most accurate child"

@brianraymor please confirm: a child is a direct descendant, and not a multi-level descendant, e.g. B is a child of A but C is not a child of A in: A > B > C where A is the foremost ancestor. cc @joyceyan
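To make the distinction in the question concrete, here is a minimal sketch of "direct child" versus "any descendant" on the A > B > C example. The `PARENTS` map and both function names are hypothetical and are not taken from the cellxgene-schema code base.

```python
# Hypothetical toy hierarchy for A > B > C: each term maps to its direct parents.
PARENTS = {"A": [], "B": ["A"], "C": ["B"]}


def is_child(term, ancestor, parents=PARENTS):
    """Return True only if `ancestor` is a direct parent of `term`."""
    return ancestor in parents.get(term, [])


def is_descendant(term, ancestor, parents=PARENTS):
    """Return True if `ancestor` is reachable from `term` via one or more parent links."""
    stack = list(parents.get(term, []))
    seen = set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return False
```

Under the strict reading, `is_child("C", "A")` is false while `is_descendant("C", "A")` is true, which is exactly the ambiguity being raised.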

danieljhegeman commented 8 months ago

image

Implicitly V6 is not a child of V9

brianraymor commented 8 months ago

Let's use a concrete, simple case - MONDO.

It's any descendant of the disease root:

Screenshot 2024-01-25 at 10 22 17 AM

This unfortunately allows non-human animal diseases to be specified for human observations at the moment, per my comments in #cell-science-data-wrangling.
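A rough sketch of the check as clarified here, assuming the validator has access to a map from each term to the set of all its ancestors. The root IDs and "PATO:0000461" come from this thread; `TOY_ANCESTORS`, the `MONDO:EXAMPLE…` IDs, and `is_valid_disease_term` are hypothetical placeholders, not real MONDO content or the actual cellxgene-schema implementation. Whether a root term itself should be accepted is not settled in this thread; the sketch assumes it is not.

```python
NORMAL_TERM = "PATO:0000461"  # "normal or healthy", per the schema text quoted above
ALLOWED_ROOTS = {"MONDO:0000001", "MONDO:0021178"}  # roots named earlier in this thread

# Hypothetical toy data: term -> set of ALL its ancestors (not real MONDO content).
TOY_ANCESTORS = {
    "MONDO:EXAMPLE1": {"MONDO:0000001"},
    "MONDO:EXAMPLE2": {"MONDO:EXAMPLE1", "MONDO:0000001"},  # multi-level descendant
    "MONDO:EXAMPLE3": {"MONDO:EXAMPLE_OTHER_ROOT"},  # outside the allowed roots
}


def is_valid_disease_term(term_id, ancestors=TOY_ANCESTORS):
    """Accept the normal/healthy term or any descendant of an allowed root."""
    if term_id == NORMAL_TERM:
        return True
    # Any overlap between the term's ancestors and the allowed roots is enough,
    # so multi-level descendants pass, not only direct children.
    return bool(ancestors.get(term_id, set()) & ALLOWED_ROOTS)
```

Because the check relies on the full ancestor set, a term several levels below a root passes just like a direct child does.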

danieljhegeman commented 8 months ago

@brianraymor Ok, thanks for clarifying. I'll make a ticket to change the language in the schema and code base.

danieljhegeman commented 8 months ago

(original issue: Joyce's feature change) Approved by Lattice