chanzuckerberg / single-cell-curation

Code and documentation for the curation of cellxgene datasets
MIT License

Add tissue mappings #383

Closed brianraymor closed 6 months ago

brianraymor commented 1 year ago

Context

See single-cell-census.

Definition: high-level mapping of a tissue, e.g. “Heart” is the tissue_general of “Heart left ventricle”.

All high-level mappings (1-1 or tree) of tissue MUST be defined in the dataset schema. When a dataset is uploaded, cellxgene Data Portal MUST automatically add the relevant mapping. The results will be consumed by all cellxgene experiences for consistency.


Two tissue mappings are implemented in CELLxGENE Discover:

  1. Discover UX Tissue Filter maps a tissue to its System and Organ.
  2. Gene Expression maps a tissue to its "high-level" tissue. Also see Available Tissues in Gene Expression.

Census copied both the Gene Expression model and code. Both tissue_general and tissue_general_ontology_term_id are defined in the census schema.

Research

The Tissue mappings and the cellxgene-schema CLI all calculate ancestors and need a common approach. Also see:


There are open questions in Proposal: Standardization of ontology-derived process and assets of CELLxGENE Discover about whether such information is required to be embedded in datasets rather than be returned by an API.
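
As a rough illustration of the ancestor calculation that the tissue mappings share, here is a minimal sketch. It assumes a toy parent graph and a hypothetical curated set of high-level tissues; the term names, `PARENTS`, `HIGH_LEVEL_TISSUES`, and both functions are made up for this example and are not taken from the Discover, Census, or cellxgene-schema code.

```python
from collections import deque

# Toy graph: term -> direct parents. Illustrative only, not real UBERON structure.
PARENTS = {
    "heart left ventricle": ["cardiac ventricle"],
    "cardiac ventricle": ["heart"],
    "heart": ["cardiovascular system"],
    "cardiovascular system": [],
}

# Hypothetical curated set of high-level tissues.
HIGH_LEVEL_TISSUES = {"heart", "lung"}


def ancestors(term, parents=PARENTS):
    """Return the term followed by its ancestors in breadth-first order."""
    seen = [term]
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.append(parent)
                queue.append(parent)
    return seen


def tissue_general(term):
    """Return the nearest term (self included) that is a curated high-level tissue."""
    for candidate in ancestors(term):
        if candidate in HIGH_LEVEL_TISSUES:
            return candidate
    return None  # no curated high-level tissue found


print(tissue_general("heart left ventricle"))
```

Whether this lookup runs at upload time and is embedded in the dataset, or is served by an API, is exactly the open question noted above; the sketch is the same in either case.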

brianraymor commented 1 year ago

Per the June 13 call "tissue mappings - there can be only one" with @BAevermann @jahilton @signechambers1 @pablo-gar :

@jahilton proposed an ordered list model -

[
  most narrow UBERON term which conceptually maps to curated tissue general / organ,
   ... ,
  most broad UBERON term which conceptually maps to curated system
]

Cell Census and Gene Expression will prefer the most narrow term. Discover UX Filter will use both the most narrow and the most broad terms.
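
A minimal sketch of how consumers might read the ordered list model, assuming each curated list is stored from most narrow to most broad. The `CURATED` table, its single entry, and the function names are hypothetical; the real curated lists were still to be finalized for the schema.

```python
from typing import Dict, List, Tuple

# Hypothetical curated lists, each ordered from most narrow to most broad.
CURATED: Dict[str, List[str]] = {
    "heart left ventricle": ["heart", "cardiovascular system"],
}


def most_narrow(tissue: str) -> str:
    """Term that Cell Census and Gene Expression would prefer."""
    return CURATED[tissue][0]


def most_broad(tissue: str) -> str:
    """Broadest term, conceptually the curated system."""
    return CURATED[tissue][-1]


def filter_terms(tissue: str) -> Tuple[str, str]:
    """Pair that the Discover UX Filter would use: (most narrow, most broad)."""
    return most_narrow(tissue), most_broad(tissue)


print(most_narrow("heart left ventricle"))
print(filter_terms("heart left ventricle"))
```

Keeping a single ordered list per tissue lets every consumer pick its preferred end of the list without maintaining a separate mapping.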

The curated lists will be documented in the schema and updated on a regular cadence.

Next steps:

  1. @brianraymor to write the schema definition and requirements for review.
  2. @pablo-gar @jahilton and other SME(s) to re-curate and finalize the curated lists for Schema 4.
  3. The list must be reviewed with @BAevermann and @norbid to ensure that objectives in Target Rationale are addressed.

brianraymor commented 1 year ago

Per cell census schema triage, closing schema mappings in favor of returning mappings in an ontology service.

See Integrate all required ontology processing into ontology preparation

brianraymor commented 11 months ago

Re-opening per conversation on #cell-science-census.

brianraymor commented 6 months ago

Not required for the dataset schema.