chanzuckerberg / single-cell

A collection of documents that reflect various design decisions that have been made for the cellxgene project.
MIT License

Add tissue_type #459

Closed: brianraymor closed this 9 months ago

brianraymor commented 1 year ago

Note: This is a placeholder epic for Dev to add child issues following an assessment of required changes.

The changes to the cellxgene-schema CLI are tracked separately in "cellxgene-schema must validate tissue_type".

Design

This also changes the validation and label application for tissue_ontology_term_id and tissue.

See tissue_type.

tissue_type

Key: tissue_type
Annotator: Curator
Value: categorical with str categories. This MUST be "tissue", "organoid", or "cell culture".
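A minimal sketch of the allowed-value check, in plain Python for illustration. The function name and the sample column are hypothetical; the actual validation lives in the cellxgene-schema CLI and is tracked separately.

```python
# The three values permitted for tissue_type, taken from the rule above.
ALLOWED_TISSUE_TYPES = frozenset({"tissue", "organoid", "cell culture"})


def invalid_tissue_types(values):
    """Return the distinct values that are not allowed for tissue_type, sorted."""
    return sorted({v for v in values if v not in ALLOWED_TISSUE_TYPES}, key=str)


# Hypothetical column contents; matching is exact, so "Tissue" is rejected.
obs_tissue_type = ["tissue", "organoid", "Tissue", "cell culture", "cell line"]
print(invalid_tissue_types(obs_tissue_type))
```

An empty result means every value in the column is valid.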


tissue_ontology_term_id

Key: tissue_ontology_term_id
Annotator: Curator
Value: categorical with str categories. If tissue_type is "tissue" or "organoid", this MUST be the most accurate child of UBERON:0001062 for anatomical entity.

If tissue_type is "cell culture", this MUST follow the requirements for cell_type_ontology_term_id.
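The conditional rule could be sketched as a prefix check like the one below. Two caveats: it assumes that "the requirements for cell_type_ontology_term_id" mean a CL term, which this issue does not spell out, and it does not verify that an UBERON term is actually a child of UBERON:0001062, since that needs an ontology lookup. The names here are hypothetical.

```python
# Expected ontology prefix per tissue_type.
# ASSUMPTION: "cell culture" terms follow cell_type_ontology_term_id and use CL.
EXPECTED_PREFIX = {
    "tissue": "UBERON",
    "organoid": "UBERON",
    "cell culture": "CL",
}


def term_prefix_matches(tissue_type, term_id):
    """Check only the ontology prefix and numeric form of a term ID.

    This is a shallow check: ancestry under UBERON:0001062 is not verified.
    """
    expected = EXPECTED_PREFIX.get(tissue_type)
    if expected is None:
        return False  # unknown tissue_type, nothing can match
    prefix, sep, local_id = term_id.partition(":")
    return sep == ":" and prefix == expected and local_id.isdigit()
```

For example, `term_prefix_matches("cell culture", "UBERON:0002048")` is rejected because a cell culture is expected to carry a CL term under the assumption above.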


tissue

Key: tissue
Annotator: CELLxGENE Discover
Value: categorical with str categories. This MUST be the human-readable name assigned to the value of tissue_ontology_term_id.
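Label application can be illustrated with a simple lookup. The two-entry table below is a hypothetical stand-in for a real ontology label source, and the function name is made up for this sketch.

```python
# Hypothetical, tiny label table; a real one would come from the ontologies.
TERM_LABELS = {
    "UBERON:0002048": "lung",
    "CL:0000057": "fibroblast",
}


def apply_tissue_labels(term_ids):
    """Map each tissue_ontology_term_id to its human-readable tissue label."""
    missing = sorted({t for t in term_ids if t not in TERM_LABELS})
    if missing:
        # Fail loudly rather than emit a partial or guessed label column.
        raise KeyError(f"no label known for: {missing}")
    return [TERM_LABELS[t] for t in term_ids]
```

Because the same lookup serves both UBERON and CL terms, the label step itself does not need to branch on tissue_type.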

Data Platform

Note: Still sketching out the potential changes.

  1. tissue_type must be stored in the database similar to other dataset metadata.
  2. tissue_type must be returned in the Discover and Data Portal filter API(s). MUST be refined.
  3. tissue_type must be added as a new Collection and Dataset UX Filter. MUST be refined.
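Since items 2 and 3 still need refinement, the following is only a rough sketch of what a tissue_type filter might compute. The dataset record shape (a dict with a `tissue_type` list) and both function names are assumptions for illustration, not the Data Portal API.

```python
from collections import Counter


def tissue_type_facet(datasets):
    """Count how many datasets contain each tissue_type value."""
    counts = Counter()
    for ds in datasets:
        # Use a set so a dataset is counted once per distinct value.
        counts.update(set(ds["tissue_type"]))
    return dict(counts)


def filter_by_tissue_type(datasets, selected):
    """Keep datasets that contain at least one of the selected values."""
    selected = set(selected)
    return [ds for ds in datasets if selected & set(ds["tissue_type"])]


# Hypothetical dataset metadata records.
datasets = [
    {"id": "a", "tissue_type": ["tissue"]},
    {"id": "b", "tissue_type": ["tissue", "organoid"]},
    {"id": "c", "tissue_type": ["cell culture"]},
]
print(tissue_type_facet(datasets))
print([ds["id"] for ds in filter_by_tissue_type(datasets, ["organoid"])])
```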

Also see the conversation with @jahilton.

Data Viz

  1. tissue_type must be displayed in Standard Categories or dataset drawer in Single Cell Explorer.
  2. This impacts any special processing or filtering related to "organoids" or "cell culture" in tissue_ontology_term_id and tissue.
  3. The WMG processing pipeline should be updated to exclude "non-tissue" tissues from the generated cubes.
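The WMG exclusion in item 3 might reduce to a filter like this one. It assumes that "non-tissue" means any tissue_type other than "tissue", and the per-cell row shape and function name are hypothetical; the real pipeline is not shown in this issue.

```python
def tissue_only(rows):
    """Drop rows whose tissue_type is not exactly "tissue".

    ASSUMPTION: "non-tissue" covers both "organoid" and "cell culture".
    """
    return [row for row in rows if row["tissue_type"] == "tissue"]


# Hypothetical per-cell metadata rows.
rows = [
    {"cell": "c1", "tissue_type": "tissue"},
    {"cell": "c2", "tissue_type": "organoid"},
    {"cell": "c3", "tissue_type": "cell culture"},
    {"cell": "c4", "tissue_type": "tissue"},
]
print([row["cell"] for row in tissue_only(rows)])
```

With an explicit tissue_type field, this replaces any special-casing that inspects tissue_ontology_term_id or tissue for organoid or cell culture markers.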

Census

  1. This impacts any special processing or filtering related to "organoids" or "cell culture" in tissue_ontology_term_id and tissue.
  2. tissue_type must be added to the Census schema.
  3. tissue_type must be added to the Census builder.

prathapsridharan commented 12 months ago

@atarashansky @signechambers1 (cc: @dsadgat @joyceyan )

This is regarding the data-viz part of the requirement in the description:

  1. If tissue_type is to be added to explorer, should it show up under the Standard Categories in the left side pane? Currently the list shown is [cell type, development stage, donor id, sex, tissue]. Should tissue_type be added to this list in explorer?
  2. If so, we would need to make changes to the filtering logic too. I think that might be a fair bit of work if we consider the timeline for schema 4 (Nov 4) and the need to test this. Could this particular change to explorer be done after the schema 4 migration? Since it is a brand new field, this change to the data schema is backward compatible with the reader of the data (explorer).

signechambers1 commented 12 months ago

@prathapsridharan in response to your questions:

  1. Yes, if tissue_type has multiple values it should show up under Standard Categories. If it is a singleton value across the entire dataset I would expect it to show up in the dataset drawer.
  2. What filtering logic are you speaking of? Surfacing the value can happen after the schema 4 migration.
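The placement rule from answer 1 can be sketched as follows. The function name and the two return strings are hypothetical labels for this illustration, not identifiers from Explorer.

```python
def tissue_type_placement(values):
    """Decide where Explorer would surface tissue_type for one dataset.

    Multiple distinct values -> Standard Categories.
    A single value across the dataset -> dataset drawer.
    """
    distinct = set(values)
    if not distinct:
        raise ValueError("dataset has no tissue_type values")
    return "standard_categories" if len(distinct) > 1 else "dataset_drawer"


print(tissue_type_placement(["tissue", "tissue", "tissue"]))
print(tissue_type_placement(["tissue", "organoid"]))
```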