chanzuckerberg / single-cell-curation

Code and documentation for the curation of cellxgene datasets
MIT License

Assess impact of updating HsapDv and MmusDv #401

Closed brianraymor closed 4 months ago

brianraymor commented 1 year ago

Design

In Progress

See Review impact of HsapDv and MMusDV updates to development stage filter

Historical Context

We must assess the impact of three years of ontology evolution on the schema (and subsequent re-curation) and on CELLxGENE experiences such as the Development Stages filters in the Discover UX.

September 7 2023: There is now a release tag shared across all development stage ontologies. See https://github.com/obophenotype/developmental-stage-ontologies/releases/tag/v2023-08-18.

July 15 2023: Moving to Schema 5 due to a lack of progress in tagged HsapDv and MmusDv releases.

HsapDv and MmusDv are being updated, but there have been no official releases since 2020.

CELLxGENE dataset schema requires:

| Ontology | OBO Prefix | Release | Download |
| --- | --- | --- | --- |
| Human Developmental Stages | HsapDv | 2020-03-10 | hsapdv.owl |
| Mouse Developmental Stages | MmusDv | 2020-03-10 | mmusdv.owl |

2020-03-10 is the same version available on EBI OLS.

@jahilton noted a recent response to a Lattice inquiry indicating that a valid term in the 2020 release is now obsolete.

There are ongoing updates in https://github.com/obophenotype/developmental-stage-ontologies/tree/master/src/hsapdv but the repository does not seem to follow the regular release cadence of other ontologies.


jahilton commented 1 year ago

Reached out to Frédéric Bastian & got a quick response...

jahilton commented 1 year ago

Looking at just the HsapDv updates, I did a comparison here. Some deprecated terms have replacement terms available with slightly larger ranges, making them easy migrations, but other deprecated terms would need to be reinvestigated or they'd be migrated to a much larger range.
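
A minimal sketch of how such a comparison might be scripted, assuming the old and new releases have already been parsed into plain Python structures. The function name `split_deprecated`, the `obsolete` and `replaced_by` keys, and the `EX:` term IDs are all hypothetical and are not taken from the actual comparison or from the ontology files.

```python
from typing import Dict, List, Optional, Set, Tuple


def split_deprecated(
    old_ids: Set[str],
    new_release: Dict[str, Dict[str, Optional[object]]],
) -> Tuple[Dict[str, str], List[str]]:
    """Split terms from the old release into easy migrations and review cases.

    old_ids: term IDs that were valid in the old release.
    new_release: hypothetical parsed view of the new release, keyed by term ID,
        with an "obsolete" flag and an optional "replaced_by" term ID.
    """
    easy: Dict[str, str] = {}
    needs_review: List[str] = []
    for term_id in sorted(old_ids):
        entry = new_release.get(term_id)
        # Terms that still exist and are not obsolete need no migration.
        if entry is not None and not entry.get("obsolete", False):
            continue
        replacement = entry.get("replaced_by") if entry else None
        if replacement:
            easy[term_id] = str(replacement)
        else:
            # Removed outright, or obsolete with no suggested replacement.
            needs_review.append(term_id)
    return easy, needs_review


# Made-up IDs for illustration only.
old = {"EX:1", "EX:2", "EX:3", "EX:4"}
new = {
    "EX:1": {"obsolete": False},
    "EX:2": {"obsolete": True, "replaced_by": "EX:9"},
    "EX:3": {"obsolete": True},
}
easy, needs_review = split_deprecated(old, new)
```

A suggested replacement only says where a term could go. Whether the replacement's range is acceptably close still needs a curator's judgment, which is why the review list is kept separate.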

brianraymor commented 8 months ago

Related note on UX filtering hierarchy from @jahilton:

While I’m looking at it, I’ll even toss out a proposal for filters after the bump… (taking into account the deprecated terms and several very granular current terms with 0s in our data)

jahilton commented 7 months ago

We have started organizing the data in the corpus that use now-deprecated terms. On the lattice/dev-ont-migration branch there are 2 files:

When the time comes, migrate.py should be able to read in the metadata from these. Will update here once we have addressed existing metadata.

jahilton commented 7 months ago

We have assessed all existing metadata in CELLxGENE, both private and public, and mapped the donors annotated with deprecated dev_stage terms to updated dev_stage terms. We'll soon be pushing those mappings to the files in the branch noted above. We have also added a step in the curation QA to check for newly submitted donors that will need similar mapping.
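
A rough sketch of what such a QA check could look like, assuming donor metadata is available as a simple donor-to-term dictionary. The function name `donors_needing_mapping`, the input shapes, and the `EX:` term IDs are hypothetical; this does not describe the actual curation QA code or the files on the branch.

```python
from typing import Dict, List, Set


def donors_needing_mapping(
    submitted: Dict[str, str],
    deprecated_terms: Set[str],
    donor_mapping: Dict[str, str],
) -> List[str]:
    """Return donors that use a deprecated term and have no mapping yet.

    submitted: donor ID -> dev_stage term ID in the new submission.
    deprecated_terms: term IDs deprecated in the newer ontology release.
    donor_mapping: donor ID -> updated term ID, for donors already handled.
    """
    pending = [
        donor
        for donor, term in submitted.items()
        if term in deprecated_terms and donor not in donor_mapping
    ]
    return sorted(pending)


# Made-up donors and term IDs for illustration only.
submitted = {"donor_a": "EX:1", "donor_b": "EX:2", "donor_c": "EX:2"}
pending = donors_needing_mapping(
    submitted,
    deprecated_terms={"EX:2"},
    donor_mapping={"donor_b": "EX:9"},
)
```

The mapping is keyed by donor rather than by term in this sketch because, as noted earlier in the thread, some deprecated terms have no single obvious replacement and may need to be resolved per donor.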

brianraymor commented 4 months ago

Incorporated into #884 for a cleaner copy.