Cellular-Semantics / CL_KG

Building a Cell Ontology Knowledge-Base from data, and LLMs
Apache License 2.0

Marker management repo for CL and CL-KG #14

Open dosumis opened 3 months ago

dosumis commented 3 months ago

We will generate a new Cell Marker ODK repo, cellmark, to manage markers for CL and CL-KG.

The first aim for this repo will be to generate NS-Forest markers following standard patterns developed for BDSO (see also https://github.com/obophenotype/cell-ontology/pull/2439).

MVP:

Challenges:

hkir-dev commented 3 months ago

New repo created: https://github.com/Cellular-Semantics/CellMark (development ongoing)

LungMAP uses Ensembl gene IDs: [image]

The Lung Cell Atlas does as well: [image]

But the DOSDP template uses NCBI gene IDs: https://github.com/Cellular-Semantics/CellMark/blob/main/src/markers/NSForestMarkersSource.tsv

Should we use Ensembl gene IDs in the DOSDP template as well? Who can help me pick the correct genes?
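One way to check which namespace a given file uses is sketched below. It relies only on the general shape of the identifiers: human Ensembl gene IDs start with `ENSG` followed by 11 digits, and NCBI Gene IDs are plain integers, sometimes written as a CURIE such as `NCBIGene:1234`. The function name `classify_gene_id` and the example IDs are made up for illustration and are not taken from the CellMark repo.

```python
import re

# Human Ensembl gene IDs: "ENSG" + 11 digits, optionally with a version suffix.
ENSEMBL_GENE = re.compile(r"^(?:ensembl:)?ENSG\d{11}(?:\.\d+)?$", re.IGNORECASE)
# NCBI Gene IDs: an integer, optionally prefixed as a CURIE.
NCBI_GENE = re.compile(r"^(?:NCBIGene:)?\d+$", re.IGNORECASE)


def classify_gene_id(value):
    """Return 'ensembl', 'ncbi', or 'unknown' for a single identifier string."""
    value = value.strip()
    if ENSEMBL_GENE.match(value):
        return "ensembl"
    if NCBI_GENE.match(value):
        return "ncbi"
    return "unknown"


def summarise_namespaces(values):
    """Count how many identifiers fall into each namespace."""
    counts = {"ensembl": 0, "ncbi": 0, "unknown": 0}
    for value in values:
        counts[classify_gene_id(value)] += 1
    return counts


# Illustrative, made-up identifiers (not real marker genes).
example_ids = ["ENSG00000000001", "NCBIGene:1234", "5678", "SOME_SYMBOL"]
print(summarise_namespaces(example_ids))
```

Running this over the gene column of each source file would show quickly whether the files mix namespaces.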

dosumis commented 3 months ago

What did we get from Renne? She should be able to provide a mapping if she is providing NCBI gene IDs.
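If such a mapping arrives as a two-column TSV, applying it could look roughly like the sketch below. The column names (`ncbi_gene_id`, `ensembl_gene_id`, `Marker_gene`, `Cell_type`) and the example rows are assumptions for illustration only; the real files may be laid out differently. IDs with no mapping are left unchanged and reported, so they can be reviewed by hand.

```python
import csv
import io


def load_mapping(handle, source_col="ncbi_gene_id", target_col="ensembl_gene_id"):
    """Read a TSV mapping file into a dict of source ID -> target ID."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[source_col]: row[target_col] for row in reader if row[target_col]}


def remap_gene_ids(rows, mapping, column):
    """Return (remapped_rows, unmapped_ids); unmapped rows keep their original ID."""
    remapped, unmapped = [], []
    for row in rows:
        gene_id = row[column]
        if gene_id in mapping:
            remapped.append({**row, column: mapping[gene_id]})
        else:
            unmapped.append(gene_id)
            remapped.append(dict(row))
    return remapped, unmapped


# Made-up example data standing in for the real mapping and marker files.
mapping_tsv = io.StringIO(
    "ncbi_gene_id\tensembl_gene_id\n"
    "NCBIGene:1234\tENSG00000000001\n"
)
markers_tsv = io.StringIO(
    "Cell_type\tMarker_gene\n"
    "CL:0000001\tNCBIGene:1234\n"
    "CL:0000002\tNCBIGene:9999\n"
)

mapping = load_mapping(mapping_tsv)
marker_rows = list(csv.DictReader(markers_tsv, delimiter="\t"))
remapped_rows, unmapped_ids = remap_gene_ids(marker_rows, mapping, "Marker_gene")

for row in remapped_rows:
    print(row["Cell_type"], row["Marker_gene"])
print("unmapped:", unmapped_ids)
```

Keeping the unmapped IDs in the output rather than dropping the rows means no marker is lost silently if the mapping is incomplete.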

hkir-dev commented 3 months ago

I'm not sure where these minimal markers came from. I thought one of our curators had extracted them manually.

dosumis commented 3 months ago

They come from Renne's analysis. I will dig out the emails. One thing we need for the marker repo is a standard place to put files like this.

dosumis commented 3 months ago

Adding links and file here for now.

Renne's Zenodo publication (notebooks): this gives us a DOI to reference for the analysis.

Highlights of this analysis are in HLCA_CellRef_MarkerPerformance_forDOS.xlsx

An older HLCA NS-Forest analysis can be found here: exactMatch2CL_definitionAdditions.xlsx, plus the related ticket (https://github.com/obophenotype/cell-ontology/issues/2313). I believe it is superseded by the analysis in Renne's publication above.

hkir-dev commented 3 months ago

Source data is at https://github.com/Cellular-Semantics/CellMark/blob/main/src/markers/NSForestMarkersSource.tsv