monarch-initiative / oncoexporter

Cancer data to GA4GH phenopacket
https://monarch-initiative.github.io/oncoexporter
MIT License

How to model Cancer stage #37

Open pnrobinson opened 10 months ago

pnrobinson commented 10 months ago

CDA seems to have general terms such as Stage IV.

NCIT also has more detailed stages, e.g. Differentiated Thyroid Gland Carcinoma 55 Years and Older AJCC v8 Stage. We can add an unlimited list of ontology terms to the disease object in GA4GH phenopackets (https://phenopacket-schema.readthedocs.io/en/latest/disease.html).

If available, should we add, say, both "Stage IV" and "Differentiated Thyroid Gland Carcinoma 55 Years and Older AJCC v8 Stage" to indicate Stage IV according to this AJCC staging system?

justaddcoffee commented 10 months ago

> If available, should we add, say, both "Stage IV" and "Differentiated Thyroid Gland Carcinoma 55 Years and Older AJCC v8 Stage" to indicate Stage IV according to this AJCC staging system?

Yes, I think this is the way to go if possible: compose the stage information from the general term ("Stage IV") plus the more detailed stage term when it is available and parsable (e.g. "Differentiated Thyroid Gland Carcinoma....").

Obviously, one challenge here is figuring out a coherent way to map to all the various detailed stages in NCIT.

The NCI folks can probably help and advise us, and we also have some GPT tooling that might be useful, e.g. https://github.com/monarch-initiative/gpt-mapping-manuscript
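To make the composition idea concrete, here is a minimal sketch in plain Python that builds a phenopacket-style disease object as a dict, with both a general and a detailed stage term in its `diseaseStage` list. It is not oncoexporter code: the helper names (`ontology_class`, `compose_disease`), the `GENERAL_STAGES` lookup table, and the `NCIT:PLACEHOLDER_*` identifiers are all assumptions made up for illustration, and the real NCIT IDs for the disease and the detailed AJCC stage would need to be looked up. `NCIT:C27971` is, as far as I know, the NCIT class for "Stage IV".

```python
import json
from typing import Optional

# Hypothetical lookup from general CDA stage strings to NCIT terms.
# NCIT:C27971 is believed to be "Stage IV"; extend with verified IDs.
GENERAL_STAGES = {"Stage IV": "NCIT:C27971"}


def ontology_class(term_id: str, label: str) -> dict:
    """Return a dict shaped like a phenopacket OntologyClass."""
    return {"id": term_id, "label": label}


def compose_disease(term: dict, general_stage: Optional[str],
                    detailed_stage: Optional[dict] = None) -> dict:
    """Build a disease dict whose diseaseStage list holds the general
    stage (if it can be mapped) and the detailed stage (if given)."""
    stages = []
    if general_stage in GENERAL_STAGES:
        stages.append(ontology_class(GENERAL_STAGES[general_stage], general_stage))
    if detailed_stage is not None:
        stages.append(detailed_stage)
    return {"term": term, "diseaseStage": stages}


# Placeholder IDs: NOT real NCIT identifiers, for illustration only.
disease = compose_disease(
    term=ontology_class("NCIT:PLACEHOLDER_DISEASE",
                        "Differentiated Thyroid Gland Carcinoma"),
    general_stage="Stage IV",
    detailed_stage=ontology_class(
        "NCIT:PLACEHOLDER_STAGE",
        "Differentiated Thyroid Gland Carcinoma 55 Years and Older AJCC v8 Stage",
    ),
)
print(json.dumps(disease, indent=2))
```

When only the general stage is available, the `diseaseStage` list simply contains one entry; an unmapped general stage string is skipped rather than guessed.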

monicacecilia commented 10 months ago

@mbrush and others looked at this issue quite extensively in the context of modeling for CCDH. You can see some of that information at https://cancerdhc.github.io/ccdhmodel/v1.1/CancerStageObservationSet/ Maybe Matt has some insight.

mbrush commented 10 months ago

Also look at the slides in the deck starting here: https://docs.google.com/presentation/d/1B1Lc_Qn8F97P9T5hk-lS_i073wkL8f-bZYCem30fs48/edit#slide=id.g214bfceb3ba_0_2877

msierk commented 10 months ago

I don’t know if this is useful to us, but this demo notebook has some examples of converting GDC data to the CCDH data model. See the create_stage_from_gdc function.

https://github.com/cancerDHC/example-data/blob/main/GDC%20to%20CCDH%20conversion.ipynb @pnrobinson @justaddcoffee