Open pnrobinson opened 10 months ago
If available, should we add both "Stage IV" and "Differentiated Thyroid Gland Carcinoma 55 Years and Older AJCC v8 Stage" to indicate Stage IV according to this AJCC staging system?
Yes, I think this is the way to go if possible: compose the stage info from the general term ("Stage IV") plus the more detailed stage term if it is available and parsable (e.g. "Differentiated Thyroid Gland Carcinoma....")
Obviously one challenge here is to figure out a coherent way to map to all the various detailed stages in NCIT
The NCI folks can probably help and advise us, and we also have some GPT tooling that might be useful, e.g. https://github.com/monarch-initiative/gpt-mapping-manuscript
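As a rough starting point for that mapping, here is a minimal sketch that splits a detailed stage label into its parts and derives the general stage from it. It assumes labels follow the pattern "<disease> AJCC v<N> Stage <numeral><substage>", which is a guess based on the one label quoted in this thread and not something checked against NCIT as a whole. The name `parse_stage_label` and the "Stage IVA" example label are made up for illustration.

```python
import re
from typing import NamedTuple, Optional


class ParsedStage(NamedTuple):
    disease: str             # disease-specific prefix of the label
    ajcc_version: str        # e.g. "8"
    detailed: Optional[str]  # e.g. "Stage IVA", or None if the label has no numeral
    general: Optional[str]   # e.g. "Stage IV", or None if the label has no numeral


# ASSUMPTION: labels look like "<disease> AJCC v<N> Stage[ <numeral>[<substage>]]".
_LABEL = re.compile(
    r"^(?P<disease>.+?) AJCC v(?P<version>\d+) Stage"
    r"(?: (?P<numeral>0|IV|III|II|I)(?P<sub>[A-C]?))?$"
)


def parse_stage_label(label: str) -> Optional[ParsedStage]:
    """Return the parsed parts of a detailed stage label, or None if it does not fit."""
    match = _LABEL.match(label.strip())
    if match is None:
        return None
    numeral = match.group("numeral")
    sub = match.group("sub") or ""
    detailed = f"Stage {numeral}{sub}" if numeral else None
    general = f"Stage {numeral}" if numeral else None
    return ParsedStage(match.group("disease"), match.group("version"), detailed, general)


# Hypothetical label with a substage, for illustration only.
parsed = parse_stage_label(
    "Differentiated Thyroid Gland Carcinoma 55 Years and Older AJCC v8 Stage IVA"
)
print(parsed)
```

Labels that do not fit the assumed pattern return `None`, so they can be collected and reviewed by hand (or with the NCI folks) instead of being silently mis-mapped.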
@mbrush and others looked at this issue quite extensively in the context of modeling for CCDH. You can see some of that information at https://cancerdhc.github.io/ccdhmodel/v1.1/CancerStageObservationSet/ Maybe Matt has some insight.
Also look at the slides in the deck starting here: https://docs.google.com/presentation/d/1B1Lc_Qn8F97P9T5hk-lS_i073wkL8f-bZYCem30fs48/edit#slide=id.g214bfceb3ba_0_2877
I don’t know if this is useful to us, but this demo notebook has some examples of converting GDC data to the CCDH data model. See the create_stage_from_gdc function.
https://github.com/cancerDHC/example-data/blob/main/GDC%20to%20CCDH%20conversion.ipynb @pnrobinson @justaddcoffee
CDA seems to have general terms such as Stage IV.
NCIT also has more detailed stages, e.g. Differentiated Thyroid Gland Carcinoma 55 Years and Older AJCC v8 Stage. We can add an unlimited list of ontology terms to the disease object in GA4GH phenopackets (https://phenopacket-schema.readthedocs.io/en/latest/disease.html).
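To make the idea concrete, here is a minimal sketch of a phenopacket disease element in its JSON form carrying both a general and a detailed stage term. It assumes the `diseaseStage` field (a repeated ontology class in the schema linked above) is where both terms go. The helper names and all term IDs are placeholders for illustration, not real NCIT identifiers.

```python
import json
from typing import Optional


def ontology_class(term_id: str, label: str) -> dict:
    """Build the JSON form of an ontology class (id plus label)."""
    return {"id": term_id, "label": label}


def build_disease(term: dict, general_stage: dict,
                  detailed_stage: Optional[dict] = None) -> dict:
    """Compose a disease element with the general stage and, if known, the detailed one."""
    stages = [general_stage]
    if detailed_stage is not None:
        stages.append(detailed_stage)
    return {"term": term, "diseaseStage": stages}


# PLACEHOLDER IDs: look up the real NCIT identifiers before using this.
disease = build_disease(
    term=ontology_class("NCIT:PLACEHOLDER_DISEASE", "Differentiated Thyroid Gland Carcinoma"),
    general_stage=ontology_class("NCIT:PLACEHOLDER_GENERAL", "Stage IV"),
    detailed_stage=ontology_class(
        "NCIT:PLACEHOLDER_DETAILED",
        "Differentiated Thyroid Gland Carcinoma 55 Years and Older AJCC v8 Stage",
    ),
)
print(json.dumps(disease, indent=2))
```

Putting the general term first keeps it easy for consumers that only understand "Stage IV" to find it, while the detailed term stays available to those who can interpret the AJCC-specific stage.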