Closed jharenza closed 3 years ago
Is this still a to-do ticket, or does the base histology already take care of this? The base histology file seems to already have the "Updated CNS Region" terms assigned.
The direct matches in the ticket don't cover cases where primary_site values can be assigned to multiple Updated_CNS_regions; these are all "Other" in base histology:
primary_site |
---|
Cerebellum/Posterior Fossa;Temporal Lobe;Thalamus |
Optic Pathway;Temporal Lobe |
Optic Pathway;Suprasellar/Hypothalamic/Pituitary;Temporal Lobe |
Cerebellum/Posterior Fossa;Spinal Cord- Thoracic |
Basal Ganglia;Optic Pathway;Temporal Lobe;Ventricles |
Parietal Lobe;Ventricles |
Skull;Temporal Lobe |
Cerebellum/Posterior Fossa;Spinal Cord- Cervical;Spinal Cord- Lumbar/Thecal Sac;Spinal Cord- Thoracic |
Cerebellum/Posterior Fossa;Frontal Lobe |
Brain Stem- Pons;Cerebellum/Posterior Fossa |
For these, let's annotate as "Mixed", knowing they need pathology review from Cassie for a primary site region designation.
Converting all of the data extraction to DBT will mean that CNS_region will no longer be generated in the base histology file from the D3b end. However, we can add this, as you suggest, to our base histology preparation following QC.
Closing, as this is now a part of the intermediate D3b workflow prior to release of pbta-histologies-base.tsv, per this comment.
@kgaonkar6 I am going to re-open this, as I think the logic was not captured entirely as anticipated for #849. For instance, some samples which should be mixed are annotated as midline or hemispheric. We can update this in #v19.
One other thing to note is that for our "Mixed" samples, @jainpayal022 and Cassie Kline will be manually reviewing pathology and imaging reports of about 70 or so HGAT samples to assess the primary site of origin. We will then add these manually curated values for those CNS_regions. See this ticket.
Adding the code for CNS_region matching here, for the record: https://github.com/d3b-center/D3b-codes/blob/master/OpenPBTA_v19_release_QC/QC_histology_v19.Rmd
Just wanted to add the primary site changes from the latest pulls, to guide updates to the above primary_site ~ CNS_region matches:
Kids_First_Biospecimen_ID | primary_site_latest | primary_site_previous |
---|---|---|
BS_1Q524P3B | L. Pons Anterior | Pons/Brainstem |
BS_22VCR7DF | L. Lateral Pons | Pons/Brainstem |
BS_5968GBGT | R. Posterior Pons; Adjacent #6 | Pons/Brainstem |
BS_AF5D41PD | L. Frontal Periventricular White Matter; Adjacent #3 | Pons/Brainstem |
BS_AK9BV52G | Cerebellar White Matter Adjacent Necrosis + Medulla | Pons/Brainstem |
BS_D6STCMQS | L. Anterior Medulla | Pons/Brainstem |
BS_EE73VE7V | R. Inferior Pons | Pons/Brainstem |
BS_HYKV2TH9 | R. Anterior Pons; Adjacent #7 | Pons/Brainstem |
BS_J8EH1N7V | Inferior Pons | Pons/Brainstem |
BS_J8EK6RNF | Brain Stem-Medulla;Brain Stem- Midbrain/Tectum;Brain Stem- Pons;Cerebellum/Posterior Fossa;Thalamus | Pons/Brainstem |
BS_X5VN0FW0 | Inferior Medulla | Pons/Brainstem |
BS_Y74XAFJX | Superior Pons | Pons/Brainstem |
BS_YHXMYDBN | L. Pons | Pons/Brainstem |
I'm not sure where exactly we will be using CNS_regions so just wondering if we have an updated dictionary matching primary_site to CNS_region to incorporate the new primary_sites assigned to PNOC autopsy samples? Currently the CNS_region is being assigned as NA since it doesn't match any terms in the issue description.
Thank you for checking on this @kgaonkar6 - I had the answers from Cassie, but never posted here. See below. The only one in question is Cerebellar White Matter Adjacent Necrosis + Medulla, so for now, we can designate it as Mixed, unless there is other data available to confirm Midline:
Kids_First_Biospecimen_ID | primary_site_latest | primary_site_previous | CNS_region |
---|---|---|---|
BS_1Q524P3B | L. Pons Anterior | Pons/Brainstem | Midline |
BS_22VCR7DF | L. Lateral Pons | Pons/Brainstem | Midline |
BS_5968GBGT | R. Posterior Pons; Adjacent #6 | Pons/Brainstem | Midline |
BS_AF5D41PD | L. Frontal Periventricular White Matter; Adjacent #3 | Pons/Brainstem | Mixed |
BS_AK9BV52G | Cerebellar White Matter Adjacent Necrosis + Medulla | Pons/Brainstem | Mixed |
BS_D6STCMQS | L. Anterior Medulla | Pons/Brainstem | Midline |
BS_EE73VE7V | R. Inferior Pons | Pons/Brainstem | Midline |
BS_HYKV2TH9 | R. Anterior Pons; Adjacent #7 | Pons/Brainstem | Midline |
BS_J8EH1N7V | Inferior Pons | Pons/Brainstem | Midline |
BS_J8EK6RNF | Brain Stem-Medulla;Brain Stem- Midbrain/Tectum;Brain Stem- Pons;Cerebellum/Posterior Fossa;Thalamus | Pons/Brainstem | Midline |
BS_X5VN0FW0 | Inferior Medulla | Pons/Brainstem | Midline |
BS_Y74XAFJX | Superior Pons | Pons/Brainstem | Midline |
BS_YHXMYDBN | L. Pons | Pons/Brainstem | Midline |
Thank you for the update!
Confirmed this will remain Mixed
Cerebellar White Matter Adjacent Necrosis + Medulla, so for now, we can designate as Mixed
Previously, the first step of CNS_region assignment was to assign "Mixed" if there are multiple values separated by ";", and then to check whether all individual values are part of one CNS_region. So primary_site values like R. Posterior Pons; Adjacent #6, R. Anterior Pons; Adjacent #7, and Brain Stem-Medulla;Brain Stem- Midbrain/Tectum;Brain Stem- Pons;Cerebellum/Posterior Fossa;Thalamus will be assigned Mixed.
https://github.com/d3b-center/D3b-codes/blob/20210212-release/OpenPBTA_v20_release_QC/code/util/primary_site_matched_CNS_region.R Do we want to update the code or make these exceptions, since "Adjacent" is not a brain location per se?
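For readers following along, here is a rough Python sketch of the split-then-check rule described in this comment. It is not the linked R utility: the function name `assign_cns_region` and the small `SITE_TO_REGION` lookup are hypothetical and only illustrate why a value such as "Adjacent #6" pushes a sample to Mixed.

```python
from typing import Mapping, Optional


def assign_cns_region(primary_site: str, site_to_region: Mapping[str, str]) -> Optional[str]:
    """Return a CNS_region for one primary_site string, or None (NA) if nothing matches."""
    # Split on ";" and drop empty pieces and surrounding whitespace.
    parts = [p.strip() for p in primary_site.split(";") if p.strip()]
    if not parts:
        return None
    regions = [site_to_region.get(p) for p in parts]
    if len(parts) == 1:
        # Single value: direct lookup, None when the term is unknown.
        return regions[0]
    # Multiple values: keep a region only if every piece maps to the same one.
    if None not in regions and len(set(regions)) == 1:
        return regions[0]
    return "Mixed"


# Hypothetical, partial lookup for illustration only (not the real dictionary).
SITE_TO_REGION = {
    "R. Posterior Pons": "Midline",
    "R. Anterior Pons": "Midline",
    "Inferior Pons": "Midline",
    "Superior Pons": "Midline",
}

for site in ["Inferior Pons", "Inferior Pons;Superior Pons", "R. Posterior Pons; Adjacent #6"]:
    print(site, "->", assign_cns_region(site, SITE_TO_REGION))
```

Under these assumptions, "R. Posterior Pons; Adjacent #6" comes out as Mixed only because "Adjacent #6" is not a known term, which is the behavior being questioned here.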
To be honest, I was thinking about this and think that whoever ingests this into the BRP should match these regions with CBTN terms so we don't always have to update the terms. Let me check in with Jenn Mason and Shannon Robbins.
In the meantime, I was envisioning these going into the JSON file for Midline.
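One way to read that suggestion, sketched in Python: list the full primary_site strings under Midline in the JSON file and match them as-is before any splitting on ";". The JSON layout (region name mapped to a list of strings) and the names `REGION_JSON`, `build_exact_lookup`, and `region_for` are assumptions for illustration; the real file may be structured differently.

```python
import json

# Hypothetical JSON layout: region name -> primary_site strings matched as-is.
REGION_JSON = """
{
  "Midline": [
    "R. Posterior Pons; Adjacent #6",
    "R. Anterior Pons; Adjacent #7",
    "L. Pons Anterior",
    "Inferior Pons"
  ]
}
"""


def build_exact_lookup(region_json: str) -> dict:
    """Invert {region: [sites]} into {site: region} for whole-string matching."""
    region_to_sites = json.loads(region_json)
    return {site: region for region, sites in region_to_sites.items() for site in sites}


EXACT = build_exact_lookup(REGION_JSON)


def region_for(primary_site: str):
    """Exact match on the full string; None means fall through to the usual rule."""
    return EXACT.get(primary_site.strip())


print(region_for("R. Posterior Pons; Adjacent #6"))
print(region_for("Adjacent #6"))
```

Matching the whole string first would keep the "Adjacent" samples Midline without teaching the splitter that "Adjacent #6" is a location.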
Just need a confirmation:
BS_J8EK6RNF seems to have primary_site "Pons" in the latest histology file pulls (since the 20210126-data release) instead of the "Brain Stem-Medulla;Brain Stem- Midbrain/Tectum;Brain Stem- Pons;Cerebellum/Posterior Fossa;Thalamus" in the table above. Do we need to confirm this, @jharenza?
Let me check on this. Are the other samples matching primary_site_latest?
yes
Ok, it looks like Brain Stem-Medulla;Brain Stem- Midbrain/Tectum;Brain Stem- Pons;Cerebellum/Posterior Fossa;Thalamus was never a value in @jainpayal022's Excel sheet for the Kids First import; the value there was Pons. Waiting for her to confirm that this longer string was a mistake introduced somewhere along the way.
confirmed that this value should be Pons
we added this upstream during QC
What are the scientific goals of the analysis?
Add CNS_region to pbta-histologies.tsv using logic utilized on CHOP D3b end
What methods do you plan to use to accomplish the scientific goals?
Regions: Hemispheric, Midline, Spine, Ventricles, Posterior fossa, Optic pathway, Suprasellar, Other
Reference notion doc from @baileyckelly
What input data are required for this analysis?
pbta-histologies.tsv
How long do you expect is needed to complete the analysis? Will it be a multi-step analysis?
0.5 day
Who will complete the analysis (please add a GitHub handle here if relevant)?
@kgaonkar6
What relevant scientific literature relates to this analysis?
NA, informed by physicians: Angela Waanders and Cassie Kline