Closed jharenza closed 2 years ago
Ok, I perhaps need to rerun the LGAT subtyping module as well.
@jaclyn-taroni I need some help. I tried rerunning molecular subtyping for LGAT, but I am running into errors. First, in ce8dbd2, I am updating the 01 script. The run was being killed at the rbind step for the consensus and hotspot MAFs, so I reordered the code to pull LGAT samples out of these files upon reading so that they aren't so big. That worked, but 03 is now giving an error at chunk 7, when making the TxDb from the GTF for FGFR1. I saw some possibly related tickets suggesting this may be due to unstable RefSeq files? I am not sure what to do here.
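The reordering described above (subsetting to the samples of interest as each file is read, rather than combining the full files first) can be sketched roughly as follows. This is a minimal Python sketch using only the standard library, not the module's actual R code. The function names, the toy file contents, and the sample IDs are hypothetical; `Tumor_Sample_Barcode` and `Hugo_Symbol` are standard MAF columns, but whether the 01 script filters on that column is an assumption.

```python
import csv
import io


def read_maf_subset(handle, sample_ids, sample_col="Tumor_Sample_Barcode"):
    """Stream a MAF-like TSV and yield only rows for the requested samples."""
    # Skip leading '#' comment lines (e.g. '#version 2.4') before the header.
    data_lines = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(data_lines, delimiter="\t")
    for row in reader:
        if row[sample_col] in sample_ids:
            yield row


def combine_subsets(handles, sample_ids):
    """Combine the already-filtered rows, so the large files are never held whole."""
    combined = []
    for handle in handles:
        combined.extend(read_maf_subset(handle, sample_ids))
    return combined


# Toy stand-ins for the consensus and hotspot MAF files (invented contents).
consensus = io.StringIO(
    "#version 2.4\n"
    "Hugo_Symbol\tTumor_Sample_Barcode\n"
    "BRAF\tBS_1\n"
    "TP53\tBS_2\n"
)
hotspot = io.StringIO(
    "Hugo_Symbol\tTumor_Sample_Barcode\n"
    "FGFR1\tBS_1\n"
    "H3F3A\tBS_3\n"
)
lgat_ids = {"BS_1"}  # hypothetical LGAT sample IDs

rows = combine_subsets([consensus, hotspot], lgat_ids)
print([(r["Hugo_Symbol"], r["Tumor_Sample_Barcode"]) for r in rows])
```

Filtering inside the reader keeps peak memory proportional to the subset rather than to the full files, which is the point of doing the subsetting before the combine step.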
Closing this and will start fresh once some of the code updates are merged.
Purpose/implementation Section
What scientific question is your analysis addressing?
This updates the histologies file with MB WGS samples, which were previously missed, as "To be classified".
What was your approach?
"MB, To be classified" instead of "To be classified"
"LGG, subtype" --> "SEGA, subtype"
base-histologies.tsv from v21 and reran molecular-subtype-integrate to get pbta-histologies.tsv.
What GitHub issue does your pull request address?
1207
Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.
Which areas should receive a particularly close look?
Is there anything that you want to discuss further?
Notes:
pbta-histology-base.tsv changes daily, and when rerunning using what I thought might be the base @kgaonkar6 used, I realized this module was recently rerun in this PR by @runjin326, perhaps using a more recent version of the file. Therefore, it looks like there are many more diffs than there really should be here.
pbta-histologies.tsv minus the columns for harmonized_diagnosis and cancer_group. I put this bit of code at the very top of the script, but I think we may want to comment it out before merge? Similarly, I also added some QC into this, but maybe it is fine because this is the last(?) version of the histologies file for OpenPBTA?
broad_histology with the code we had in place (also in 01-integrate-subtyping.nb.html):
The idea behind this separation into cancer groups before was to visualize the smaller groups within the oncoprint. The main takeaway, though, is that because there were a handful of pilocytic and pleomorphic (PXA) samples not in the Low-grade glioma astrocytoma cancer_group, and there were some SEGA in the Low-grade glioma astrocytoma cancer_group, the analyses are not performed on the exact cohort of interest, so this is not an easy fix by simply recoding the v22 cancer_group back to v21. I also realized that Ganglioglioma is already its own cancer group and has a high enough N, so it is in many plots already, but it was missed in the survival LGG_group.
I suppose my thoughts from all of this are that if we have to remake figures anyway, it probably makes sense to keep the cancer group code as it was added by @kgaonkar6; we may have to make a few more colors in the palette, and update survival to use the relevant cancer groups within LGG. 😭
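To illustrate why recoding the v22 cancer_group back to v21 is not a simple relabel, here is a minimal Python sketch that cross-tabulates the two versions per sample and flags new groups that map back to more than one old group. It is an illustration under stated assumptions, not code from this PR: the ID column `Kids_First_Biospecimen_ID` is assumed to be the join key, the function names are hypothetical, and the toy rows and their group labels are invented for the example.

```python
from collections import Counter, defaultdict

ID_COL = "Kids_First_Biospecimen_ID"  # assumed join key
GROUP_COL = "cancer_group"


def crosstab_groups(old_rows, new_rows):
    """Count (old_group, new_group) pairs for samples present in both versions."""
    old_by_id = {row[ID_COL]: row[GROUP_COL] for row in old_rows}
    counts = Counter()
    for row in new_rows:
        old_group = old_by_id.get(row[ID_COL])
        if old_group is not None:
            counts[(old_group, row[GROUP_COL])] += 1
    return counts


def ambiguous_new_groups(counts):
    """Return new groups whose samples came from more than one old group."""
    old_per_new = defaultdict(set)
    for old_group, new_group in counts:
        old_per_new[new_group].add(old_group)
    return {new: olds for new, olds in old_per_new.items() if len(olds) > 1}


# Invented toy rows; the labels are placeholders, not real v21/v22 values.
v21 = [
    {ID_COL: "BS_1", GROUP_COL: "Low-grade glioma astrocytoma"},
    {ID_COL: "BS_2", GROUP_COL: "Other (toy)"},
    {ID_COL: "BS_3", GROUP_COL: "Low-grade glioma astrocytoma"},
]
v22 = [
    {ID_COL: "BS_1", GROUP_COL: "Pilocytic astrocytoma"},
    {ID_COL: "BS_2", GROUP_COL: "Pilocytic astrocytoma"},
    {ID_COL: "BS_3", GROUP_COL: "SEGA"},
]

counts = crosstab_groups(v21, v22)
ambiguous = ambiguous_new_groups(counts)
print(sorted(ambiguous))
```

Any group reported by `ambiguous_new_groups` cannot be mapped back with a one-to-one lookup, so a per-sample join against the older file would be needed instead.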
Is the analysis in a mature enough form that the resulting figure(s) and/or table(s) are ready for review?
no, but we need to discuss next steps
Results
What types of results are included (e.g., table, figure)?
What is your summary of the results?
Reproducibility Checklist
Documentation Checklist
README and it is up to date.
analyses/README.md and the entry is up to date.