Closed runjin326 closed 2 years ago
@migbro let's also update the description as below:
Current:
[Pediatric Open Targets (V10)](https://kf-strides-cbioportal-qa.kidsfirstdrc.org/study?id=ped_opentargets_2021)
Pediatric Open Targets is a collaborative project between the National Cancer Institute and the Children's Hospital of Philadelphia. Through this project and as part of the NCI's Childhood Cancer Data Initiative, we are utilizing the harmonization work of the KidsFirst Data Resource Center and analytics work of OpenPBTA to build a pediatric preclinical pediatric platform to assist in development and query of the FDA's Relevant Molecular Targets List to identify new therapeutics for children with cancer. For updates, please see here: [Release Notes](https://tinyurl.com/55cxz9am)
New:
[Open Pediatric Cancer (OpenPedCan) Project](https://kf-strides-cbioportal-qa.kidsfirstdrc.org/study?id=ped_opentargets_2021)
[OpenPedCan](https://github.com/PediatricOpenTargets/OpenPedCan-analysis) is a collaborative project between the National Cancer Institute and the Children's Hospital of Philadelphia as part of the NCI's Childhood Cancer Data Initiative. Here, we harmonize pan-cancer data using [KidsFirst Data Resource Center](https://kidsfirstdrc.org/) workflows and harness [OpenPBTA](https://github.com/AlexsLemonade/OpenPBTA-analysis) analytics workflows to scale and add modules across pediatric cancer datasets. This data has been integrated into the pediatric open targets platform to assist in development and query of the FDA's Relevant Pediatric Molecular Targets List (PMTL) to identify new therapeutics for children with cancer. For study release details, please see [Release Notes](https://tinyurl.com/55cxz9am).
Let's also change the study name from `ped_opentargets_2021` to `open_ped_can`.
Within your Notion page, we should capture all details of the repo release, data release, etc. Just another FYI as well - @taylordm said we can only add the portal link once it is live, so we will have to do another update later.
Spotted by @adamcresnick - the two fusion genes are not both displayed at the patient level, and as a result the OncoKB designation is missing.
OpenPedCan: https://kf-strides-cbioportal-qa.kidsfirstdrc.org/patient?studyId=ped_opentargets_2021&caseId=PT_GQZ84ACS
This is also happening across the board with our studies. OpenPBTA: https://pedcbioportal.kidsfirstdrc.org/patient?studyId=openpbta&caseId=PT_1J2DT6MM
This is not happening in cbio: https://www.cbioportal.org/patient?sampleId=P-0001453-T01-IM3&studyId=blca_nmibc_2017
or with PPTC: https://pedcbioportal.kidsfirstdrc.org/patient?studyId=pptc&caseId=P0163
Seems like a `--` issue: is pedcbio parsing the fusion name using the hyphen...?
Ok, so I tested out using only a single hyphen on a smaller KF project.
Before (using prod): https://pedcbioportal.kidsfirstdrc.org/patient?sampleId=PAUMTZ-09A-01&studyId=aml_sd_pet7q6f2_2018
After (on QA): https://kf-strides-cbioportal-qa.kidsfirstdrc.org/patient?sampleId=PAUMTZ-09A-01&studyId=aml_sd_pet7q6f2_2018
Seems to be an improvement, but I wonder if the repeat-entries hack to get both genes to be searchable has made things weird. I'll try removing that...
Actually, another issue is that if a gene symbol is not in their database, there is a chance the fusion might be skipped, as with `AC022145.2-MLLT10`. In the file it is:
AC022145.2 Fred Hutchinson Cancer Research Center PAUMTZ-09A-01 AC022145.2-MLLT10 no yes ARRIBA other
MLLT10 Fred Hutchinson Cancer Research Center PAUMTZ-09A-01 AC022145.2-MLLT10 no yes ARRIBA other
but when you look at how it loaded on QA, only the `MLLT10` line was used, and the gene name `AC022145.2` was ignored. So, perhaps the repeat lines are ok, but depending on how much time I have, this might be the best I can do.
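For anyone wanting to check this before a load, here is a rough sketch of flagging fusion rows whose gene symbol would likely be dropped. It assumes the file is tab-delimited with the gene in the first column, the sample ID in the third, and the fusion name in the fourth (as in the two lines above). The `known_symbols` set and the function name are hypothetical stand-ins; the portal's actual gene lookup is not shown in this thread.

```python
def find_unmatched_partners(rows, known_symbols):
    """Return (sample_id, fusion_name, gene) for rows whose gene is not a known symbol."""
    unmatched = []
    for line in rows:
        fields = line.rstrip("\n").split("\t")
        # Assumed column order: gene, center, sample ID, fusion name, ...
        gene, sample_id, fusion_name = fields[0], fields[2], fields[3]
        if gene not in known_symbols:
            unmatched.append((sample_id, fusion_name, gene))
    return unmatched


# The two rows from the example above, rebuilt with tabs (assumed delimiter).
rows = [
    "AC022145.2\tFred Hutchinson Cancer Research Center\tPAUMTZ-09A-01\tAC022145.2-MLLT10\tno\tyes\tARRIBA\tother",
    "MLLT10\tFred Hutchinson Cancer Research Center\tPAUMTZ-09A-01\tAC022145.2-MLLT10\tno\tyes\tARRIBA\tother",
]

# Hypothetical stand-in for whatever symbols the portal database recognizes.
known_symbols = {"MLLT10"}

flagged = find_unmatched_partners(rows, known_symbols)
print(flagged)
```

With these inputs only the `AC022145.2` row is flagged, which matches what was seen on QA.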
Just a comment as an update. Aside from the hyphen issue, for which we will adopt the single `-` separator to fix the display, the hack mentioned above seems useful to keep fusions that involve a gene and an intergenic region. It's an "ancient" problem that we can try and tackle better in the near future, but this is at least an improvement.
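As a rough illustration of the two pieces together (single `-` display name plus one repeated row per partner), something like the sketch below could work. It assumes the upstream fusion name uses `--` between partners; `expand_fusion` is a hypothetical helper, not the actual load script.

```python
def expand_fusion(fusion_name, old_sep="--", new_sep="-"):
    """Return one (gene, display_name) pair per fusion partner.

    Partners are split on the assumed upstream separator (old_sep) and the
    display name is rebuilt with the single-hyphen separator (new_sep).
    """
    partners = fusion_name.split(old_sep)
    display_name = new_sep.join(partners)
    return [(gene, display_name) for gene in partners]


expanded = expand_fusion("AC022145.2--MLLT10")
for gene, display_name in expanded:
    print(gene, display_name, sep="\t")
```

Splitting has to happen on `--` before the name is rewritten: some gene symbols contain a hyphen themselves (e.g. `HLA-A`), so a name that already uses single hyphens cannot always be split back into its partners unambiguously.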
https://kf-strides-cbioportal-qa.kidsfirstdrc.org/study/summary?id=ped_opentargets_2021 - looks good, sending to prod!
closing since these specific tasks have been completed
What data file(s) does this issue pertain to?
PedCBio v10 dataload
What release are you using?
v10
Put your question or report your issue here.
v10 load related issues:
OpenPedCan update after v10 is loaded to pedcbio:
Re-run SNV frequencies module to annotate the table with new cohort case IDs: https://github.com/PediatricOpenTargets/ticket-tracker/issues/230 - we will do this after v10
Additional issues:
Investigate SNV frequencies mismatch between pedCbio and MTP: https://github.com/PediatricOpenTargets/ticket-tracker/issues/284 - we will do this after v10
Note that the other tickets that showed up after searching PedCBio are not directly related to the pedcbio load; they are either documentation updates or module enhancements.