Planned new vignette to document expected discrepancies between the cBioPortal versions of genomic data and the genomic data on Synapse
Data structure returned for CNA and fusions
By default, cBioPortal filters out Silent, Intron, IGR, 3'UTR, 5'UTR, 3'Flank and 5'Flank, except for the promoter mutations of the TERT gene.
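As a rough illustration of the default filter described above, here is a minimal Python sketch. The function name and the example rows are hypothetical, and the TERT promoter exception is approximated by assuming such mutations are annotated as 5'Flank on TERT; the portal's actual import logic may differ.

```python
# Variant classifications that the default filter drops (taken from the note above).
EXCLUDED_CLASSES = {
    "Silent", "Intron", "IGR", "3'UTR", "5'UTR", "3'Flank", "5'Flank",
}


def kept_by_default_filter(hugo_symbol, variant_classification):
    """Return True if a mutation record would survive the default filter.

    Assumption: TERT promoter mutations are the 5'Flank records on TERT.
    """
    if variant_classification not in EXCLUDED_CLASSES:
        return True
    return hugo_symbol == "TERT" and variant_classification == "5'Flank"


# Hypothetical example rows: (gene symbol, variant classification).
rows = [
    ("TCF7L2", "Intron"),
    ("TERT", "5'Flank"),
    ("KRAS", "Missense_Mutation"),
    ("EGFR", "Silent"),
]
kept = [row for row in rows if kept_by_default_filter(*row)]
```

Under these assumptions, only the TERT and KRAS rows are kept, which is the kind of difference to expect when comparing an unfiltered Synapse file against the portal.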
From Ritika: I further investigated the MSK cases for TCF7L2. There is an isoform difference between what GENIE uses and what we use at MSK. We use the MSK isoform for clinical IMPACT, which reports the TCF7L2 variant as a protein-coding variant, whereas GENIE uses the UniProt isoform, under which this variant is an intron. Since you are using the NSCLC public GENIE data, I would use that isoform.
Regarding the RARA issue: the DFCI file uses the gene symbol CTD-2267D19.2, but the Entrez ID in the MAF (5914) maps to RARA. The portal gives preference to the Entrez ID over the Hugo symbol on import. The coordinates and chromosome location also all point to RARA.
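The Entrez-over-Hugo preference described above can be sketched as follows. This is only an illustration: the lookup table holds the single mapping mentioned in the note, and the function name is hypothetical rather than part of any portal API.

```python
# Illustrative lookup containing only the mapping mentioned in the note.
ENTREZ_TO_SYMBOL = {5914: "RARA"}


def resolve_gene(hugo_symbol, entrez_id, entrez_to_symbol=ENTREZ_TO_SYMBOL):
    """Prefer the gene implied by the Entrez ID; fall back to the Hugo symbol."""
    if entrez_id in entrez_to_symbol:
        return entrez_to_symbol[entrez_id]
    return hugo_symbol


# The DFCI record: the symbol says CTD-2267D19.2, the Entrez ID says RARA.
resolved = resolve_gene("CTD-2267D19.2", 5914)
```

With this rule the record resolves to RARA, so a comparison keyed on the Hugo symbol in the Synapse file would not match the portal's gene for that variant.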
For TERT, those are TERT promoter mutations. They are an exception to the filter rule because they are important biomarkers in multiple cancers. In the portal, you will notice they don’t have an amino acid change but carry the tag Promoter (https://genie-private.cbioportal.org/patient?studyId=nsclc_public_genie_bpc&caseId=GENIE-DFCI-034904#navCaseIds=nsclc_public_genie_bpc:GENIE-DFCI-034904).