waldronlab / curatedTCGAData

Curated Data From The Cancer Genome Atlas (TCGA) as MultiAssayExperiment Objects
https://bioconductor.org/packages/curatedTCGAData

Mutation data on BLCA study [Question] #58

Closed by bblodfon 1 year ago

bblodfon commented 1 year ago

Hi,

Please see the code snippet below. If I am interpreting the returned object correctly, only 130 patients in the BLCA study have mutation data. Judging by other sources, such as cBioPortal or the GDC, I would expect almost all patients in that study to have this data type. Is this a true discrepancy and, if so, do you know the reason for it?

library(curatedTCGAData)

cancer_data = curatedTCGAData::curatedTCGAData(diseaseCode = 'BLCA',
  assays = '*Mutation*', version = '2.0.1', dry.run = FALSE)
#> snapshotDate(): 2022-10-31
#> Working on: BLCA_Mutation-20160128
#> see ?curatedTCGAData and browseVignettes('curatedTCGAData') for documentation
#> loading from cache
#> require("RaggedExperiment")
#> Working on: BLCA_colData-20160128
#> see ?curatedTCGAData and browseVignettes('curatedTCGAData') for documentation
#> loading from cache
#> Working on: BLCA_metadata-20160128
#> see ?curatedTCGAData and browseVignettes('curatedTCGAData') for documentation
#> loading from cache
#> Working on: BLCA_sampleMap-20160128
#> see ?curatedTCGAData and browseVignettes('curatedTCGAData') for documentation
#> loading from cache
#> harmonizing input:
#>   removing 5187 sampleMap rows not in names(experiments)
#>   removing 282 colData rownames not in sampleMap 'primary'
print(cancer_data)
#> A MultiAssayExperiment object of 1 listed
#>  experiment with a user-defined name and respective class.
#>  Containing an ExperimentList class object of length 1:
#>  [1] BLCA_Mutation-20160128: RaggedExperiment with 39312 rows and 130 columns
#> Functionality:
#>  experiments() - obtain the ExperimentList instance
#>  colData() - the primary/phenotype DataFrame
#>  sampleMap() - the sample coordination DataFrame
#>  `$`, `[`, `[[` - extract colData columns, subset, or experiment
#>  *Format() - convert into a long or wide DataFrame
#>  assays() - convert ExperimentList to a SimpleList of matrices
#>  exportClass() - save data to flat files

Created on 2023-06-29 with reprex v2.0.2

LiNk-NY commented 1 year ago

These are not the same data. We obtain the Mutation data from this Broad GDAC archive:
http://gdac.broadinstitute.org/runs/stddata__2016_01_28/data/BLCA/20160128/gdac.broadinstitute.org_BLCA.Mutation_Packager_Calls.Level_3.2016012800.0.0.tar.gz

If you'd like to download cBioPortal data, use cBioPortalData (https://bioconductor.org/packages/cBioPortalData) or, for GDC data, GenomicDataCommons (https://bioconductor.org/packages/GenomicDataCommons).
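
To see how far two sources differ, one option is to reduce sample barcodes to patient IDs and compare the resulting sets. The following is a minimal base-R sketch, not part of curatedTCGAData: the barcodes are made up, `patient_id` is a hypothetical helper, and it assumes the usual TCGA convention that the first 12 characters of a barcode identify the patient. In practice the first vector would presumably come from something like `colnames(cancer_data[[1]])` and the second from the other source's sample list.

```r
# Hypothetical helper: keep the first 12 characters of a TCGA barcode
# (e.g. "TCGA-XX-0001"), which by convention identify the patient.
patient_id <- function(barcodes) substr(barcodes, 1, 12)

# Made-up sample barcodes standing in for the columns of the mutation assay.
mutation_samples <- c(
  "TCGA-XX-0001-01A-11D-A000-08",
  "TCGA-XX-0002-01A-11D-A000-08",
  "TCGA-XX-0002-01B-11D-A000-08"  # second sample from the same patient
)

# Made-up sample IDs standing in for another source (e.g. a portal export).
other_samples <- c("TCGA-XX-0001-01", "TCGA-XX-0002-01", "TCGA-XX-0003-01")

# Collapse samples to unique patients in each source.
mutation_patients <- unique(patient_id(mutation_samples))
other_patients <- unique(patient_id(other_samples))

# Patients present in the other source but absent from the mutation assay.
missing_here <- setdiff(other_patients, mutation_patients)

cat("Patients with mutation data:", length(mutation_patients), "\n")
cat("Patients only in the other source:", length(missing_here), "\n")
```

Counting unique patients rather than columns matters because one patient can contribute more than one sample, so the column count of the assay is an upper bound on the number of patients.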

bblodfon commented 1 year ago

Thanks Marcel!