waldronlab / cBioPortalData

Integrate the cancer genomics portal, cBioPortal, using MultiAssayExperiment

Error in retrieving mutation data #74

Closed twohigjd closed 9 months ago

twohigjd commented 9 months ago

When I try to retrieve mutation data using molecularData(), I get a 404 error. Other molecular profiles load fine, but the mutation profiles always fail with this error.

library(cBioPortalData)

# initiate API
cbio <- cBioPortal()
# retrieve clinical data
clin_data <- clinicalData(cbio, "prad_pik3r1_msk_2021")
# retrieve molecular profile metadata for the study
mol_data <- molecularProfiles(cbio, "prad_pik3r1_msk_2021")
# list available molecular profiles
mol_data[["molecularProfileId"]]
[1] "prad_pik3r1_msk_2021_cna"                 "prad_pik3r1_msk_2021_mutations"           "prad_pik3r1_msk_2021_structural_variants"
# retrieve CNA data
cna_data <- molecularData(cbio,
                             molecularProfileIds = c("prad_pik3r1_msk_2021_cna"),
                             entrezGeneIds = c("5925", "7157", "5728"),
                             sampleIds = clin_data$sampleId)

# retrieve mutation data
mutation_data <- molecularData(cbio,
                               molecularProfileIds = c("prad_pik3r1_msk_2021_mutations"),
                               entrezGeneIds = c("5925", "7157", "5728"),
                               sampleIds = clin_data$sampleId)
Error in .invoke_fun(api, name, use_cache, ...) : Not Found (HTTP 404).

I've tried with other studies but get the same error.

> mol_data[["molecularProfileId"]]
[1] "prad_p1000_cna"                 "prad_p1000_mutations"           "prad_p1000_structural_variants"
> cna_data <- molecularData(cbio,
+                              molecularProfileIds = c("prad_p1000_cna"),
+                              entrezGeneIds = c("5925", "7157", "5728"),
+                              sampleIds = clin_data$sampleId)
> # retrieve mutation data
> mutation_data <- molecularData(cbio,
+                                molecularProfileIds = c("prad_p1000_mutations"),
+                                entrezGeneIds = c("5925", "7157", "5728"),
+                                sampleIds = clin_data$sampleId)
Error in .invoke_fun(api, name, use_cache, ...) : Not Found (HTTP 404).


LiNk-NY commented 9 months ago

Hi @twohigjd

Mutation data is served from a different endpoint, so you should use the mutationData() function instead of molecularData().
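
As a minimal sketch of that suggestion, the failing call from the original post could be rewritten roughly as below. The wrapper name `fetch_mutations` is hypothetical (it is not part of cBioPortalData), and the `"<studyId>_mutations"` profile naming is an assumption based on the profile IDs listed above; the argument names mirror the ones used with molecularData().

```r
# Sketch: fetch mutation calls via mutationData() rather than molecularData().
# `fetch_mutations` is a hypothetical helper, not a cBioPortalData function.
fetch_mutations <- function(api, study_id, entrez_ids) {
  # Sample IDs come from the study's clinical data, as in the original post
  clin_data <- cBioPortalData::clinicalData(api, study_id)

  # Assumption: the mutation profile is named "<studyId>_mutations",
  # matching the molecularProfiles() output shown above
  profile_id <- paste0(study_id, "_mutations")

  cBioPortalData::mutationData(
    api,
    molecularProfileIds = profile_id,
    entrezGeneIds = entrez_ids,
    sampleIds = unique(clin_data$sampleId)
  )
}

# Usage (requires cBioPortalData and network access to the portal):
# cbio <- cBioPortalData::cBioPortal()
# mutation_data <- fetch_mutations(
#   cbio, "prad_pik3r1_msk_2021", c("5925", "7157", "5728")
# )
```

If a study names its mutation profile differently, pass the ID reported by molecularProfiles() directly to mutationData() instead of relying on the naming assumption.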

I always recommend trying the simple interface first, which would look like:

cBioPortalData(
  api = cbio, genes = c("5925", "7157", "5728"), by = "entrezGeneId",
  studyId = "prad_pik3r1_msk_2021",
  molecularProfileIds = c(
    "prad_pik3r1_msk_2021_cna", "prad_pik3r1_msk_2021_mutations", "prad_pik3r1_msk_2021_structural_variants"
  )
)

and returns:

A MultiAssayExperiment object of 2 listed
 experiments with user-defined names and respective classes.
 Containing an ExperimentList class object of length 2:
 [1] prad_pik3r1_msk_2021_mutations: RangedSummarizedExperiment with 375 rows and 501 columns
 [2] prad_pik3r1_msk_2021_cna: SummarizedExperiment with 3 rows and 1417 columns
 experiments() - obtain the ExperimentList instance
 colData() - the primary/phenotype DataFrame
 sampleMap() - the sample coordination DataFrame
 `$`, `[`, `[[` - extract colData columns, subset, or experiment
 *Format() - convert into a long or wide DataFrame
 assays() - convert ExperimentList to a SimpleList of matrices
 exportClass() - save data to flat files

You can check the metadata of the returned object for any data that could not be mapped to the patient identifiers.
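
One way to look at that metadata, sketched under the assumption that the result of the simple interface was saved to a variable (called `mae` here, a name not used in the thread). The helper `list_metadata_entries` is hypothetical; it only relies on the generic metadata() accessor from S4Vectors, and which entries actually appear depends on the study.

```r
# Hypothetical helper: summarise what a MultiAssayExperiment carries in its
# metadata slot, where data that could not be mapped may be stored.
list_metadata_entries <- function(mae) {
  md <- S4Vectors::metadata(mae)
  # Return the first class of each metadata entry, keeping any entry names
  vapply(md, function(x) class(x)[1], character(1))
}

# Usage (requires cBioPortalData and network access to the portal):
# cbio <- cBioPortalData::cBioPortal()
# mae <- cBioPortalData::cBioPortalData(
#   api = cbio, genes = c("5925", "7157", "5728"), by = "entrezGeneId",
#   studyId = "prad_pik3r1_msk_2021",
#   molecularProfileIds = c(
#     "prad_pik3r1_msk_2021_cna", "prad_pik3r1_msk_2021_mutations"
#   )
# )
# list_metadata_entries(mae)
```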

Best, Marcel

twohigjd commented 9 months ago

Okay, TYSM!