waldronlab / cBioPortalData

Integrate the cancer genomics portal, cBioPortal, using MultiAssayExperiment
https://waldronlab.io/cBioPortalData/

Error in retrieving mutation data #74

Closed twohigjd closed 11 months ago

twohigjd commented 11 months ago

When I try to retrieve mutation data using molecularData(), I get an HTTP 404 error. Other molecular profiles (for example, CNA) load fine, but every mutation profile fails with the same error.

library(cBioPortalData)
library(annotables)
library(tidyverse)
# initiate API
cbio <- cBioPortal()
# retrieve clinical data
clin_data <- clinicalData(cbio, "prad_pik3r1_msk_2021") 
# retrieve molecular profile data
mol_data <- molecularProfiles(cbio, c("prad_pik3r1_msk_2021")) 
# list available molecular profiles
mol_data[["molecularProfileId"]]
[1] "prad_pik3r1_msk_2021_cna"                 "prad_pik3r1_msk_2021_mutations"           "prad_pik3r1_msk_2021_structural_variants"
# retrieve CNA data
cna_data <- molecularData(cbio,
                             molecularProfileIds = c("prad_pik3r1_msk_2021_cna"),
                             entrezGeneIds = c("5925", "7157", "5728"),
                             sampleIds = clin_data$sampleId)

# retrieve mutation data
mutation_data <- molecularData(cbio,
                               molecularProfileIds = c("prad_pik3r1_msk_2021_mutations"),
                               entrezGeneIds = c("5925", "7157", "5728"),
                               sampleIds = clin_data$sampleId)
Error in .invoke_fun(api, name, use_cache, ...) : Not Found (HTTP 404).

I've tried with other studies but get the same error.

> mol_data[["molecularProfileId"]]
[1] "prad_p1000_cna"                 "prad_p1000_mutations"           "prad_p1000_structural_variants"
> cna_data <- molecularData(cbio,
+                              molecularProfileIds = c("prad_p1000_cna"),
+                              entrezGeneIds = c("5925", "7157", "5728"),
+                              sampleIds = clin_data$sampleId)
> 
> # retrieve mutation data
> mutation_data <- molecularData(cbio,
+                                molecularProfileIds = c("prad_p1000_mutations"),
+                                entrezGeneIds = c("5925", "7157", "5728"),
+                                sampleIds = clin_data$sampleId)
Error in .invoke_fun(api, name, use_cache, ...) : Not Found (HTTP 404).

Thanks!

LiNk-NY commented 11 months ago

Hi @twohigjd

Mutation data is served from a different endpoint, so you should use the mutationData() function instead of molecularData().
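A minimal sketch of that call, reusing the study and genes from the report above. It assumes that mutationData() takes the same argument names as molecularData() and that numeric Entrez IDs are accepted; neither is confirmed in this thread, so check ?mutationData before relying on it. It also needs network access to the public cBioPortal API.

```r
library(cBioPortalData)

# Connect to the public cBioPortal API (requires network access)
cbio <- cBioPortal()

# Clinical data supplies the sample identifiers for the study
clin_data <- clinicalData(cbio, "prad_pik3r1_msk_2021")

# Mutations live on their own endpoint, so call mutationData() rather than
# molecularData(). Assumption: the argument names mirror molecularData().
mutation_data <- mutationData(
  api = cbio,
  molecularProfileIds = "prad_pik3r1_msk_2021_mutations",
  entrezGeneIds = c(5925, 7157, 5728),
  sampleIds = unique(clin_data$sampleId)
)

# Inspect the top-level structure of whatever comes back
str(mutation_data, max.level = 1)
```

The CNA call from the original report can stay on molecularData(); only the "_mutations" profile needs the separate function.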

I always recommend trying the simple interface first, which would look like:

cBioPortalData(
  api = cbio,
  genes = c("5925", "7157", "5728"),
  by = "entrezGeneId",
  studyId = "prad_pik3r1_msk_2021",
  molecularProfileIds = c(
    "prad_pik3r1_msk_2021_cna",
    "prad_pik3r1_msk_2021_mutations",
    "prad_pik3r1_msk_2021_structural_variants"
  )
)

which returns:

A MultiAssayExperiment object of 2 listed
 experiments with user-defined names and respective classes.
 Containing an ExperimentList class object of length 2:
 [1] prad_pik3r1_msk_2021_mutations: RangedSummarizedExperiment with 375 rows and 501 columns
 [2] prad_pik3r1_msk_2021_cna: SummarizedExperiment with 3 rows and 1417 columns
Functionality:
 experiments() - obtain the ExperimentList instance
 colData() - the primary/phenotype DataFrame
 sampleMap() - the sample coordination DataFrame
 `$`, `[`, `[[` - extract colData columns, subset, or experiment
 *Format() - convert into a long or wide DataFrame
 assays() - convert ExperimentList to a SimpleList of matrices
 exportClass() - save data to flat files

You can check the metadata of the returned object for any data that could not be mapped to the patient identifiers.
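One way to look at that metadata is sketched below. The variable names mae and md are illustrative, and the names of the metadata entries are not stated in this thread, so the sketch only lists whatever is present rather than assuming a particular element. It requires network access.

```r
library(cBioPortalData)

cbio <- cBioPortal()

# Same request as above, stored so the result can be inspected
mae <- cBioPortalData(
  api = cbio,
  genes = c("5925", "7157", "5728"),
  by = "entrezGeneId",
  studyId = "prad_pik3r1_msk_2021",
  molecularProfileIds = c(
    "prad_pik3r1_msk_2021_cna",
    "prad_pik3r1_msk_2021_mutations",
    "prad_pik3r1_msk_2021_structural_variants"
  )
)

# Experiments that were successfully built into the MultiAssayExperiment
names(mae)

# The metadata slot is a plain list; list its entries before digging in
md <- S4Vectors::metadata(mae)
names(md)
str(md, max.level = 1)
```

Comparing names(mae) with the requested molecularProfileIds shows which profiles did not become experiments; the metadata entries are the place to look for them.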

Best, Marcel

twohigjd commented 11 months ago

Okay, TYSM!