Bioconductor / GenomicDataCommons

Provide R access to the NCI Genomic Data Commons portal.
http://bioconductor.github.io/GenomicDataCommons/

about filter Metastatic sample #112

Closed yinchaoyi closed 10 months ago

yinchaoyi commented 10 months ago

response <- cases() %>%
  filter(~ project.project_id == 'TCGA-BRCA' & samples.sample_type == 'Metastatic') %>%
  GenomicDataCommons::select(c(default_fields(cases()), 'samples.sample_type')) %>%
  response_all()

When I run this code, I get only 7 samples (e.g., TCGA-BH-A1FE). However, the clinical information (XML file) for some of them, for example TCGA-BH-A1FE, does not list a site of metastasis.

LiNk-NY commented 10 months ago

Hi @yinchaoyi

When filtering for "metastatic", I see only 7 samples listed in the Genomic Data Commons data portal, so this number seems correct. To look further, get the IDs of the 7 patients and check the diagnoses table in their clinical data.

library(GenomicDataCommons)
ids <- cases() |> 
  filter(~ project.project_id == "TCGA-BRCA" & samples.sample_type == "metastatic") |> 
  ids()
gdc_clinical(case_ids = ids)
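
As a rough sketch of how the diagnoses table could then be inspected: `gdc_clinical()` returns a list of data frames, one of which is `diagnoses`. The `"metasta"` search pattern and the `meta_cols` name below are only illustrative assumptions; the actual column names depend on what the GDC returns for these cases, so it is worth checking `names(clin$diagnoses)` first.

```r
library(GenomicDataCommons)

# Same query as above: TCGA-BRCA cases that have a metastatic sample
ids <- cases() |>
  filter(~ project.project_id == "TCGA-BRCA" & samples.sample_type == "metastatic") |>
  ids()

# gdc_clinical() returns a list of data frames (one of them is `diagnoses`)
clin <- gdc_clinical(case_ids = ids)
names(clin)

# Assumption: any metastasis-related fields have "metasta" in the column name
diag <- clin$diagnoses
meta_cols <- grep("metasta", names(diag), ignore.case = TRUE, value = TRUE)

# Keep the case identifier (if present) next to whatever columns matched
keep <- c(intersect("case_id", names(diag)), meta_cols)
diag[, keep, drop = FALSE]
```

If `meta_cols` comes back empty, the diagnoses table for these cases has no column matching that pattern, which would be consistent with the XML file not listing a site of metastasis.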

If you have more questions about using the software, please ask at support.bioconductor.org.

Best, Marcel