BioinformaticsFMRP / TCGAbiolinks

TCGAbiolinks
http://bioconductor.org/packages/devel/bioc/vignettes/TCGAbiolinks/inst/doc/index.html

GDCprepare() returns Error in function (classes, fdef, mtable) #478

Closed mattpedone closed 2 years ago

mattpedone commented 2 years ago

Hi, I downloaded the Proteome Profiling data for TCGA-LGG, but I get the following error when running GDCprepare:

library("TCGAbiolinks")
query_lgg = GDCquery(
  project = "TCGA-LGG",
  data.category = "Proteome Profiling",
  sample.type = "Primary Tumor", 
  legacy = FALSE)
#> --------------------------------------
#> o GDCquery: Searching in GDC database
#> --------------------------------------
#> Genome of reference: hg38
#> --------------------------------------------
#> oo Accessing GDC. This might take a while...
#> --------------------------------------------
#> ooo Project: TCGA-LGG
#> --------------------
#> oo Filtering results
#> --------------------
#> ooo By sample.type
#> ----------------
#> oo Checking data
#> ----------------
#> ooo Check if there are duplicated cases
#> ooo Check if there results for the query
#> -------------------
#> o Preparing output
#> -------------------

lgg_res <- getResults(query_lgg) 

GDCdownload(query = query_lgg)
#> Downloading data for project TCGA-LGG
#> Of the 429 files for download 429 already exist.
#> All samples have been already downloaded

lgg_data <- GDCprepare(query_lgg)
#> Error in (function (classes, fdef, mtable) : unable to find an inherited method for function 'metadata<-' for signature '"function"'

Created on 2021-11-10 by the reprex package (v2.0.1)

Session info ``` r sessioninfo::session_info() #> ─ Session info ────────────────────────────────────────────────────────────── #> setting value #> version R version 4.1.0 (2021-05-18) #> os Ubuntu 20.04.2 LTS #> system x86_64, linux-gnu #> ui X11 #> language (EN) #> collate it_IT.UTF-8 #> ctype it_IT.UTF-8 #> tz Europe/Rome #> date 2021-11-10 #> pandoc 2.11.4 @ /usr/lib/rstudio/bin/pandoc/ (via rmarkdown) #> #> ─ Packages ─────────────────────────────────────────────────────────────────── #> package * version date (UTC) lib source #> AnnotationDbi 1.54.1 2021-06-08 [1] Bioconductor #> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.1.0) #> backports 1.3.0 2021-10-27 [1] CRAN (R 4.1.0) #> Biobase 2.52.0 2021-05-19 [1] Bioconductor #> BiocFileCache 2.0.0 2021-05-19 [1] Bioconductor #> BiocGenerics 0.38.0 2021-05-19 [1] Bioconductor #> biomaRt 2.48.3 2021-08-15 [1] Bioconductor #> Biostrings 2.60.2 2021-08-05 [1] Bioconductor #> bit 4.0.4 2020-08-04 [1] CRAN (R 4.1.0) #> bit64 4.0.5 2020-08-30 [1] CRAN (R 4.1.0) #> bitops 1.0-7 2021-04-24 [1] CRAN (R 4.1.0) #> blob 1.2.2 2021-07-23 [1] CRAN (R 4.1.0) #> cachem 1.0.6 2021-08-19 [1] CRAN (R 4.1.0) #> cli 3.1.0 2021-10-27 [1] CRAN (R 4.1.0) #> colorspace 2.0-2 2021-06-24 [1] CRAN (R 4.1.0) #> crayon 1.4.2 2021-10-29 [1] CRAN (R 4.1.0) #> curl 4.3.2 2021-06-23 [1] CRAN (R 4.1.0) #> data.table 1.14.2 2021-09-27 [1] CRAN (R 4.1.0) #> DBI 1.1.1 2021-01-15 [1] CRAN (R 4.1.0) #> dbplyr 2.1.1 2021-04-06 [1] CRAN (R 4.1.0) #> DelayedArray 0.18.0 2021-05-19 [1] Bioconductor #> digest 0.6.28 2021-09-23 [1] CRAN (R 4.1.0) #> downloader 0.4 2015-07-09 [1] CRAN (R 4.1.0) #> dplyr 1.0.7 2021-06-18 [1] CRAN (R 4.1.0) #> ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.1.0) #> evaluate 0.14 2019-05-28 [1] CRAN (R 4.1.0) #> fansi 0.5.0 2021-05-25 [1] CRAN (R 4.1.0) #> fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.1.0) #> filelock 1.0.2 2018-10-05 [1] CRAN (R 4.1.0) #> fs 1.5.0 2020-07-31 [1] CRAN (R 4.1.0) #> generics 0.1.1 2021-10-25 [1] CRAN (R 4.1.0) #> 
GenomeInfoDb 1.28.4 2021-09-05 [1] Bioconductor #> GenomeInfoDbData 1.2.6 2021-11-10 [1] Bioconductor #> GenomicRanges 1.44.0 2021-05-19 [1] Bioconductor #> ggplot2 3.3.5 2021-06-25 [1] CRAN (R 4.1.0) #> glue 1.5.0 2021-11-07 [1] CRAN (R 4.1.0) #> gtable 0.3.0 2019-03-25 [1] CRAN (R 4.1.0) #> highr 0.9 2021-04-16 [1] CRAN (R 4.1.0) #> hms 1.1.1 2021-09-26 [1] CRAN (R 4.1.0) #> htmltools 0.5.2 2021-08-25 [1] CRAN (R 4.1.0) #> httr 1.4.2 2020-07-20 [1] CRAN (R 4.1.0) #> IRanges 2.26.0 2021-05-19 [1] Bioconductor #> jsonlite 1.7.2 2020-12-09 [1] CRAN (R 4.1.0) #> KEGGREST 1.32.0 2021-05-19 [1] Bioconductor #> knitr 1.36 2021-09-29 [1] CRAN (R 4.1.0) #> lattice 0.20-44 2021-05-02 [4] CRAN (R 4.1.0) #> lifecycle 1.0.1 2021-09-24 [1] CRAN (R 4.1.0) #> magrittr 2.0.1 2020-11-17 [1] CRAN (R 4.1.0) #> Matrix 1.3-4 2021-06-01 [4] CRAN (R 4.1.0) #> MatrixGenerics 1.4.3 2021-08-26 [1] Bioconductor #> matrixStats 0.61.0 2021-09-17 [1] CRAN (R 4.1.0) #> memoise 2.0.0 2021-01-26 [1] CRAN (R 4.1.0) #> munsell 0.5.0 2018-06-12 [1] CRAN (R 4.1.0) #> pillar 1.6.4 2021-10-18 [1] CRAN (R 4.1.0) #> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.1.0) #> plyr 1.8.6 2020-03-03 [1] CRAN (R 4.1.0) #> png 0.1-7 2013-12-03 [1] CRAN (R 4.1.0) #> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.1.0) #> progress 1.2.2 2019-05-16 [1] CRAN (R 4.1.0) #> purrr 0.3.4 2020-04-17 [1] CRAN (R 4.1.0) #> R.cache 0.15.0 2021-04-30 [1] CRAN (R 4.1.0) #> R.methodsS3 1.8.1 2020-08-26 [1] CRAN (R 4.1.0) #> R.oo 1.24.0 2020-08-26 [1] CRAN (R 4.1.0) #> R.utils 2.11.0 2021-09-26 [1] CRAN (R 4.1.0) #> R6 2.5.1 2021-08-19 [1] CRAN (R 4.1.0) #> rappdirs 0.3.3 2021-01-31 [1] CRAN (R 4.1.0) #> Rcpp 1.0.7 2021-07-07 [1] CRAN (R 4.1.0) #> RCurl 1.98-1.5 2021-09-17 [1] CRAN (R 4.1.0) #> readr 2.0.2 2021-09-27 [1] CRAN (R 4.1.0) #> reprex 2.0.1 2021-08-05 [1] CRAN (R 4.1.0) #> rlang 0.4.12 2021-10-18 [1] CRAN (R 4.1.0) #> rmarkdown 2.11 2021-09-14 [1] CRAN (R 4.1.0) #> RSQLite 2.2.8 2021-08-21 [1] CRAN (R 4.1.0) #> rstudioapi 0.13 
2020-11-12 [1] CRAN (R 4.1.0) #> rvest 1.0.2 2021-10-16 [1] CRAN (R 4.1.0) #> S4Vectors 0.30.2 2021-10-03 [1] Bioconductor #> scales 1.1.1 2020-05-11 [1] CRAN (R 4.1.0) #> sessioninfo 1.2.1 2021-11-02 [1] CRAN (R 4.1.0) #> stringi 1.7.5 2021-10-04 [1] CRAN (R 4.1.0) #> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.1.0) #> styler 1.6.2 2021-09-23 [1] CRAN (R 4.1.0) #> SummarizedExperiment 1.22.0 2021-05-19 [1] Bioconductor #> TCGAbiolinks * 2.20.1 2021-10-07 [1] Bioconductor #> TCGAbiolinksGUI.data 1.12.0 2021-05-20 [1] Bioconductor #> tibble 3.1.6 2021-11-07 [1] CRAN (R 4.1.0) #> tidyr 1.1.4 2021-09-27 [1] CRAN (R 4.1.0) #> tidyselect 1.1.1 2021-04-30 [1] CRAN (R 4.1.0) #> tzdb 0.2.0 2021-10-27 [1] CRAN (R 4.1.0) #> utf8 1.2.2 2021-07-24 [1] CRAN (R 4.1.0) #> vctrs 0.3.8 2021-04-29 [1] CRAN (R 4.1.0) #> withr 2.4.2 2021-04-18 [1] CRAN (R 4.1.0) #> xfun 0.28 2021-11-04 [1] CRAN (R 4.1.0) #> XML 3.99-0.8 2021-09-17 [1] CRAN (R 4.1.0) #> xml2 1.3.2 2020-04-23 [1] CRAN (R 4.1.0) #> XVector 0.32.0 2021-05-19 [1] Bioconductor #> yaml 2.2.1 2020-02-01 [1] CRAN (R 4.1.0) #> zlibbioc 1.38.0 2021-05-19 [1] Bioconductor #> #> [1] /home/matt/R/x86_64-pc-linux-gnu-library/4.1 #> [2] /usr/local/lib/R/site-library #> [3] /usr/lib/R/site-library #> [4] /usr/lib/R/library #> #> ────────────────────────────────────────────────────────────────────────────── ```

I tried to investigate the error, which led me to Issue #198, but without success.

Following the discussion there, I tried to set data.type, but "Protein Expression Quantification" does not seem to be a valid data type:

library("TCGAbiolinks")

query_lgg = GDCquery(
  project = "TCGA-LGG",
  data.category = "Proteome Profiling",
  data.type = "Protein Expression Quantification",
  sample.type = "Primary Tumor", 
  legacy = FALSE)
#> --------------------------------------
#> o GDCquery: Searching in GDC database
#> --------------------------------------
#> Genome of reference: hg38
#> 
#> 
#> |sort(harmonized.data.type)          |
#> |:-----------------------------------|
#> |Aggregated Somatic Mutation         |
#> |Allele-specific Copy Number Segment |
#> |Annotated Somatic Mutation          |
#> |Biospecimen Supplement              |
#> |Clinical Supplement                 |
#> |Copy Number Segment                 |
#> |Gene Expression Quantification      |
#> |Gene Level Copy Number Scores       |
#> |Isoform Expression Quantification   |
#> |Masked Copy Number Segment          |
#> |Masked Somatic Mutation             |
#> |Masked Somatic Mutation             |
#> |Methylation Beta Value              |
#> |miRNA Expression Quantification     |
#> |Raw CGI Variant                     |
#> |Raw Simple Somatic Mutation         |
#> |Slide Image                         |
#> |Splice Junction Quantification      |
#> Error in checkDataTypeInput(legacy = legacy, data.type = data.type): Please set a data.type argument from the column harmonized.data.type above

Created on 2021-11-10 by the reprex package (v2.0.1)


This seems odd to me: if I don't specify any data.type in the query, "Protein Expression Quantification" is the only data type present in the results:

library("TCGAbiolinks")

query_lgg = GDCquery(
  project = "TCGA-LGG",
  data.category = "Proteome Profiling",
  sample.type = "Primary Tumor", 
  legacy = FALSE)
#> --------------------------------------
#> o GDCquery: Searching in GDC database
#> --------------------------------------
#> Genome of reference: hg38
#> --------------------------------------------
#> oo Accessing GDC. This might take a while...
#> --------------------------------------------
#> ooo Project: TCGA-LGG
#> --------------------
#> oo Filtering results
#> --------------------
#> ooo By sample.type
#> ----------------
#> oo Checking data
#> ----------------
#> ooo Check if there are duplicated cases
#> ooo Check if there results for the query
#> -------------------
#> o Preparing output
#> -------------------

lgg_res = getResults(query_lgg) 
table(lgg_res$data_type)
#> 
#> Protein Expression Quantification 
#>                               429

Created on 2021-11-10 by the reprex package (v2.0.1)


Am I missing something? Thanks in advance, Matteo

tiagochst commented 2 years ago

@mattpedone I just added support for proteome profiling to the package. You should be able to install it from GitHub with the following command: BiocManager::install("BioinformaticsFMRP/TCGAbiolinks")

mattpedone commented 2 years ago

Thank you @tiagochst! That solved the problem.

Sorry to bother you again, but I get a similar error when I try to use colData.

library("TCGAbiolinks")
library("SummarizedExperiment")
#> Loading required package: MatrixGenerics
#> Loading required package: matrixStats
#> 
#> Attaching package: 'MatrixGenerics'
#> The following objects are masked from 'package:matrixStats':
#> 
#>     colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
#>     colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
#>     colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
#>     colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
#>     colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
#>     colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
#>     colWeightedMeans, colWeightedMedians, colWeightedSds,
#>     colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
#>     rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
#>     rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
#>     rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
#>     rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
#>     rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
#>     rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
#>     rowWeightedSds, rowWeightedVars
#> Loading required package: GenomicRanges
#> Loading required package: stats4
#> Loading required package: BiocGenerics
#> Loading required package: parallel
#> 
#> Attaching package: 'BiocGenerics'
#> The following objects are masked from 'package:parallel':
#> 
#>     clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
#>     clusterExport, clusterMap, parApply, parCapply, parLapply,
#>     parLapplyLB, parRapply, parSapply, parSapplyLB
#> The following objects are masked from 'package:stats':
#> 
#>     IQR, mad, sd, var, xtabs
#> The following objects are masked from 'package:base':
#> 
#>     anyDuplicated, append, as.data.frame, basename, cbind, colnames,
#>     dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
#>     grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
#>     order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
#>     rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
#>     union, unique, unsplit, which.max, which.min
#> Loading required package: S4Vectors
#> 
#> Attaching package: 'S4Vectors'
#> The following objects are masked from 'package:base':
#> 
#>     expand.grid, I, unname
#> Loading required package: IRanges
#> Loading required package: GenomeInfoDb
#> Loading required package: Biobase
#> Welcome to Bioconductor
#> 
#>     Vignettes contain introductory material; view with
#>     'browseVignettes()'. To cite Bioconductor, see
#>     'citation("Biobase")', and for packages 'citation("pkgname")'.
#> 
#> Attaching package: 'Biobase'
#> The following object is masked from 'package:MatrixGenerics':
#> 
#>     rowMedians
#> The following objects are masked from 'package:matrixStats':
#> 
#>     anyMissing, rowMedians

query_lgg = GDCquery(
  project = "TCGA-LGG",
  data.category = "Proteome Profiling",
  sample.type = "Primary Tumor", 
  legacy = FALSE)
#> --------------------------------------
#> o GDCquery: Searching in GDC database
#> --------------------------------------
#> Genome of reference: hg38
#> --------------------------------------------
#> oo Accessing GDC. This might take a while...
#> --------------------------------------------
#> ooo Project: TCGA-LGG
#> --------------------
#> oo Filtering results
#> --------------------
#> ooo By sample.type
#> ----------------
#> oo Checking data
#> ----------------
#> ooo Check if there are duplicated cases
#> ooo Check if there results for the query
#> -------------------
#> o Preparing output
#> -------------------

GDCdownload(query = query_lgg)
#> Downloading data for project TCGA-LGG
#> Of the 429 files for download 429 already exist.
#> All samples have been already downloaded

lgg_data = GDCprepare(query_lgg, summarizedExperiment = T)

str(lgg_data, list.len = 5)
#> tibble [487 × 434] (S3: tbl_df/tbl/data.frame)
#>  $ AGID            : chr [1:487] "AGID00100" "AGID00111" "AGID00101" "AGID00001" ...
#>  $ lab_id          : num [1:487] 882 913 883 2 3 6 8 985 13 14 ...
#>  $ catalog_number  : chr [1:487] "sc-628" "sc-23957" "sc-1019" "9452" ...
#>  $ set_id          : chr [1:487] "Old" "Old" "Old" "Old" ...
#>  $ peptide_target  : chr [1:487] "1433BETA" "1433EPSILON" "1433ZETA" "4EBP1" ...
#>   [list output truncated]

#colnames(colData(lgg_data))
colData(lgg_data)
#> Error in (function (classes, fdef, mtable) : unable to find an inherited method for function 'colData' for signature '"tbl_df"'

Created on 2021-11-15 by the reprex package (v2.0.1)

Session info ``` r sessioninfo::session_info() #> ─ Session info ────────────────────────────────────────────────────────────── #> hash: bank, closed umbrella, crossed fingers: dark skin tone #> #> setting value #> version R version 4.1.0 (2021-05-18) #> os Ubuntu 20.04.2 LTS #> system x86_64, linux-gnu #> ui X11 #> language (EN) #> collate it_IT.UTF-8 #> ctype it_IT.UTF-8 #> tz Europe/Rome #> date 2021-11-15 #> pandoc 2.11.4 @ /usr/lib/rstudio/bin/pandoc/ (via rmarkdown) #> #> ─ Packages ─────────────────────────────────────────────────────────────────── #> package * version date (UTC) lib source #> AnnotationDbi 1.54.1 2021-06-08 [1] Bioconductor #> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.1.0) #> backports 1.3.0 2021-10-27 [1] CRAN (R 4.1.0) #> Biobase * 2.52.0 2021-05-19 [1] Bioconductor #> BiocFileCache 2.0.0 2021-05-19 [1] Bioconductor #> BiocGenerics * 0.38.0 2021-05-19 [1] Bioconductor #> biomaRt 2.48.3 2021-08-15 [1] Bioconductor #> Biostrings 2.60.2 2021-08-05 [1] Bioconductor #> bit 4.0.4 2020-08-04 [1] CRAN (R 4.1.0) #> bit64 4.0.5 2020-08-30 [1] CRAN (R 4.1.0) #> bitops 1.0-7 2021-04-24 [1] CRAN (R 4.1.0) #> blob 1.2.2 2021-07-23 [1] CRAN (R 4.1.0) #> cachem 1.0.6 2021-08-19 [1] CRAN (R 4.1.0) #> cli 3.1.0 2021-10-27 [1] CRAN (R 4.1.0) #> colorspace 2.0-2 2021-06-24 [1] CRAN (R 4.1.0) #> crayon 1.4.2 2021-10-29 [1] CRAN (R 4.1.0) #> curl 4.3.2 2021-06-23 [1] CRAN (R 4.1.0) #> data.table 1.14.2 2021-09-27 [1] CRAN (R 4.1.0) #> DBI 1.1.1 2021-01-15 [1] CRAN (R 4.1.0) #> dbplyr 2.1.1 2021-04-06 [1] CRAN (R 4.1.0) #> DelayedArray 0.18.0 2021-05-19 [1] Bioconductor #> digest 0.6.28 2021-09-23 [1] CRAN (R 4.1.0) #> downloader 0.4 2015-07-09 [1] CRAN (R 4.1.0) #> dplyr 1.0.7 2021-06-18 [1] CRAN (R 4.1.0) #> ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.1.0) #> evaluate 0.14 2019-05-28 [1] CRAN (R 4.1.0) #> fansi 0.5.0 2021-05-25 [1] CRAN (R 4.1.0) #> fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.1.0) #> filelock 1.0.2 2018-10-05 [1] CRAN (R 4.1.0) #> fs 1.5.0 2020-07-31 
[1] CRAN (R 4.1.0) #> generics 0.1.1 2021-10-25 [1] CRAN (R 4.1.0) #> GenomeInfoDb * 1.28.4 2021-09-05 [1] Bioconductor #> GenomeInfoDbData 1.2.6 2021-11-10 [1] Bioconductor #> GenomicRanges * 1.44.0 2021-05-19 [1] Bioconductor #> ggplot2 3.3.5 2021-06-25 [1] CRAN (R 4.1.0) #> glue 1.5.0 2021-11-07 [1] CRAN (R 4.1.0) #> gtable 0.3.0 2019-03-25 [1] CRAN (R 4.1.0) #> highr 0.9 2021-04-16 [1] CRAN (R 4.1.0) #> hms 1.1.1 2021-09-26 [1] CRAN (R 4.1.0) #> htmltools 0.5.2 2021-08-25 [1] CRAN (R 4.1.0) #> httr 1.4.2 2020-07-20 [1] CRAN (R 4.1.0) #> IRanges * 2.26.0 2021-05-19 [1] Bioconductor #> jsonlite 1.7.2 2020-12-09 [1] CRAN (R 4.1.0) #> KEGGREST 1.32.0 2021-05-19 [1] Bioconductor #> knitr 1.36 2021-09-29 [1] CRAN (R 4.1.0) #> lattice 0.20-44 2021-05-02 [4] CRAN (R 4.1.0) #> lifecycle 1.0.1 2021-09-24 [1] CRAN (R 4.1.0) #> magrittr 2.0.1 2020-11-17 [1] CRAN (R 4.1.0) #> Matrix 1.3-4 2021-06-01 [4] CRAN (R 4.1.0) #> MatrixGenerics * 1.4.3 2021-08-26 [1] Bioconductor #> matrixStats * 0.61.0 2021-09-17 [1] CRAN (R 4.1.0) #> memoise 2.0.0 2021-01-26 [1] CRAN (R 4.1.0) #> munsell 0.5.0 2018-06-12 [1] CRAN (R 4.1.0) #> pillar 1.6.4 2021-10-18 [1] CRAN (R 4.1.0) #> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.1.0) #> plyr 1.8.6 2020-03-03 [1] CRAN (R 4.1.0) #> png 0.1-7 2013-12-03 [1] CRAN (R 4.1.0) #> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.1.0) #> progress 1.2.2 2019-05-16 [1] CRAN (R 4.1.0) #> purrr 0.3.4 2020-04-17 [1] CRAN (R 4.1.0) #> R.cache 0.15.0 2021-04-30 [1] CRAN (R 4.1.0) #> R.methodsS3 1.8.1 2020-08-26 [1] CRAN (R 4.1.0) #> R.oo 1.24.0 2020-08-26 [1] CRAN (R 4.1.0) #> R.utils 2.11.0 2021-09-26 [1] CRAN (R 4.1.0) #> R6 2.5.1 2021-08-19 [1] CRAN (R 4.1.0) #> rappdirs 0.3.3 2021-01-31 [1] CRAN (R 4.1.0) #> Rcpp 1.0.7 2021-07-07 [1] CRAN (R 4.1.0) #> RCurl 1.98-1.5 2021-09-17 [1] CRAN (R 4.1.0) #> readr 2.1.0 2021-11-11 [1] CRAN (R 4.1.0) #> reprex 2.0.1 2021-08-05 [1] CRAN (R 4.1.0) #> rlang 0.4.12 2021-10-18 [1] CRAN (R 4.1.0) #> rmarkdown 2.11 2021-09-14 [1] CRAN 
(R 4.1.0) #> RSQLite 2.2.8 2021-08-21 [1] CRAN (R 4.1.0) #> rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.1.0) #> rvest 1.0.2 2021-10-16 [1] CRAN (R 4.1.0) #> S4Vectors * 0.30.2 2021-10-03 [1] Bioconductor #> scales 1.1.1 2020-05-11 [1] CRAN (R 4.1.0) #> sessioninfo 1.2.1 2021-11-02 [1] CRAN (R 4.1.0) #> stringi 1.7.5 2021-10-04 [1] CRAN (R 4.1.0) #> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.1.0) #> styler 1.6.2 2021-09-23 [1] CRAN (R 4.1.0) #> SummarizedExperiment * 1.22.0 2021-05-19 [1] Bioconductor #> TCGAbiolinks * 2.23.1 2021-11-12 [1] Github (BioinformaticsFMRP/TCGAbiolinks@63cab46) #> TCGAbiolinksGUI.data 1.12.0 2021-05-20 [1] Bioconductor #> tibble 3.1.6 2021-11-07 [1] CRAN (R 4.1.0) #> tidyr 1.1.4 2021-09-27 [1] CRAN (R 4.1.0) #> tidyselect 1.1.1 2021-04-30 [1] CRAN (R 4.1.0) #> tzdb 0.2.0 2021-10-27 [1] CRAN (R 4.1.0) #> utf8 1.2.2 2021-07-24 [1] CRAN (R 4.1.0) #> vctrs 0.3.8 2021-04-29 [1] CRAN (R 4.1.0) #> vroom 1.5.6 2021-11-10 [1] CRAN (R 4.1.0) #> withr 2.4.2 2021-04-18 [1] CRAN (R 4.1.0) #> xfun 0.28 2021-11-04 [1] CRAN (R 4.1.0) #> XML 3.99-0.8 2021-09-17 [1] CRAN (R 4.1.0) #> xml2 1.3.2 2020-04-23 [1] CRAN (R 4.1.0) #> XVector 0.32.0 2021-05-19 [1] Bioconductor #> yaml 2.2.1 2020-02-01 [1] CRAN (R 4.1.0) #> zlibbioc 1.38.0 2021-05-19 [1] Bioconductor #> #> [1] /home/matt/R/x86_64-pc-linux-gnu-library/4.1 #> [2] /usr/local/lib/R/site-library #> [3] /usr/lib/R/site-library #> [4] /usr/lib/R/library #> #> ────────────────────────────────────────────────────────────────────────────── ```

I think the problem is that lgg_data is a tibble and not a SummarizedExperiment object (even though I set summarizedExperiment = T in GDCprepare).
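A minimal sketch of how that class mismatch could be detected before calling colData, assuming only base R. The helper name safe_col_data and the toy table are hypothetical and are not part of TCGAbiolinks; the SummarizedExperiment accessor is only reached when the object really is of that class.

```r
# Return sample annotations only when the object can actually provide them.
# `safe_col_data` is a hypothetical helper, not a TCGAbiolinks function.
safe_col_data <- function(x) {
  if (inherits(x, "SummarizedExperiment")) {
    # A real SummarizedExperiment: defer to the usual accessor.
    return(SummarizedExperiment::colData(x))
  }
  message("Object has class '", class(x)[1], "'; colData() does not apply.")
  NULL
}

# Toy stand-in for the table returned by GDCprepare() in this thread.
toy <- data.frame(AGID = "AGID00100", peptide_target = "1433BETA")
result <- safe_col_data(toy)
```

With a plain table the helper returns NULL and prints a message instead of raising the "unable to find an inherited method" error.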

I am using colData to access the clinical data associated with the samples. Is this the correct procedure? Are there other ways to do it?

Thank you again! Matteo

tiagochst commented 2 years ago

You can access the sample metadata with the following call: samples.metadata <- TCGAbiolinks:::colDataPrepare(colnames(lgg_data[,-c(1:5)]))

The issue is that the SummarizedExperiment needs a GRanges object, and I don't have that information for proteins.
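To illustrate what lgg_data[,-c(1:5)] selects, here is a small sketch on a toy table that mimics the layout shown by str(lgg_data): five annotation columns followed by one column per sample. The sample barcodes and values are made up, and treating the remaining column names as sample barcodes is an assumption based on the call above.

```r
# Toy table: five annotation columns, then one column per sample.
# Barcodes and expression values are hypothetical.
toy <- data.frame(
  AGID = c("AGID00100", "AGID00111"),
  lab_id = c(882, 913),
  catalog_number = c("sc-628", "sc-23957"),
  set_id = c("Old", "Old"),
  peptide_target = c("1433BETA", "1433EPSILON"),
  `TCGA-AA-0001-01A` = c(0.12, -0.40),
  `TCGA-BB-0002-01A` = c(0.31, 0.05),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

annotation_cols <- 1:5

# Column names left after dropping the annotation columns.
sample_barcodes <- colnames(toy)[-annotation_cols]

# Numeric matrix: one row per antibody target, one column per sample.
expr <- as.matrix(toy[, -annotation_cols])
rownames(expr) <- toy$peptide_target
```

Keeping the expression values in a plain matrix and the sample annotations in a separate table is one way to work without a SummarizedExperiment container.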

divya22ag commented 1 year ago

Hi @tiagochst, I am using the code above, samples.metadata <- TCGAbiolinks:::colDataPrepare(colnames(lgg_data[,-c(1:5)])), but I am unable to get the sample metadata.

The error:

Starting to add information to samples
 => Add clinical information to samples
Error in dplyr::bind_cols(): ! ..1 must be a vector, not a <dtplyr_step_call/dtplyr_step> object.
Run rlang::last_error() to see where the error occurred.

Could you please suggest another way?
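One possible alternative, sketched on toy data and not confirmed by the maintainers: the first 12 characters of a TCGA barcode identify the patient, so sample columns can be joined to a clinical table by that prefix using base R only. The barcodes, the clinical values, and the submitter_id column name are assumptions; in practice the clinical table might come from GDCquery_clinic(project = "TCGA-LGG", type = "clinical"), which is not run here.

```r
# Hypothetical sample barcodes, standing in for colnames(lgg_data)[-(1:5)].
sample_barcodes <- c("TCGA-AA-0001-01A", "TCGA-BB-0002-01A")

# Toy clinical table with an assumed patient identifier column.
clinical <- data.frame(
  submitter_id = c("TCGA-AA-0001", "TCGA-BB-0002"),
  gender = c("female", "male"),
  stringsAsFactors = FALSE
)

# The first 12 characters of a TCGA barcode identify the patient.
samples <- data.frame(
  barcode = sample_barcodes,
  patient = substr(sample_barcodes, 1, 12),
  stringsAsFactors = FALSE
)

# Left join, so every sample is kept even if it has no clinical record.
samples_metadata <- merge(
  samples, clinical,
  by.x = "patient", by.y = "submitter_id",
  all.x = TRUE
)
```

This sidesteps colDataPrepare entirely, at the cost of only attaching whatever columns the clinical table provides.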