TCGAbiolinks
http://bioconductor.org/packages/devel/bioc/vignettes/TCGAbiolinks/inst/doc/index.html

Project specific error in GDCprepare() #623

Closed by DijkJel 8 months ago

DijkJel commented 8 months ago

Hi,

I would like to download and prepare the MAF files for several TCGA projects, but I run into an issue in the GDCprepare() step. This code:

library(TCGAbiolinks)
directory = getwd()  # assumption: `directory` was not defined in the original post
tcga_project = 'TCGA-COAD'

coad_maf = TCGAbiolinks::GDCquery(
  project = tcga_project,
  data.category = "Simple Nucleotide Variation",
  access = "open",
  data.type = "Masked Somatic Mutation",
  workflow.type = "Aliquot Ensemble Somatic Variant Merging and Masking"
)

TCGAbiolinks::GDCdownload(coad_maf, directory = paste0(directory, '/GDCdata'))

maf = TCGAbiolinks::GDCprepare(coad_maf, directory = paste0(directory, '/GDCdata'))

works when trying with TCGA-COAD, but returns the following error when using TCGA-PRAD:

Error in `dplyr::bind_rows()`:
! Can't combine `..10$Tumor_Seq_Allele2` <character> and `..11$Tumor_Seq_Allele2` <logical>.
Run `rlang::last_trace()` to see where the error occurred.
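
The message says that two of the per-file tables being combined disagree on the type of `Tumor_Seq_Allele2`: character in one, logical in another. A minimal base-R sketch of how such a mismatch can be spotted, using made-up toy tables rather than real GDC data. The helper `column_classes` is hypothetical, and the idea that a column holding only `T` was guessed as logical by the file reader is an assumption, not something confirmed in this thread:

```r
# Toy stand-ins for two per-sample MAF tables (made-up values, not GDC data).
maf_a <- data.frame(Tumor_Seq_Allele2 = c("A", "G"), stringsAsFactors = FALSE)
# Assumption: a column containing only "T" may be type-guessed as logical.
maf_b <- data.frame(Tumor_Seq_Allele2 = c(TRUE, TRUE))

# Hypothetical helper: report the class of one column in each table of a list.
column_classes <- function(tables, column) {
  vapply(tables, function(tbl) class(tbl[[column]])[1], character(1))
}

classes <- column_classes(list(maf_a, maf_b), "Tumor_Seq_Allele2")
print(classes)

# TRUE when the tables disagree on the column type, which is the situation
# that dplyr::bind_rows() refuses to combine.
mismatch <- length(unique(classes)) > 1
print(mismatch)
```

This only diagnoses the disagreement. Coercing an already-parsed logical column back to character would yield "TRUE" rather than the original allele, so the types need to be fixed where the files are read.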

Is there a way to solve this?

Best, Jelmer

R version 4.3.2 (2023-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] TCGAbiolinks_2.30.0

loaded via a namespace (and not attached):
 [1] tidyselect_1.2.1            dplyr_1.1.4                 blob_1.2.4                  filelock_1.0.3              Biostrings_2.70.1           bitops_1.0-7                fastmap_1.1.1              
 [8] RCurl_1.98-1.13             BiocFileCache_2.10.1        XML_3.99-0.16               digest_0.6.33               lifecycle_1.0.4             KEGGREST_1.42.0             RSQLite_2.3.4              
[15] magrittr_2.0.3              compiler_4.3.2              rlang_1.1.2                 progress_1.2.3              tools_4.3.2                 utf8_1.2.4                  data.table_1.14.10         
[22] knitr_1.45                  prettyunits_1.2.0           S4Arrays_1.2.0              bit_4.0.5                   curl_5.2.0                  DelayedArray_0.28.0         plyr_1.8.9                 
[29] xml2_1.3.6                  abind_1.4-5                 withr_3.0.0                 purrr_1.0.2                 BiocGenerics_0.48.1         grid_4.3.2                  stats4_4.3.2               
[36] fansi_1.0.6                 colorspace_2.1-0            ggplot2_3.5.0               scales_1.3.0                biomaRt_2.58.2              SummarizedExperiment_1.32.0 cli_3.6.2                  
[43] crayon_1.5.2                generics_0.1.3              rstudioapi_0.15.0           httr_1.4.7                  tzdb_0.4.0                  DBI_1.2.2                   cachem_1.0.8               
[50] stringr_1.5.1               zlibbioc_1.48.0             parallel_4.3.2              rvest_1.0.4                 AnnotationDbi_1.64.1        TCGAbiolinksGUI.data_1.22.0 XVector_0.42.0             
[57] matrixStats_1.2.0           vctrs_0.6.5                 Matrix_1.6-4                jsonlite_1.8.8              IRanges_2.36.0              hms_1.1.3                   S4Vectors_0.40.2           
[64] bit64_4.0.5                 tidyr_1.3.0                 glue_1.6.2                  stringi_1.8.3               gtable_0.3.4                GenomeInfoDb_1.38.8         GenomicRanges_1.54.1       
[71] munsell_0.5.0               tibble_3.2.1                pillar_1.9.0                rappdirs_0.3.3              GenomeInfoDbData_1.2.11     R6_2.5.1                    dbplyr_2.5.0               
[78] vroom_1.6.5                 lattice_0.22-5              Biobase_2.62.0              readr_2.1.4                 png_0.1-8                   memoise_2.0.1               Rcpp_1.0.11                
[85] SparseArray_1.2.2           downloader_0.4              xfun_0.41                   MatrixGenerics_1.14.0       pkgconfig_2.0.3

tiagochst commented 8 months ago

Hi,

It works with the latest version, 2.31.2. Could you please update from GitHub using the following command?

BiocManager::install("BioinformaticsFMRP/TCGAbiolinks")
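
After updating and restarting R, it can help to confirm that the installed version is at least 2.31.2. A small sketch of such a check; the helper name `is_recent_enough` is hypothetical, and the two version strings are the ones mentioned in this thread:

```r
# Minimum version, taken from the reply in this thread.
required <- package_version("2.31.2")

# Hypothetical helper: TRUE if a version string meets the minimum.
is_recent_enough <- function(installed, minimum = required) {
  package_version(installed) >= minimum
}

is_recent_enough("2.30.0")  # version from the reported sessionInfo: FALSE
is_recent_enough("2.31.2")  # TRUE

# Check the actually installed package, if there is one.
if (requireNamespace("TCGAbiolinks", quietly = TRUE)) {
  print(is_recent_enough(as.character(utils::packageVersion("TCGAbiolinks"))))
}
```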

Best regards, Tiago Chedraoui Silva

In reply to DijkJel's message of Fri, Mar 22, 2024, 8:57 AM.

DijkJel commented 8 months ago

Great, that solved it. Thanks a lot!