GENCODE - Mus musculus - release M28 not loading ok #65

Closed AMChalkie closed 1 year ago

AMChalkie commented 2 years ago


Thanks for the very useful tool. I'm failing to get M28 to behave as expected.

Best wishes Alistair

se <- tximeta::tximeta(sample_information.df)

importing quantifications
reading in files with read_tsv
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
found matching transcriptome:
[ GENCODE - Mus musculus - release M28 ]
useHub=TRUE: checking for TxDb via 'AnnotationHub'
snapshotDate(): 2022-04-21
did not find matching TxDb via 'AnnotationHub'
building TxDb with 'GenomicFeatures' package
Import genomic features from the file as a GRanges object ... Error in (function (classes, fdef, mtable) :
  unable to find an inherited method for function 'download.file' for signature '"character"'

mikelove commented 2 years ago

hi, I can't reproduce -- I just quantified against M28 and imported:

> devtools::load_all("tximeta")
ℹ Loading tximeta
> coldata <- data.frame(files="sample/quant.sf",names="sample")
> se <- tximeta(coldata)
importing quantifications
reading in files with read_tsv
found matching transcriptome:
[ GENCODE - Mus musculus - release M28 ]
useHub=TRUE: checking for TxDb via 'AnnotationHub'
  |======================================================================| 100%

snapshotDate(): 2021-10-20
did not find matching TxDb via 'AnnotationHub'
building TxDb with 'GenomicFeatures' package
Import genomic features from the file as a GRanges object ... trying URL ''
Content type 'unknown' length 28349951 bytes (27.0 MB)
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
generating transcript ranges
fetching genome info for GENCODE

Warning messages:
1: In .get_cds_IDX(mcols0$type, mcols0$phase) :
  The "phase" metadata column contains non-NA values for features of type
  stop_codon. This information was ignored.
Calls: tximeta ... makeTxDbFromGFF -> makeTxDbFromGRanges -> .get_cds_IDX
2: In valid.GenomicRanges.seqinfo(x, suggest.trim = TRUE) :
  GRanges object contains 87 out-of-bound ranges located on sequences
  chr4, chr8, chr13, chr14, and chr17. Note that ranges located on a
  sequence whose length is unknown (NA) or on a circular sequence are not
  considered out-of-bound (use seqlengths() and isCircular() to get the
  lengths and circularity flags of the underlying sequences). You can use
  trim() to trim these ranges. See ?`trim,GenomicRanges-method` for more
Calls: tximeta ... seqinfo<- -> seqinfo<- -> valid.GenomicRanges.seqinfo
> rowRanges(se)
GRanges object with 140790 ranges and 3 metadata columns:
                       seqnames          ranges strand |     tx_id
                          <Rle>       <IRanges>  <Rle> | <integer>
  ENSMUST00000193812.2     chr1 3143476-3144545      + |         1
  ENSMUST00000082908.3     chr1 3172239-3172348      + |         2
  ENSMUST00000162897.2     chr1 3276124-3286567      - |      4218
  ENSMUST00000159265.2     chr1 3276746-3285855      - |      4219
  ENSMUST00000070533.5     chr1 3284705-3741721      - |      4220
                   ...      ...             ...    ... .       ...
  ENSMUST00000082419.1     chrM     13552-14070      - |    142374
  ENSMUST00000082420.1     chrM     14071-14139      - |    142375
  ENSMUST00000082421.1     chrM     14145-15288      + |    142366
  ENSMUST00000082422.1     chrM     15289-15355      + |    142367
  ENSMUST00000082423.1     chrM     15356-15422      - |    142376
                                    gene_id              tx_name
                            <CharacterList>          <character>
  ENSMUST00000193812.2 ENSMUSG00000102693.2 ENSMUST00000193812.2
  ENSMUST00000082908.3 ENSMUSG00000064842.3 ENSMUST00000082908.3
  ENSMUST00000162897.2 ENSMUSG00000051951.6 ENSMUST00000162897.2
  ENSMUST00000159265.2 ENSMUSG00000051951.6 ENSMUST00000159265.2
  ENSMUST00000070533.5 ENSMUSG00000051951.6 ENSMUST00000070533.5
                   ...                  ...                  ...
  ENSMUST00000082419.1 ENSMUSG00000064368.1 ENSMUST00000082419.1
  ENSMUST00000082420.1 ENSMUSG00000064369.1 ENSMUST00000082420.1
  ENSMUST00000082421.1 ENSMUSG00000064370.1 ENSMUST00000082421.1
  ENSMUST00000082422.1 ENSMUSG00000064371.1 ENSMUST00000082422.1
  ENSMUST00000082423.1 ENSMUSG00000064372.1 ENSMUST00000082423.1
  seqinfo: 22 sequences (1 circular) from mm10 genome

Can you try on a different machine, or maybe try with latest version of R/Bioconductor?

AMChalkie commented 2 years ago

I will check another machine. In the meantime, I get this warning when loading tximeta, which looks related to the download error. I have also included more debug info.

library(tximeta)
Warning message:
replacing previous import ‘utils::download.file’ by ‘restfulr::download.file’ when loading ‘rtracklayer’

coldata <- data.frame(files="JY1a/quant.sf",names="JY1a")

se <- tximeta(coldata)
importing quantifications
reading in files with read_tsv
1
found matching transcriptome:
[ GENCODE - Mus musculus - release M28 ]
useHub=TRUE: checking for TxDb via 'AnnotationHub'
snapshotDate(): 2022-04-21
did not find matching TxDb via 'AnnotationHub'
building TxDb with 'GenomicFeatures' package
Import genomic features from the file as a GRanges object ... Error in (function (classes, fdef, mtable) :
  unable to find an inherited method for function 'download.file' for signature '"character"'

Traceback gives this:

11: stop(gettextf("unable to find an inherited method for function %s for signature %s",
        sQuote(fdef@generic), sQuote(cnames)), domain = NA)
10: (function (classes, fdef, mtable)
    {
        methods <- .findInheritedMethods(classes, fdef, mtable)
        if (length(methods) == 1L)
            return(methods[[1L]])
        else if (length(methods) == 0L) {
            cnames <- paste0("\"", vapply(classes, as.character, ""), "\"", collapse = ", ")
            stop(gettextf("unable to find an inherited method for function %s for signature %s",
                sQuote(fdef@generic), sQuote(cnames)), domain = NA)
        }
        else stop("Internal error in finding inherited methods; didn't return a unique method",
            domain = NA)
    })(list("character"), new("standardGeneric", .Data = function (url, destfile,
        method, quiet = FALSE, mode = "w", cacheOK = TRUE, extra = getOption("download.file.extra"))
    standardGeneric("download.file"), generic = structure("download.file", package = "restfulr"),
        package = "restfulr", group = list(), valueClass = character(0), signature = "url",
        default = NULL, skeleton = (function (url, destfile, method, quiet = FALSE,
            mode = "w", cacheOK = TRUE, extra = getOption("download.file.extra"))
        stop(gettextf("invalid call in method dispatch to '%s' (no default method)",
            "download.file"), domain = NA))(url, destfile, method, quiet, mode, cacheOK,
            extra)), )
9: download.file(resource(con), destfile)
8: .local(con, format, text, ...)
7: import(FileForFormat(con, format), ...)
6: import(FileForFormat(con, format), ...)
5: import(file, format = format, colnames = colnames, feature.type = GFF_FEATURE_TYPES)
4: import(file, format = format, colnames = colnames, feature.type = GFF_FEATURE_TYPES)
3: makeTxDbFromGFF(txomeInfo$gtf)
2: getTxDb(txomeInfo, useHub = useHub)
1: tximeta(coldata)

sessionInfo()
R version 4.2.0 (2022-04-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.3.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8

attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] GGally_2.1.2 ggrepel_0.9.1 plotly_4.10.0 tidyHeatmap_1.8.1 forcats_0.5.1
[6] stringr_1.4.0 dplyr_1.0.9 purrr_0.3.4 readr_2.1.2 tidyr_1.2.0
[11] tibble_3.1.7 tidyverse_1.3.1 ggplot2_3.3.6 tidySummarizedExperiment_1.6.1 tidybulk_1.8.0
[16] tximport_1.23.4 data.table_1.14.2 SummarizedExperiment_1.26.1 Biobase_2.56.0 GenomicRanges_1.48.0
[21] GenomeInfoDb_1.32.2 IRanges_2.30.0 S4Vectors_0.34.0 BiocGenerics_0.42.0 MatrixGenerics_1.8.0
[26] matrixStats_0.62.0 tximeta_1.14.0

loaded via a namespace (and not attached):
[1] readxl_1.4.0 backports_1.4.1 circlize_0.4.15 AnnotationHub_3.4.0 BiocFileCache_2.4.0
[6] plyr_1.8.7 lazyeval_0.2.2 BiocParallel_1.30.3 usethis_2.1.6 digest_0.6.29
[11] foreach_1.5.2 ensembldb_2.20.1 htmltools_0.5.2 viridis_0.6.2 fansi_1.0.3
[16] magrittr_2.0.3 memoise_2.0.1 cluster_2.1.3 doParallel_1.0.17 remotes_2.4.2
[21] tzdb_0.3.0 ComplexHeatmap_2.12.0 Biostrings_2.64.0 modelr_0.1.8 vroom_1.5.7
[26] prettyunits_1.1.1 colorspace_2.0-3 rvest_1.0.2 blob_1.2.3 rappdirs_0.3.3
[31] xfun_0.31 haven_2.5.0 callr_3.7.0 crayon_1.5.1 RCurl_1.98-1.7
[36] jsonlite_1.8.0 iterators_1.0.14 glue_1.6.2 gtable_0.3.0 zlibbioc_1.42.0
[41] XVector_0.36.0 GetoptLong_1.0.5 DelayedArray_0.22.0 pkgbuild_1.3.1 shape_1.4.6
[46] scales_1.2.0 DBI_1.1.2 Rcpp_1.0.8.3 viridisLite_0.4.0 xtable_1.8-4
[51] progress_1.2.2 clue_0.3-61 bit_4.0.4 preprocessCore_1.58.0 htmlwidgets_1.5.4
[56] httr_1.4.3 RColorBrewer_1.1-3 ellipsis_0.3.2 reshape_0.8.9 pkgconfig_2.0.3
[61] XML_3.99-0.10 dbplyr_2.2.0 utf8_1.2.2 tidyselect_1.1.2 rlang_1.0.2
[66] later_1.3.0 AnnotationDbi_1.58.0 cellranger_1.1.0 munsell_0.5.0 BiocVersion_3.15.2
[71] tools_4.2.0 cachem_1.0.6 cli_3.3.0 generics_0.1.2 RSQLite_2.2.14
[76] devtools_2.4.3 broom_0.8.0 evaluate_0.15 fastmap_1.1.0 yaml_2.3.5
[81] processx_3.6.0 knitr_1.39 fs_1.5.2 bit64_4.0.5 KEGGREST_1.36.2
[86] AnnotationFilter_1.20.0 dendextend_1.15.2 mime_0.12 xml2_1.3.3 biomaRt_2.52.0
[91] brio_1.1.3 compiler_4.2.0 rstudioapi_0.13 filelock_1.0.2 curl_4.3.2
[96] png_0.1-7 interactiveDisplayBase_1.34.0 testthat_3.1.4 reprex_2.0.1 stringi_1.7.6
[101] ps_1.7.0 desc_1.4.1 GenomicFeatures_1.48.3 lattice_0.20-45 ProtGenerics_1.28.0
[106] Matrix_1.4-1 vctrs_0.4.1 pillar_1.7.0 lifecycle_1.0.1 BiocManager_1.30.18
[111] GlobalOptions_0.1.2 bitops_1.0-7 httpuv_1.6.5 patchwork_1.1.1 rtracklayer_1.56.0
[116] R6_2.5.1 BiocIO_1.6.0 promises_1.2.0.1 gridExtra_2.3 sessioninfo_1.2.2
[121] codetools_0.2-18 pkgload_1.2.4 assertthat_0.2.1 rprojroot_2.0.3 rjson_0.2.21
[126] withr_2.5.0 GenomicAlignments_1.32.0 Rsamtools_2.12.0 GenomeInfoDbData_1.2.8 parallel_4.2.0
[131] hms_1.1.1 grid_4.2.0 rmarkdown_2.14 shiny_1.7.1 lubridate_1.8.0
[136] restfulr_0.0.14

mikelove commented 2 years ago

I think we can debug just within GenomicFeatures. This is the line causing trouble:

txdb <- makeTxDbFromGFF(txomeInfo$gtf)

where that first argument is equal to:

See if that also gives an error, maybe in a clean R session.
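
A minimal sketch of that check, meant for a fresh R session with nothing else loaded. The helper name `try_make_txdb` and its argument are hypothetical, and the GTF location is left for you to supply, since the value of `txomeInfo$gtf` is not shown above. It assumes a GenomicFeatures version that still provides `makeTxDbFromGFF()`, as the versions in this thread do.

```r
# Hypothetical helper (not part of tximeta or GenomicFeatures): call
# makeTxDbFromGFF() on its own and capture any error as a string, so the
# failure can be examined without tximeta in the picture.
try_make_txdb <- function(gtf_source) {
  if (!requireNamespace("GenomicFeatures", quietly = TRUE)) {
    return("GenomicFeatures is not installed")
  }
  tryCatch(
    GenomicFeatures::makeTxDbFromGFF(gtf_source),
    error = function(e) conditionMessage(e)
  )
}

# Usage: pass the same URL or file path that tximeta uses as txomeInfo$gtf.
# result <- try_make_txdb(gtf_source)
# A TxDb object means the import works; a character string is the error message.
```

If the same error comes back here, tximeta is not involved and the problem sits in the import step underneath it.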

AMChalkie commented 2 years ago

Now we're getting somewhere. GenomicFeatures gives the same warning:

library(GenomicFeatures)
Warning message:
replacing previous import ‘utils::download.file’ by ‘restfulr::download.file’ when loading ‘rtracklayer’

txdb <- makeTxDbFromGFF("")
Import genomic features from the file as a GRanges object ... Error in (function (classes, fdef, mtable) :
  unable to find an inherited method for function ‘download.file’ for signature ‘"character"’

AMChalkie commented 2 years ago

The same holds for the devel version of Bioconductor and the latest GenomicFeatures.


txdb <- makeTxDbFromGFF("")
Import genomic features from the file as a GRanges object ... Error in (function (classes, fdef, mtable) :
  unable to find an inherited method for function ‘download.file’ for signature ‘"character"’

restfulr looks like the problem.

restfulr::download.file("")
Error in (function (classes, fdef, mtable) :
  unable to find an inherited method for function ‘download.file’ for signature ‘"character"’
download.file("")
Error in download.file("") : argument "destfile" is missing, with no default
download.file("",destfile="tmp.gtf")
trying URL ''
Content type 'unknown' length 28349951 bytes (27.0 MB)
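
For readers unfamiliar with this error message: the traceback shows an S4 generic named `download.file` from restfulr with `signature = "url"` and `default = NULL`, so a call with a character URL has no method to dispatch to. The sketch below reproduces that kind of failure in plain R. The generic `fetch_file` is a hypothetical stand-in, chosen so the real `download.file` is left untouched; this illustrates the mechanism and is not restfulr's actual code.

```r
library(methods)

# Hypothetical generic, named fetch_file to avoid touching the real
# download.file. Like the generic in the traceback, it dispatches on 'url'
# and has no default method.
setGeneric(
  "fetch_file",
  function(url, destfile, ...) standardGeneric("fetch_file"),
  signature = "url"
)

# No method is registered for "character", so dispatch fails before any
# download is attempted.
msg <- tryCatch(
  fetch_file("placeholder-url", destfile = "tmp.gtf"),
  error = function(e) conditionMessage(e)
)
print(msg)

# Registering a character method that forwards to utils::download.file
# lets the same call dispatch (the method is defined here but not called).
setMethod("fetch_file", "character", function(url, destfile, ...) {
  utils::download.file(url, destfile, ...)
})
```

This matches what the calls above show: the restfulr generic fails on a character URL, while the plain `download.file` with a `destfile` downloads the file.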

sessionInfo()
R version 4.2.0 (2022-04-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur/Monterey 10.16

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8

attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods
[8] base

other attached packages:
[1] GenomicFeatures_1.49.5 AnnotationDbi_1.59.1 Biobase_2.57.1
[4] GenomicRanges_1.49.0 GenomeInfoDb_1.33.3 IRanges_2.31.0
[7] S4Vectors_0.35.1 BiocGenerics_0.43.0

loaded via a namespace (and not attached):
[1] Rcpp_1.0.8.3 lattice_0.20-45
[3] prettyunits_1.1.1 png_0.1-7
[5] Rsamtools_2.13.3 Biostrings_2.65.1
[7] assertthat_0.2.1 digest_0.6.29
[9] utf8_1.2.2 BiocFileCache_2.5.0
[11] R6_2.5.1 RSQLite_2.2.14
[13] httr_1.4.3 pillar_1.7.0
[15] zlibbioc_1.43.0 rlang_1.0.2
[17] progress_1.2.2 curl_4.3.2
[19] blob_1.2.3 Matrix_1.4-1
[21] BiocParallel_1.31.8 stringr_1.4.0
[23] RCurl_1.98-1.7 bit_4.0.4
[25] biomaRt_2.53.2 DelayedArray_0.23.0
[27] compiler_4.2.0 rtracklayer_1.57.0
[29] pkgconfig_2.0.3 SummarizedExperiment_1.27.1
[31] tidyselect_1.1.2 KEGGREST_1.37.2
[33] tibble_3.1.7 GenomeInfoDbData_1.2.8
[35] matrixStats_0.62.0 codetools_0.2-18
[37] XML_3.99-0.10 fansi_1.0.3
[39] crayon_1.5.1 dplyr_1.0.9
[41] dbplyr_2.2.0 GenomicAlignments_1.33.0
[43] bitops_1.0-7 rappdirs_0.3.3
[45] grid_4.2.0 lifecycle_1.0.1
[47] DBI_1.1.2 magrittr_2.0.3
[49] cli_3.3.0 stringi_1.7.6
[51] cachem_1.0.6 XVector_0.37.0
[53] xml2_1.3.3 ellipsis_0.3.2
[55] filelock_1.0.2 generics_0.1.2
[57] vctrs_0.4.1 rjson_0.2.21
[59] restfulr_0.0.14 tools_4.2.0
[61] bit64_4.0.5 glue_1.6.2
[63] purrr_0.3.4 MatrixGenerics_1.9.0
[65] hms_1.1.1 parallel_4.2.0
[67] fastmap_1.1.0 yaml_2.3.5
[69] BiocManager_1.30.18 memoise_2.0.1
[71] BiocIO_1.7.1

mikelove commented 2 years ago

Then if a core package won’t import a standard GTF file you can post to support site, but first you’d want to make sure you have a valid Bioc installation.


AMChalkie commented 2 years ago

BiocManager::valid()
[1] TRUE

I'll report that.

mikelove commented 2 years ago

Thanks for posting and following up on the bug BTW, hope we can squash it. When I tested earlier today, it was against a mixed installation of an older release with devel tximeta. So that may explain why it worked for me…?

AMChalkie commented 2 years ago

No worries, seems specific and detailed enough.

mikelove commented 2 years ago

In the meantime you can use skipMeta=TRUE and it just won’t attach GRanges.
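
A short sketch of that workaround, assuming the same kind of sample table used earlier in the thread. The file path is a placeholder, and the call runs only when tximeta is installed and the quantification file exists.

```r
# Sample table in the same shape used earlier in the thread
# (the path is a placeholder for your own quant.sf file).
coldata <- data.frame(files = "sample/quant.sf", names = "sample")

# Only run when tximeta is installed and the quantification file exists.
if (requireNamespace("tximeta", quietly = TRUE) &&
    all(file.exists(coldata$files))) {
  # skipMeta = TRUE skips the transcriptome metadata step, so no TxDb is
  # built and no GTF is downloaded; the result has no GRanges attached.
  se <- tximeta::tximeta(coldata, skipMeta = TRUE)
}
```

The counts and abundances are still imported this way; only the ranges and transcriptome metadata are missing until the download problem is fixed.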

maximilian-heeg commented 2 years ago

I had the exact same issue. Downgrading restfulr to version 0.0.13 (from 0.0.14, see difference here) resolved the problem for me.
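
One possible way to check whether an installation has the restfulr version reported as failing here. The helper `restfulr_reported_broken` is hypothetical, and it encodes only what this thread reports (0.0.14 failing, 0.0.13 working); other versions are not covered. The downgrade line assumes the `remotes` package is installed and is left commented out.

```r
# Hypothetical helper: TRUE only for the restfulr version reported as
# failing in this thread (0.0.14). Other versions are not assessed.
restfulr_reported_broken <- function(version) {
  package_version(version) == package_version("0.0.14")
}

if (requireNamespace("restfulr", quietly = TRUE)) {
  installed <- as.character(utils::packageVersion("restfulr"))
  if (restfulr_reported_broken(installed)) {
    message("restfulr ", installed, " is the version reported as failing here.")
    # One way to downgrade, assuming the 'remotes' package is available:
    # remotes::install_version("restfulr", version = "0.0.13")
  }
}
```

Restart R after changing the installed version so the new namespace is the one that gets loaded.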

mikelove commented 2 years ago

Thanks for reporting, could you also post to the GenomicFeatures thread, as this is related to core functionality.

mikelove commented 1 year ago

I think this has been resolved upstream, in GenomicFeatures.