lawremi / rtracklayer

R interface to genome annotation files and the UCSC genome browser
Other
26 stars 16 forks source link

Explicitly use 'BiocGenerics::connection' to prevent warning message … #79

Closed svengato closed 8 months ago

svengato commented 1 year ago

…'Found more than one class "connection" in cache'

svengato commented 1 year ago

This annoying warning message occurs when another loaded package has a connection() method. Example:

library(rtracklayer)
library(RJSONIO)
gff <- readGFF("inst/tests/genes.gff3")
svengato commented 8 months ago

Any comments on this? It has been almost a year -

lawremi commented 8 months ago

Thanks for this and sorry for missing it.

hpages commented 8 months ago

@svengato @lawremi @sanchit-saini

Did someone bother to test this? This change breaks readGFF():

library(rtracklayer)
test_gff3 <- system.file("tests",  "genes.gff3", package = "rtracklayer")
con <- file(test_gff3)
df <- readGFF(con)
# Error in .make_filexp_from_filepath(filepath) : 
#   'filepath' must be a single string or a connection

Plus it doesn't seem to address the original issue either:

library(RJSONIO)
df <- readGFF(test_gff3)
# Found more than one class "file" in cache; using the first, from namespace 'BiocGenerics'
# Also defined by ‘RJSONIO’
# Found more than one class "file" in cache; using the first, from namespace 'BiocGenerics'
# Also defined by ‘RJSONIO’

I don't understand how something like is(filepath, "BiocGenerics::connection") would even be expected to do the right thing.

Can we please revert ASAP? The 3.18 release is coming soon!

hpages commented 8 months ago

Something that seems to work is to use inherits() instead of is():

library(BiocGenerics)
library(RJSONIO)

test_gff3 <- system.file("tests",  "genes.gff3", package = "rtracklayer")

is(test_gff3, "connection")
# Found more than one class "connection" in cache; using the first, from namespace 'BiocGenerics'
# Also defined by ‘RJSONIO’
# [1] FALSE

inherits(test_gff3, "connection")
# [1] FALSE

con <- file(test_gff3)

is(con, "connection")
# Found more than one class "file" in cache; using the first, from namespace 'BiocGenerics'
# Also defined by ‘RJSONIO’
# Found more than one class "connection" in cache; using the first, from namespace 'BiocGenerics'
# Also defined by ‘RJSONIO’
[1] TRUE

inherits(con, "connection")
# [1] TRUE

Would you take a new PR @svengato @lawremi?

svengato commented 8 months ago

Would you take a new PR @svengato @lawremi?

Sure, reversion would be prudent given these issues. (readGFF() works for my test cases, so I did not notice.)

As far as I can tell, the problem with file() is unrelated, it is in the XVector package and not easily fixed.

hpages commented 8 months ago

readGFF() works for my test cases, so I did not notice.

A good habit is to do standard QA by running R CMD build + R CMD check on a package before submitting a PR. This would have told you about the problem.

@svengato @lawremi I went ahead and added the fix to PR #97.

svengato commented 8 months ago

Just tested your fix. It looks good to me, though it is really up to you and @lawremi. Thanks @hpages!

The Found more than one class "file" in cache warning is still present, though as I mentioned above, that seems to be coming from ambiguous file() calls in XVector (nothing to do with rtracklayer). If I figure it out, I will submit a pull request there.

hpages commented 8 months ago

Thanks for the catch. One more tweak was actually needed in XVector. See https://github.com/Bioconductor/XVector/commit/62425d8193f55235cf6946f976723398c2005b55

With XVector 0.41.2, no more annoying message:

library(rtracklayer)
library(RJSONIO)
test_gff3 <- system.file("tests",  "genes.gff3", package = "rtracklayer")
df <- readGFF(test_gff3)  # no more message!

sessionInfo():

R version 4.3.0 (2023-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 23.10

Matrix products: default
BLAS:   /home/hpages/R/R-4.3.0/lib/libRblas.so 
LAPACK: /home/hpages/R/R-4.3.0/lib/libRlapack.so;  LAPACK version 3.11.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: America/Los_Angeles
tzcode source: system (glibc)

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
[1] RJSONIO_1.3-1.8      rtracklayer_1.61.2   GenomicRanges_1.53.2
[4] GenomeInfoDb_1.37.6  IRanges_2.35.3       S4Vectors_0.39.3    
[7] BiocGenerics_0.47.1 

loaded via a namespace (and not attached):
 [1] crayon_1.5.2                DelayedArray_0.27.10       
 [3] SummarizedExperiment_1.31.1 GenomicAlignments_1.37.0   
 [5] rjson_0.2.21                RCurl_1.98-1.12            
 [7] Biostrings_2.69.2           XML_3.99-0.14              
 [9] MatrixGenerics_1.13.1       Biobase_2.61.0             
[11] grid_4.3.0                  restfulr_0.0.15            
[13] abind_1.4-5                 bitops_1.0-7               
[15] yaml_2.3.7                  compiler_4.3.0             
[17] codetools_0.2-19            XVector_0.41.2             
[19] BiocParallel_1.35.4         lattice_0.21-9             
[21] SparseArray_1.1.12          BiocIO_1.11.0              
[23] parallel_4.3.0              GenomeInfoDbData_1.2.11    
[25] Matrix_1.6-1.1              tools_4.3.0                
[27] matrixStats_1.0.0           Rsamtools_2.17.0           
[29] zlibbioc_1.47.0             S4Arrays_1.1.6             
svengato commented 8 months ago

That solves the problem, thanks!