hansenlab / bsseq

Devel repository for bsseq
36 stars 26 forks source link

bsseq incompatible with tictoc library #115

Closed JeffreyMaurer closed 2 years ago

JeffreyMaurer commented 2 years ago

I'm trying to profile my RRBS script because, as RRBS, it takes a long time to chunk through and I want to reduce that time.

I tried to use the tictoc library, but it masks shift and List. I think the fix is just replacing "shift" with "LIBRARY::shift" somewhere in the code. I tried it myself manually copy-pasting the 'strandCollapse' from BSSeq-class.R and 'setMethod("shift", "FWIRanges",...) from FWIRanges-class.R but that didn't work.

Here's a reprex:

library("DSS") # makeBSseqData
library("bsseq")
library("limma")
library("Biostrings");
library("BSgenome");
library("rtracklayer");
library("Rsamtools");
library("ShortRead");
library("chipseq")
library("data.table")
library("GenomeInfoDb")

samples = paste0("sample", 1:5)

data = list(structure(list(chr = c("chr1", "chr1", 
"chr1", "chr1", "chr1", "chr1", "chr1", "chr1", "chr1", "chr1"
), pos = c(10497L, 10525L, 10542L, 10563L, 10571L, 10577L, 10579L, 
10589L, 11300L, 11343L), N = c(66L, 66L, 66L, 66L, 66L, 66L, 
66L, 33L, 2L, 4L), X = c(66L, 54L, 66L, 66L, 66L, 52L, 54L, 1L, 
2L, 2L)), row.names = c(NA, 10L), class = "data.frame"), structure(list(
chr = c("chr1", "chr1", "chr1", "chr1", "chr1", "chr1", "chr1", 
"chr1", "chr1", "chr1"), pos = c(10497L, 10525L, 10542L, 
10563L, 10571L, 10577L, 10579L, 10589L, 11300L, 11343L), 
N = c(46L, 46L, 46L, 46L, 46L, 46L, 46L, 23L, 2L, 4L), X = c(44L, 
44L, 42L, 44L, 40L, 40L, 32L, 0L, 2L, 2L)), row.names = c(NA, 
10L), class = "data.frame"), structure(list(
chr = c("chr1", "chr1", "chr1", "chr1", "chr1", "chr1", "chr1", 
"chr1", "chr1", "chr1"), pos = c(10497L, 10525L, 10542L, 
10563L, 10571L, 10577L, 10579L, 10589L, 11300L, 11343L), 
N = c(76L, 76L, 76L, 76L, 76L, 76L, 76L, 38L, 8L, 16L), X = c(68L, 
72L, 72L, 72L, 76L, 60L, 54L, 0L, 8L, 0L)), row.names = c(NA, 
10L), class = "data.frame"), structure(list(
chr = c("chr1", "chr1", "chr1", "chr1", "chr1", "chr1", "chr1", 
"chr1", "chr1", "chr1"), pos = c(10497L, 10525L, 10542L, 
10563L, 10571L, 10577L, 10579L, 10589L, 11300L, 11343L), 
N = c(81L, 82L, 82L, 82L, 82L, 82L, 81L, 41L, 11L, 22L), 
X = c(77L, 78L, 82L, 80L, 78L, 66L, 61L, 1L, 8L, 0L)), row.names = c(NA, 
10L), class = "data.frame"), structure(list(
chr = c("chr1", "chr1", "chr1", "chr1", "chr1", "chr1", "chr1", 
"chr1", "chr1", "chr1"), pos = c(10497L, 10525L, 10542L, 
10563L, 10571L, 10577L, 10579L, 10589L, 11300L, 11343L), 
N = c(83L, 84L, 84L, 84L, 84L, 84L, 84L, 43L, 3L, 6L), X = c(79L, 
78L, 82L, 78L, 84L, 68L, 58L, 2L, 2L, 2L)), row.names = c(NA, 
10L), class = "data.frame"))

names(data) = samples

# Works
bsData = makeBSseqData(data, samples)

if (runif(1) > 0.5) {
  # Does Not Work
  tictoc::tic()
  bsData = makeBSseqData(data, samples)
  tictoc::toc()
} else {
  # Import library, mask shift and List
  library(tictoc)

  # Attaching package: ‘tictoc’
  # 
  # The following object is masked from ‘package:data.table’:
  #   
  #   shift
  # 
  # The following object is masked from ‘package:SummarizedExperiment’:
  #   
  #   shift
  # 
  # The following object is masked from ‘package:GenomicRanges’:
  #   
  #   shift
  # 
  # The following object is masked from ‘package:IRanges’:
  #   
  #   shift
  # 
  # The following object is masked from ‘package:S4Vectors’:
  #   
  #   List

  # Does not Work Either
  bsData = makeBSseqData(data, samples)
}
PeteHaitch commented 2 years ago

This is a broader problem that tictoc introduced (https://github.com/jabiru/tictoc/commit/8cf7894a25e1462da8114a8e96195710f2b6f3d1) and may affect any package that uses core Bioconductor classes and methods such as List and apparently also shift(). I first saw it reported here: https://github.com/drisso/SingleCellExperiment/issues/67

In any case, your example seems to be about DSS::makeBSseqData() which isn't part of bsseq and so this would really need to be reported to the DSS authors.

Please re-open the issue if you can you minimise your example to use code that just uses bsseq features and removes unnecessary dependencies.

FWIW I'd just use system.time() for simple timing like this and avoid tictoc entirely.