I ended up spending a fair bit of time banging my head against this in the example below.
The examples on `?bsapply` suggest that treating `exclude` as a regular expression is the intended behaviour, and they give a nice demonstration of when it is useful, so I think this is the correct behaviour. But perhaps the documentation could be updated to make this clearer?
``` r
suppressPackageStartupMessages(library(BSgenome.Hsapiens.UCSC.hg38))

# I was expecting to just get the matches for chr17 but got nothing!
bsp1 <- new("BSParams",
            X = BSgenome.Hsapiens.UCSC.hg38,
            FUN = matchPattern,
            exclude = setdiff(seqlevels(BSgenome.Hsapiens.UCSC.hg38), "chr17"))
bsapply(bsp1, pattern = "CG")
#> named list()

# Making it a regular expression gave me the desired result.
bsp2 <- bsp1
bsp2@exclude <- paste0("^", bsp1@exclude, "$")
bsapply(bsp2, pattern = "CG")
#> $chr17
#>   Views on a 83257441-letter DNAString subject
#> subject: NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN...NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
#> views:
#>              start      end width
#>       [1]    60054    60055     2 [CG]
#>       [2]    60141    60142     2 [CG]
#>       [3]    60168    60169     2 [CG]
#>       [4]    60201    60202     2 [CG]
#>       [5]    60210    60211     2 [CG]
#>       ...      ...      ...   ... ...
#> [1248324] 83245477 83245478     2 [CG]
#> [1248325] 83245632 83245633     2 [CG]
#> [1248326] 83246061 83246062     2 [CG]
#> [1248327] 83246281 83246282     2 [CG]
#> [1248328] 83247017 83247018     2 [CG]
```
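If exact-name exclusion is what you want, the anchoring step from the example can be wrapped in a small helper. This is only a sketch: `exact_patterns()` is a hypothetical name, not part of BSgenome, and escaping regex metacharacters is an extra precaution I am assuming is useful for sequence names that contain characters such as `.`.

``` r
# Hypothetical helper (not part of BSgenome): turn literal sequence names
# into anchored regular expressions that match the whole name only.
exact_patterns <- function(nms) {
  # Escape regex metacharacters so that e.g. "." is matched literally.
  escaped <- gsub("([.\\\\+*?\\[\\]^$(){}|])", "\\\\\\1", nms, perl = TRUE)
  # Anchor at both ends so "chr1" no longer matches "chr17".
  paste0("^", escaped, "$")
}

exact_patterns(c("chr1", "chrX"))
grepl(exact_patterns("chr1"), c("chr1", "chr17"))
```

With a helper like this, the `exclude` slot could be set with `exact_patterns(setdiff(seqlevels(genome), "chr17"))`, where `genome` stands for whichever BSgenome object is in use.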
From the documentation of the `exclude` slot in a BSParams, I thought `bsapply()` treated it as a string literal. In fact, `bsapply()` treats it as a regular expression: https://github.com/Bioconductor/BSgenome/blob/3d109889dc90277ed85b15fc4fa7a4668c0974a1/R/bsapply.R#L78
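To make the surprise concrete, here is a minimal base-R sketch of how unanchored regular-expression matching ends up dropping `chr17`. The four-name `seqs` vector and the `excluded_names()` helper are hypothetical stand-ins, and I am assuming the matching boils down to `grep()` over the sequence names, which may not be exactly what the linked line does.

``` r
# Hypothetical stand-in for seqlevels(genome): just four sequence names.
seqs <- c("chr1", "chr7", "chr17", "chrX")

# Exclude everything except chr17, as in the reprex.
exclude <- setdiff(seqs, "chr17")

# Assumed behaviour: a name is excluded if ANY pattern matches anywhere in it.
excluded_names <- function(patterns, nms) {
  idx <- unique(unlist(lapply(patterns, grep, x = nms)))
  nms[sort(idx)]
}

# Unanchored: "chr1" and "chr7" both match inside "chr17", so it is dropped too.
unanchored <- excluded_names(exclude, seqs)

# Anchored: only whole-name matches count, so chr17 survives.
anchored <- excluded_names(paste0("^", exclude, "$"), seqs)

setdiff(seqs, unanchored)
setdiff(seqs, anchored)
```

The first `setdiff()` comes back empty, mirroring the `named list()` result, while the second keeps only `chr17`.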
Created on 2018-09-17 by the reprex package (v0.2.1)
Session info
``` r
devtools::session_info()
#> Session info -------------------------------------------------------------
#>  setting  value
#>  version  R version 3.5.1 (2018-07-02)
#>  system   x86_64, darwin15.6.0
#>  ui       X11
#>  language (EN)
#>  collate  en_AU.UTF-8
#>  tz       Australia/Melbourne
#>  date     2018-09-17
#> Packages -----------------------------------------------------------------
#>  package                     * version   date       source
#>  backports                     1.1.2     2017-12-13 CRAN (R 3.5.0)
#>  base                        * 3.5.1     2018-07-05 local
#>  Biobase                       2.41.2    2018-07-18 Bioconductor
#>  BiocGenerics                * 0.27.1    2018-06-17 Bioconductor
#>  BiocParallel                  1.15.12   2018-09-13 Bioconductor
#>  Biostrings                  * 2.49.1    2018-08-04 Bioconductor
#>  bitops                        1.0-6     2013-08-17 CRAN (R 3.5.0)
#>  BSgenome                    * 1.49.3    2018-07-27 Bioconductor
#>  BSgenome.Hsapiens.UCSC.hg38 * 1.4.1     2017-11-13 Bioconductor
#>  compiler                      3.5.1     2018-07-05 local
#>  datasets                    * 3.5.1     2018-07-05 local
#>  DelayedArray                  0.7.41    2018-09-14 Bioconductor
#>  devtools                      1.13.6    2018-06-27 CRAN (R 3.5.0)
#>  digest                        0.6.17    2018-09-12 CRAN (R 3.5.1)
#>  evaluate                      0.11      2018-07-17 CRAN (R 3.5.0)
#>  GenomeInfoDb                * 1.17.1    2018-05-11 Bioconductor
#>  GenomeInfoDbData              1.1.0     2017-12-16 Bioconductor
#>  GenomicAlignments             1.17.3    2018-07-18 Bioconductor
#>  GenomicRanges               * 1.33.13   2018-08-04 Bioconductor
#>  graphics                    * 3.5.1     2018-07-05 local
#>  grDevices                   * 3.5.1     2018-07-05 local
#>  grid                          3.5.1     2018-07-05 local
#>  htmltools                     0.3.6     2017-04-28 CRAN (R 3.5.0)
#>  IRanges                     * 2.15.17   2018-08-24 Bioconductor
#>  knitr                         1.20      2018-02-20 CRAN (R 3.5.0)
#>  lattice                       0.20-35   2017-03-25 CRAN (R 3.5.1)
#>  magrittr                      1.5       2014-11-22 CRAN (R 3.5.0)
#>  Matrix                        1.2-14    2018-04-13 CRAN (R 3.5.1)
#>  matrixStats                   0.54.0    2018-07-23 CRAN (R 3.5.1)
#>  memoise                       1.1.0     2017-04-21 CRAN (R 3.5.0)
#>  methods                     * 3.5.1     2018-07-05 local
#>  parallel                    * 3.5.1     2018-07-05 local
#>  Rcpp                          0.12.18   2018-07-23 CRAN (R 3.5.1)
#>  RCurl                         1.95-4.11 2018-07-15 CRAN (R 3.5.0)
#>  rmarkdown                     1.10      2018-06-11 CRAN (R 3.5.0)
#>  rprojroot                     1.3-2     2018-01-03 CRAN (R 3.5.0)
#>  Rsamtools                     1.33.5    2018-09-04 Bioconductor
#>  rtracklayer                 * 1.41.5    2018-08-31 Bioconductor
#>  S4Vectors                   * 0.19.19   2018-07-18 Bioconductor
#>  stats                       * 3.5.1     2018-07-05 local
#>  stats4                      * 3.5.1     2018-07-05 local
#>  stringi                       1.2.4     2018-07-20 CRAN (R 3.5.1)
#>  stringr                       1.3.1     2018-05-10 CRAN (R 3.5.0)
#>  SummarizedExperiment          1.11.6    2018-07-17 Bioconductor
#>  tools                         3.5.1     2018-07-05 local
#>  utils                       * 3.5.1     2018-07-05 local
#>  withr                         2.1.2     2018-03-15 CRAN (R 3.5.0)
#>  XML                           3.98-1.16 2018-08-19 CRAN (R 3.5.0)
#>  XVector                     * 0.21.3    2018-06-23 Bioconductor
#>  yaml                          2.2.0     2018-07-25 CRAN (R 3.5.1)
#>  zlibbioc                      1.27.0    2018-05-01 Bioconductor
```