Closed FelixErnst closed 5 years ago
@hpages new PR
I updated the Modstrings package and everything works fine. So this PR could be merged into master.
An additional comment:
I noticed during checking the package on two different systems, that the latest IRanges
version is required for Biostrings 2.51.4. The Bioc 3.8 version of IRanges
reports an error during creation of a Mask using Mask()
. The did not happen until latest two commits and the changes in the S4Vectors
package. I am not quite sure about the reason.
OK I've merged this. Thanks for sending this PR.
After the merge, I had to tweak the width()
method for character a little bit (otherwise things like SolexaQuality(99L)
would fail for me): https://github.com/Bioconductor/Biostrings/commit/85b7b1fce1979dc1b0924ba96b1734647d14fcf9
I'm not clear about this error about Mask creation that you are describing above. What error message do you get exactly? How can I reproduce it? sessionInfo()? Thanks!
Thanks for the merge.
Here is the info about the error. I think it is not really an error, but a dependency to a certain IRanges
version. Installing IRanges
from GitHub resolves the issue.
library(IRanges)
Mask(mask.width=5, start=c(2), width=c(3))
#> Error in get(name, envir = asNamespace(pkg), inherits = FALSE): object 'stopIfProblems' not found
sessionInfo()
#> R version 3.5.2 (2018-12-20)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 17763)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252
#> [3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
#> [5] LC_TIME=German_Germany.1252
#>
#> attached base packages:
#> [1] stats4 parallel stats graphics grDevices utils datasets
#> [8] methods base
#>
#> other attached packages:
#> [1] IRanges_2.16.0 S4Vectors_0.21.13 BiocGenerics_0.28.0
#>
#> loaded via a namespace (and not attached):
#> [1] Rcpp_1.0.0 digest_0.6.18 magrittr_1.5 evaluate_0.13
#> [5] highr_0.7 stringi_1.3.1 rmarkdown_1.11 tools_3.5.2
#> [9] stringr_1.4.0 xfun_0.5 yaml_2.2.0 compiler_3.5.2
#> [13] htmltools_0.3.6 knitr_1.21
Regarding 85b7b1f: On which system did you test this? Linux or Windows? During development of Modstrings I had a particular situation, in which Linux/MacOS and Windows behave differently due to UTF8 encoding capabilities or more precise the behavior towards UTF8 encoded characters.
(edit: I didn't get an error on Linux with the R dev version? Did I miss something or was it not included in the tests?, edit2: I did not run R CMD check, just the .test()
🤕 It broke for me as well, but I didn't catch it. Sorry for that)
The example you included in the comment above width()
works for me on Windows...
x <- rawToChar(as.raw(135L))
nchar(x, type="chars")
#> [1] 1
sessionInfo()
#> R version 3.5.2 (2018-12-20)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 17763)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252
#> [3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
#> [5] LC_TIME=German_Germany.1252
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> loaded via a namespace (and not attached):
#> [1] compiler_3.5.2 magrittr_1.5 tools_3.5.2 htmltools_0.3.6
#> [5] yaml_2.2.0 Rcpp_1.0.0 stringi_1.3.1 rmarkdown_1.11
#> [9] highr_0.7 knitr_1.21 stringr_1.4.0 xfun_0.5
#> [13] digest_0.6.18 evaluate_0.13
... and fails on Linux. (dev version aside)
> x <- rawToChar(as.raw(135L))
> nchar(x, type="chars")
Error in nchar(x, type = "chars") : invalid multibyte string, element 1
> sessionInfo()
R Under development (unstable) (2018-12-22 r75884)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 9 (stretch)
Matrix products: default
BLAS/LAPACK: /usr/lib/libopenblasp-r0.2.19.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=C LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] reprex_0.2.1
loaded via a namespace (and not attached):
[1] Rcpp_1.0.0 ps_1.3.0 withr_2.1.2 digest_0.6.18 R6_2.4.0 evaluate_0.13 rlang_0.3.1
[8] fs_1.2.6 callr_3.1.1 whisker_0.3-2 rmarkdown_1.11 tools_3.6.0 xfun_0.5 yaml_2.2.0
[15] compiler_3.6.0 processx_3.2.1 clipr_0.5.0 htmltools_0.3.6 knitr_1.22
Just fyi.
nchar(rawToChar(as.raw(135L)), type="chars")
will fail or not depending on the current value of LC_CTYPE
. I only tried this on my Linux machine where I have LC_CTYPE
set to en_US.UTF-8
(like on your Linux box) and it failed. On your Windows machine you have LC_CTYPE
set to German_Germany.1252
, which seems to be OK. The purpose of setting LC_CTYPE
to C
is to temporarily overwrite its current value to make sure that nchar(rawToChar(as.raw(byte)), type="chars")
will succeed for any value of byte
between 0 and 255.
BTW I'm not sure that also setting LC_COLLATE
to C
helps in that particular situation so I might remove this.
…proposed during review of the Modstrings package. This includes following changes: xscodes generic and default function added, use of the normal Views() constructor instead of the previously used Biostrings internal constructor function for the match(LR)Pattern function.
This is redo of PR #21, in which I made a mistake updating the fork during development. This lead to conflicts during merging and messed up history.