Open jan-glx opened 1 year ago
I think the issue is that Biostrings does not consider terminal gaps in alignments to be indels. See https://github.com/Bioconductor/pwalign/issues/2.
ATCG
is retained if you specify with.Lindels = FALSE
.
subject <- DNAStringSet(c("AAATCTCTCATCG"))
trimLRPatterns(subject = subject, Lpattern = "AAATCTCTC", Rpattern = "" , with.Lindels = FALSE, max.Lmismatch = 0.3)
#> DNAStringSet object of length 1:
#> width seq
#> [1] 4 ATCG
trimLRPatterns(subject = subject, Lpattern = "AAAATCTCTC", Rpattern = "" , with.Lindels = FALSE, max.Lmismatch = 0.3)
#> DNAStringSet object of length 1:
#> width seq
#> [1] 4 ATCG
trimLRPatterns(subject = subject, Lpattern = "AATCTCTC", Rpattern = "" , with.Lindels = FALSE, max.Lmismatch = 0.3)
#> DNAStringSet object of length 1:
#> width seq
#> [1] 6 TCATCG
Possibly. But while that issue might be considered odd behavior, this here is a bug, right?
Though by I am even more confused by your last with.Lindels = FALSE
example right now.
In the following examples I'd expect
trimLRPatterns
to always leaveATCG
:Created on 2022-11-18 by the reprex package (v2.0.1)