LN-rich closed this issue 2 years ago
Can you please provide a reproducible example? Thanks
Are you going to follow up on this? We can't help without a reproducible example.
I'd love to, but unfortunately I can't provide a reproducible example without sharing unpublished data. I'll let you know if I manage to debug it myself.
On 8 Sep 2022, at 19:08, Hervé Pagès wrote:
Are you going to follow up on this? We can't help without a reproducible example.
Well, here's one for you, and it has nothing to do with "vmatchpattern() with max.mismatch= 1 gives less results than with max.mismatch=0" (you have no evidence of that):
library(NestLink)
reads <- DNAStringSet(c("GGACTGGGTTTTT", "ACTGGGGGACTGTTTTT","ACCCTGGGTTT"))
twoPatternReadFilter(reads, "GGACTG", "TTT", maxMismatch=0)
# $reads
# DNAStringSet object of length 2:
#     width seq
# [1]    13 GGACTGGGTTTTT
# [2]    17 ACTGGGGGACTGTTTTT
#
# $patternPositions
#   leftStart1 leftEnd1 rightStart2 rightEnd2
# 1          1        6           9        11
# 2          7       12          13        15
twoPatternReadFilter(reads, "GGACTG", "TTT", maxMismatch=2)
# $reads
# DNAStringSet object of length 0
#
# $patternPositions
# [1] leftStart1 leftEnd1 rightStart2 rightEnd2
# <0 rows> (or 0-length row.names)
So it looks like NestLink::twoPatternReadFilter() uses some questionable logic to keep or drop reads. That's something you would need to discuss with the NestLink folks.
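One way to check the Biostrings side in isolation is to count matches of the left pattern directly on the same reads. This is only a minimal sketch: it uses Biostrings' vcountPattern() and the variable names exact and fuzzy are just illustrative. It says nothing about what twoPatternReadFilter() does internally.

```r
library(Biostrings)

reads <- DNAStringSet(c("GGACTGGGTTTTT", "ACTGGGGGACTGTTTTT", "ACCCTGGGTTT"))

# Matches of the left pattern per read: exact, then allowing up to 2 mismatches
exact <- vcountPattern("GGACTG", reads, max.mismatch = 0)
fuzzy <- vcountPattern("GGACTG", reads, max.mismatch = 2)

# Every exact match is also a match when mismatches are allowed,
# so the per-read counts can only stay the same or grow
stopifnot(all(fuzzy >= exact))
```

If this check passes on your own reads too, the drop in results is happening in the filtering step that sits on top of the matching, not in the matching itself.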
I googled it and found another report describing the same kind of problem I have: https://github.com/benjjneb/dada2/issues/1276 I am using the NestLink twoPatternReadFilter() function and, as stated, I get more results with max.mismatch = 0 than with any other value. Even with a max.mismatch as long as the pattern I get fewer results. Interestingly, it is the same number as with max.mismatch = 1. Is it possible that the switch to a different algorithm from the one used with max.mismatch = 0 causes this error?