Hi @jdidion, this pull request fixes issue #60 and part of issue #65 (I think).
Briefly, I added an auto-trim modifier (`trim.modifiers.AutoAdapterCutter`) that retains only the overlapping parts of paired-end reads when `--aligner insert` is used and no adapter sequences are given. The modifier uses `--insert-match-error-rate`, `--insert-max-rmp` and `--minimum-length` to control the trimming behavior.
Input:
@read1/1 some text
TTATTTGTCTCCAGCTTAGACATATCGCCT
+
##HHHHHHHHHHHHHHHHHHHHHHHHHHHH
@read1/2 other text
GCTGGAGACAAATAACAGTGGAGTAGTTTT
+
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
@read2/1
CAACAGGCCACATTAGACATATCGGATGGT
+
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
@read2/2
TGTGGCCTGTTGCAGTGGAGTAACTCCAGC
+
###HHHHHHHHHHHHHHHHHHHHHHHHHHH
@read3/1
CCAACTTGATATTAATAACATTAGACA
+
HHHHHHHHHHHHHHHHHHHHHHHHHHH
@read3/2
TGTTATTAATATCAAGTTGGCAGTG
+
#HHHHHHHHHHHHHHHHHHHHHHHH
@read4/1
GACAGGCCGTTTGAATGTTGACGGGATGTT
+
HHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
@read4/2
CATCCCGTCAACATTCAAACGGCCTGTCCA
+
HH############################
Several things:
- In the current implementation, read pairs that fail the insert-match filters (the random-match probability is too high, the insert-match error rate exceeds the threshold, or the reads themselves are too short) are returned unmodified. Some of these pairs are false negatives that still contain adapter sequences (like read2 in example 1). Is there a better way to handle them?
- I'm not sure whether the overlap filter should be controlled by `--minimum-length` or `--overlap`.
- The current implementation does not support collecting the trimmed-off sequences for adapter detection.
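For reviewers, here is a rough sketch of the idea behind the modifier. It is not the code in this PR. The names `reverse_complement` and `retain_overlap` are hypothetical, and `max_error_rate` and `min_length` only stand in for `--insert-match-error-rate` and `--minimum-length`. The sketch assumes the insert is no longer than either read (so it starts at the 5' end of both mates), counts mismatches only (no indels), and skips both the random-match-probability check and the quality strings.

```python
# Simplified illustration only; hypothetical helper names, not the Atropos API.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def retain_overlap(read1, read2, max_error_rate, min_length):
    """Trim both mates to their shared insert.

    Tries candidate insert lengths from longest to shortest and accepts the
    first one where read1[:k] matches the reverse complement of read2[:k]
    within the allowed mismatch rate. If no candidate of at least
    min_length passes, the pair is returned unchanged.
    """
    longest = min(len(read1), len(read2))
    for k in range(longest, min_length - 1, -1):
        mate = reverse_complement(read2[:k])
        mismatches = sum(a != b for a, b in zip(read1[:k], mate))
        if mismatches <= max_error_rate * k:
            return read1[:k], read2[:k]
    # No acceptable overlap: hand the pair back untouched.
    return read1, read2


# read1 from the input: the two mates share a 15 bp insert.
print(retain_overlap(
    "TTATTTGTCTCCAGCTTAGACATATCGCCT",
    "GCTGGAGACAAATAACAGTGGAGTAGTTTT",
    max_error_rate=0.0,
    min_length=10,
))
# prints ('TTATTTGTCTCCAGC', 'GCTGGAGACAAATAA')
```

The final `return read1, read2` is the fallback raised in the first point above: a pair whose overlap is shorter than `min_length` comes back with its adapter bases still attached.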