Closed MNM-TB closed 3 years ago
The amplicon file I use for matching is:
#FowrdPrim ReversePrim InserLength AuxInfo
CCATTTCAGGTGTCGTGA TTATTGACGTAGCCGTCGAG 20 LV-Lib1BC_rev
The primer sequence is usually at the beginning of the read. In your case, however, each read begins with a 20-bp random barcode sequence rather than the primer sequence. As a result, pTrimmer failed to trim the primer from many of your reads.
To make pTrimmer compatible with your sequencing data, you need to change the following definition in the code (query.c, line 8):
#define BLEN 6 ---> #define BLEN 30
Then, recompile the code.
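To check how far into each read the forward primer actually sits, a small script along these lines can help. This is only a sketch using plain exact matching on the two reads quoted in this thread; it is not how pTrimmer itself searches, and the helper name `primer_offset` and the short read labels are made up for illustration.

```python
# Forward primer taken from the amplicon file quoted in this thread.
FORWARD_PRIMER = "CCATTTCAGGTGTCGTGA"

# The two example reads from this thread, keyed by a shortened read name
# (the keys are illustrative labels, not real identifiers).
READS = {
    "7494:1075": "TTTCGGGGTGTCTATACCCCCATTTCAGGTGTCGTGACCATAAAGGCATCCTTCCAGCTCGACGGCTACGTCAA",
    "10444:1078": "AGATGGCGAGTTGTAAGGGCCATTTCAGGTGTCGTGATTTTCATTAGATCTGTGTGTTGGCTGTCTCTTATACAC",
}


def primer_offset(read: str, primer: str) -> int:
    """Return the 0-based index of the first exact primer match, or -1 if absent."""
    return read.find(primer)


for name, seq in READS.items():
    # A non-zero offset means the read starts with extra bases (here, the
    # random barcode) before the primer.
    print(name, primer_offset(seq, FORWARD_PRIMER))
```

In both example reads the exact match starts at index 19 rather than 0, which is consistent with the primer being preceded by the random barcode bases.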
As for your second read (NB502004:151:H7VF2BGXH:1:11101:10444:1078), pTrimmer can't locate the reverse primer sequence in it. That is why it failed to trim the primer and output a series of N's instead.
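The difference between the two reads can be reproduced with a short check for the reverse primer. The sketch below assumes the reverse primer appears in read 1 as its reverse complement and may run off the 3' end of the read. The function names and the `min_overlap` threshold are hypothetical choices for illustration, and the exact-match logic is a stand-in, not pTrimmer's own search.

```python
REVERSE_PRIMER = "TTATTGACGTAGCCGTCGAG"  # from the amplicon file in this thread

READ_OK = "TTTCGGGGTGTCTATACCCCCATTTCAGGTGTCGTGACCATAAAGGCATCCTTCCAGCTCGACGGCTACGTCAA"
READ_NS = "AGATGGCGAGTTGTAAGGGCCATTTCAGGTGTCGTGATTTTCATTAGATCTGTGTGTTGGCTGTCTCTTATACAC"

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq: str) -> str:
    """Reverse-complement a DNA sequence (upper-case A/C/G/T/N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_reverse_primer(read: str, primer: str, min_overlap: int = 10) -> int:
    """Return the index where the reverse-complemented primer starts, or -1.

    A full-length exact match is tried first. If that fails, a primer that is
    cut off by the end of the read is accepted as long as at least
    `min_overlap` bases of it are present (an arbitrary, illustrative cutoff).
    """
    target = revcomp(primer)
    pos = read.find(target)
    if pos != -1:
        return pos
    # Try progressively shorter primer prefixes against the end of the read.
    for k in range(len(target) - 1, min_overlap - 1, -1):
        if read.endswith(target[:k]):
            return len(read) - k
    return -1


print(revcomp(REVERSE_PRIMER))
print("read 1:", find_reverse_primer(READ_OK, REVERSE_PRIMER))
print("read 2:", find_reverse_primer(READ_NS, REVERSE_PRIMER))
```

With these inputs the first read yields a match near its 3' end, while the second read yields -1, meaning no reverse primer is found under these matching rules.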
@NB502004:151:H7VF2BGXH:1:11101:7494:1075 1:N:0:ATCTCAGG+NTCCTTAC
TTTCGGGGTGTCTATACCCCCATTTCAGGTGTCGTGACCATAAAGGCATCCTTCCAGCTCGACGGCTACGTCAA
CCATTTCAGGTGTCGTGA CTCGACGGCTACGTCAATAA
@NB502004:151:H7VF2BGXH:1:11101:10444:1078 1:N:0:ATCTCAGG+NTCCTTAC
AGATGGCGAGTTGTAAGGGCCATTTCAGGTGTCGTGATTTTCATTAGATCTGTGTGTTGGCTGTCTCTTATACAC
CCATTTCAGGTGTCGTGA
Hi!
Thank you for the great work on the tool. I'm evaluating it for our work on extracting molecular barcodes from NGS amplicon sequencing. In general it looks very promising, but I have one strange outcome: for many of my reads, the trimmed sequence is replaced with N's. Here are two consecutive reads as an example, one that works as expected and one that is replaced with N's. (For your information, the first 20 bases of every read are a random sequence in the primer, to allow for high diversity and clean clustering on the Illumina NextSeq.)
Input:
Output: