Question about the trimming step in vast-tools align

Dear Rahaleh,

In the trimming step, each read is split by default into overlapping 50-nt sub-reads using a 25-nt sliding step. For example, a 100-nt read produces 3 overlapping sub-reads (positions 1-50, 26-75, 51-100). If the data are paired-end, the sub-reads from both mates are pooled. Then, if several of these sub-reads map to junctions, only one of them, chosen at random, is counted (to avoid double counting). For that to work, the reads need a special header, which is assigned during the trimming process. This has two implications: 1) 50-nt reads are also "trimmed" (the header is simply converted); 2) you can only use the --pretrimmed option if the reads have been trimmed by vast-tools. That explains why you may be getting very different results: without that header, vast-tools would not recognize the reads properly.
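To make the splitting arithmetic concrete, here is a minimal Python sketch of the windowing described above. It is only an illustration, not the actual vast-tools code: the function name `split_read` and its parameters are hypothetical, it does not reproduce the special header that vast-tools writes, and it assumes that any leftover bases at the 3' end that do not fill a whole window are simply dropped.

```python
def split_read(seq, length=50, step=25):
    """Split a read into overlapping sub-reads.

    Hypothetical helper for illustration only (not part of vast-tools).
    Returns (start, end, subsequence) tuples with 1-based inclusive coordinates.
    Assumption: trailing bases that do not fill a full window are discarded.
    """
    pieces = []
    # Slide a window of `length` nt along the read, moving `step` nt each time.
    for start in range(0, len(seq) - length + 1, step):
        pieces.append((start + 1, start + length, seq[start:start + length]))
    return pieces


if __name__ == "__main__":
    read = "ACGT" * 25  # a made-up 100-nt read
    for start, end, sub in split_read(read):
        print(start, end, len(sub))
```

With the defaults, a 100-nt read gives the three windows mentioned above (1-50, 26-75, 51-100), and a 50-nt read gives a single window covering the whole read.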

I apologize that this wasn't properly explained in the README; I'll add a description now. In the meantime, whenever you are in doubt, please have a look at the Supplementary Information of the references cited in vast-tools, which contains extra details.

Best, Manu

