Closed by sert23 4 years ago
I see a probably related behavior for paired-end reads: for every pair, the first mate is simulated on the forward strand, while the second mate is always on the reverse strand. As I understand it, the orientation should be equally distributed, so approximately half of the pairs should have the first read on the reverse strand. To clarify: I don't mean a switch from FR to RF; both mates should still point toward each other. But the read that is sequenced first can also originate from the reverse strand.
Example:
Current output, having all pairs in the same directional order:
read_id | SAM FLAG | begin position |
---|---|---|
r1/1 | 99 | 1000 |
r1/2 | 147 | 1400 |
r2/1 | 99 | 2000 |
r2/2 | 147 | 2400 |
r3/1 | 99 | 3000 |
r3/2 | 147 | 3400 |
r4/1 | 99 | 4000 |
r4/2 | 147 | 4400 |
Suggested output, having an equally distributed directional order:
read_id | SAM FLAG | begin position |
---|---|---|
r1/1 | 99 | 1000 |
r1/2 | 147 | 1400 |
r2/1 | 83 | 2400 |
r2/2 | 163 | 2000 |
r3/1 | 99 | 3000 |
r3/2 | 147 | 3400 |
r4/1 | 83 | 4400 |
r4/2 | 163 | 4000 |
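
For readers less familiar with the flag values in the tables above, here is a minimal sketch that decodes them using the standard SAM flag bits. The helper name `describe_flag` is made up for illustration and is not part of the simulator.

```python
# Standard SAM flag bits, as defined in the SAM specification.
PAIRED = 0x1
PROPER_PAIR = 0x2
READ_REVERSE = 0x10
MATE_REVERSE = 0x20
FIRST_IN_PAIR = 0x40
SECOND_IN_PAIR = 0x80


def describe_flag(flag):
    """Hypothetical helper: summarize mate number and strands for a SAM flag."""
    if flag & FIRST_IN_PAIR:
        mate = "first"
    elif flag & SECOND_IN_PAIR:
        mate = "second"
    else:
        mate = "unknown"
    return {
        "mate": mate,
        "strand": "-" if flag & READ_REVERSE else "+",
        "mate_strand": "-" if flag & MATE_REVERSE else "+",
    }


if __name__ == "__main__":
    for flag in (99, 147, 83, 163):
        print(flag, describe_flag(flag))
```

Flags 99 and 147 describe a pair whose first mate is on the forward strand; 83 and 163 describe a pair whose first mate is on the reverse strand. In both cases the mates face each other (FR orientation).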
Although it might look that way in the tables above, this is not only about switching the read_id: depending on the model, the base call qualities may also be affected (the average quality of the second read might be worse than that of the first read). The qualities of the mates also cannot simply be swapped, since simulated sequencing errors might result in a lower quality for the respective base call.
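
To make the request concrete, here is a rough sketch of how the strand could be chosen per fragment before the reads are generated. This is not the simulator's actual code; `simulate_pair`, its parameters, and the 0-based positions are assumptions for illustration only. The point is that the strand is drawn first, so any per-mate error and quality model would then be applied to read 1 and read 2 as they are actually sequenced, rather than swapping labels afterwards.

```python
import random

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def simulate_pair(reference, start, fragment_len, read_len, rng):
    """Hypothetical sketch: draw one FR read pair from a random strand.

    Positions are 0-based leftmost reference coordinates.
    Returns ((read1, flag1, pos1), (read2, flag2, pos2)).
    """
    fragment = reference[start:start + fragment_len]
    left = start
    right = start + fragment_len - read_len

    # Decide with probability 0.5 whether the fragment is read from the
    # reverse strand, i.e. whether read 1 originates from the (-) strand.
    first_on_reverse = rng.random() < 0.5
    if first_on_reverse:
        fragment = revcomp(fragment)

    # Read 1 is the start of the chosen strand; read 2 is read from the
    # opposite end on the other strand, so the mates still face each other.
    read1 = fragment[:read_len]
    read2 = revcomp(fragment)[:read_len]

    # A per-mate error/quality model (assumed, not shown) would be applied
    # here, to read1 and read2 in sequencing orientation.

    if first_on_reverse:
        return (read1, 83, right), (read2, 163, left)
    return (read1, 99, left), (read2, 147, right)


if __name__ == "__main__":
    rng = random.Random(1)
    ref = "".join(rng.choice("ACGT") for _ in range(500))
    for start in (0, 100, 200, 300):
        (r1, f1, p1), (r2, f2, p2) = simulate_pair(ref, start, 150, 50, rng)
        print(f1, p1, f2, p2)
```

With this approach, roughly half of the pairs get flags 99/147 and the other half 83/163, matching the suggested output.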
fixed in 2ea9720b6683aa1e1a84d206223219414302e810 (better late than never?)
After aligning and calling SNPs with mpileup, I noticed that no (-) strand reads were being produced (reads from the reverse sequence of the provided FASTA). Is there an option for this?
Thanks