zstephens / neat-genreads

NEAT read simulation tools

No reverse strand? #28

Closed by sert23 4 years ago

sert23 commented 7 years ago

After aligning and SNP calling with mpileup, I realized that no (-) strand reads were being produced (that is, reads from the reverse complement of the provided FASTA). Is there an option for this?

Thanks

tloka commented 6 years ago

I see a probably related behavior for paired-end reads: for every pair, the first mate is simulated on the forward strand while the second mate is always on the reverse strand. As I understand it, this should be evenly distributed, so approximately half of the pairs should have the first read on the reverse strand. To clarify: I don't mean a switch from FR to RF; both mates should still point toward each other. But the read that is sequenced first can also come from the reverse strand.

Example:

Current output, having all pairs in the same directional order:

read_id  SAM FLAG  begin position
r1/1     99        1000
r1/2     147       1400
r2/1     99        2000
r2/2     147       2400
r3/1     99        3000
r3/2     147       3400
r4/1     99        4000
r4/2     147       4400

Suggested output, having an equally distributed directional order:

read_id  SAM FLAG  begin position
r1/1     99        1000
r1/2     147       1400
r2/1     83        2400
r2/2     163       2000
r3/1     99        3000
r3/2     147       3400
r4/1     83        4400
r4/2     163       4000
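
For readers less familiar with SAM flags, here is a minimal sketch that decodes the four flag values used in the examples above. The bit values come from the SAM specification; the helper name `describe_flag` is only illustrative and is not part of NEAT.

```python
# SAM flag bits, as defined in the SAM specification.
PAIRED = 0x1
PROPER_PAIR = 0x2
READ_REVERSE = 0x10
MATE_REVERSE = 0x20
FIRST_IN_PAIR = 0x40
SECOND_IN_PAIR = 0x80


def describe_flag(flag):
    """Return (mate, read strand, mate strand) for a paired-end SAM flag.

    Illustrative helper only, not a NEAT function.
    """
    if flag & FIRST_IN_PAIR:
        mate = "first"
    elif flag & SECOND_IN_PAIR:
        mate = "second"
    else:
        mate = "unknown"
    strand = "-" if flag & READ_REVERSE else "+"
    mate_strand = "-" if flag & MATE_REVERSE else "+"
    return mate, strand, mate_strand


for flag in (99, 147, 83, 163):
    print(flag, describe_flag(flag))
```

Flags 99/147 describe a pair whose first mate is on the forward strand, while 83/163 describe a pair whose first mate is on the reverse strand. In both cases the mates face each other.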

Although the tables above might suggest it, this is not only a matter of swapping the read_id, since the base call qualities may also differ depending on the model (for example, the average quality of the second read might be worse than that of the first read). The qualities of the mates also cannot simply be exchanged, since simulated sequencing errors might result in a lower quality for the respective base call.
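
To illustrate the idea, here is a rough sketch of how the orientation could be chosen per fragment before qualities and errors are applied. It is not NEAT's actual code: the names `simulate_pair` and `reverse_complement` are hypothetical, and it assumes the fragment is given as a forward-strand string.

```python
import random

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse-complement a DNA string (hypothetical helper)."""
    return seq.translate(_COMPLEMENT)[::-1]


def simulate_pair(fragment, read_len, rng):
    """Pick the strand of mate 1 at random for one forward-strand fragment.

    Returns (mate1_seq, mate2_seq, mate1_is_reverse). The mates always
    face each other; only which one is sequenced first changes.
    """
    left = fragment[:read_len]                        # forward-strand end
    right = reverse_complement(fragment[-read_len:])  # reverse-strand end
    if rng.random() < 0.5:
        return left, right, False   # corresponds to flags 99 / 147
    return right, left, True        # corresponds to flags 83 / 163


rng = random.Random(0)
mate1, mate2, mate1_is_reverse = simulate_pair("AAAAACCCCC", 4, rng)
print(mate1, mate2, mate1_is_reverse)
```

With this ordering, the quality model and the sequencing errors for the first and second read would be applied after the orientation is chosen, so each mate would get the qualities that belong to its position in the pair.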

zstephens commented 4 years ago

fixed in 2ea9720b6683aa1e1a84d206223219414302e810 (better late than never?)