zstephens / neat-genreads

NEAT read simulation tools

Sequencing error rate for long reads not as specified #78

Closed: reinator closed this issue 3 years ago

reinator commented 3 years ago

Hi! I simulated long reads (15 kb) with an error rate (-E) of 0.10. Correct me if I am wrong, but I expected to obtain reads with a mean quality of about Q10 (the Phred score that corresponds to a 0.10 probability of error). When I run FastQC, however, the mean quality of my reads is as high as Q30, as the image shows. (image)

Any possible explanation?
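For readers following along, the Q10 figure in the question comes from the standard Phred relation Q = -10 * log10(p). The snippet below is a minimal sketch of that conversion only; the helper names `error_prob_to_phred` and `phred_to_error_prob` are made up for illustration and are not part of NEAT or FastQC.

```python
import math


def error_prob_to_phred(p):
    """Convert a per-base error probability to a Phred quality score."""
    return -10 * math.log10(p)


def phred_to_error_prob(q):
    """Convert a Phred quality score back to a per-base error probability."""
    return 10 ** (-q / 10)


if __name__ == "__main__":
    # Error rate requested via -E in the question above.
    print(error_prob_to_phred(0.10))
    # Mean quality reported in the question above.
    print(phred_to_error_prob(30))
```

Under this relation, a 0.10 error probability maps to Q10, while Q30 corresponds to an error probability of 0.001, which is the hundredfold gap the question is pointing at.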

zstephens commented 3 years ago

Greetings!

For some reason I had decided that rescaling the error rates should not correspondingly rescale the quality scores. I can't remember why I did this. At the very least I'll add an option to facilitate this. Stay tuned!

zstephens commented 3 years ago

Added the "--rescale-qual" input option in commit 3c83f1e64bbc39159628f4ad8a56f58bd6399fcb. When used, it rescales quality scores by the same multiplier specified via -E.
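To illustrate what rescaling quality scores by a multiplier could look like, here is a rough sketch. It is not the code from the commit above: the function name `rescale_quals`, the choice to apply the multiplier in error-probability space, and the clipping at a probability of 1.0 are all assumptions made for this example.

```python
import math


def rescale_quals(quals, target_error_rate):
    """Hypothetical helper: scale Phred scores so the mean error
    probability they imply matches target_error_rate."""
    # Convert each Phred score to the error probability it represents.
    probs = [10 ** (-q / 10) for q in quals]
    current_rate = sum(probs) / len(probs)
    # Assumed approach: one multiplier applied to every error probability.
    multiplier = target_error_rate / current_rate
    # A probability cannot exceed 1.0, so clip before converting back.
    scaled = [min(p * multiplier, 1.0) for p in probs]
    # Convert back to integer Phred scores, never going below Q0.
    return [max(0, round(-10 * math.log10(p))) for p in scaled]


if __name__ == "__main__":
    # Reads that all sit at Q30, rescaled toward a 0.10 error rate.
    print(rescale_quals([30, 30, 30, 30], 0.10))
```

With uniform Q30 input and a 0.10 target, every score in this sketch moves to Q10. Whether the actual option behaves the same way on mixed-quality reads would need to be checked against the commit itself.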

Feel free to reopen this issue if it doesn't work as expected.