alyssafrazee / polyester

Bioconductor package "polyester", devel version. RNA-seq read simulator.
http://biorxiv.org/content/early/2014/12/12/006015

[feature request] Explicitly label mutated bases in simulated reads. #52

Open SaraEl-Metwally opened 6 years ago

SaraEl-Metwally commented 6 years ago

Hi, I would like to know the locations of the errors in the simulated read files. Is there any way to detect the error locations, e.g. N characters, lowercase letters, etc.? Thanks!

alyssafrazee commented 6 years ago

Hi, thanks for the request! Right now, the only way to detect the mutated bases is to match each read in the output FASTA file back to the input reference file using the read's label. The label of each simulated read tells you which transcript and which positions generated the read, so you can write a script that compares the read to that stretch of the reference and reports the mismatches.
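For anyone who wants a starting point, here is a minimal Python sketch of such a script. It is not part of polyester. It assumes read labels look like `read1/transcript_id;mate1:START-END;mate2:START-END` with 1-based inclusive positions, so check that against your own output files first. The function names (`parse_label`, `find_mutated_positions`) are made up for illustration. Because a read may have been reverse-complemented relative to the transcript, the sketch tries both orientations and keeps the one with fewer mismatches. It only detects substitutions, not insertions or deletions.

```python
import re

# Assumed label layout (verify against your own simulated FASTA files):
#   read1/transcript_id;mate1:START-END;mate2:START-END
LABEL_RE = re.compile(
    r"^(?P<read>[^/]+)/(?P<tx>.+?);"
    r"mate1:(?P<s1>\d+)-(?P<e1>\d+);"
    r"mate2:(?P<s2>\d+)-(?P<e2>\d+)$"
)

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def parse_label(label):
    """Split a read label into (transcript_id, mate1_range, mate2_range)."""
    m = LABEL_RE.match(label.lstrip(">").strip())
    if m is None:
        raise ValueError(f"label does not match the assumed layout: {label!r}")
    mate1 = (int(m["s1"]), int(m["e1"]))
    mate2 = (int(m["s2"]), int(m["e2"]))
    return m["tx"], mate1, mate2


def _mismatch_offsets(read, segment):
    """0-based offsets where read and reference segment differ."""
    pairs = zip(read.upper(), segment.upper())
    return [i for i, (a, b) in enumerate(pairs) if a != b]


def find_mutated_positions(read, transcript_seq, start, end):
    """Compare a read to transcript_seq[start..end] (1-based, inclusive).

    Returns (orientation, positions), where orientation is "+" if the read
    matches the transcript as written and "-" if its reverse complement
    matches better. Positions are 1-based transcript coordinates.
    """
    segment = transcript_seq[start - 1:end]
    forward = _mismatch_offsets(read, segment)
    reverse = _mismatch_offsets(reverse_complement(read), segment)
    if len(forward) <= len(reverse):
        return "+", [start + i for i in forward]
    return "-", [start + i for i in reverse]


if __name__ == "__main__":
    transcript = "ACGTACGTAC"  # toy reference sequence
    tx, (s1, e1), _ = parse_label("read1/tx1;mate1:3-8;mate2:20-25")
    # Toy read covering positions 3-8 with one substituted base.
    print(tx, find_mutated_positions("GTCCGT", transcript, s1, e1))
```

To use it on real data, load the reference transcripts into a dict keyed by transcript ID, then call `parse_label` on each read header and `find_mutated_positions` on each mate with its own range.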

We aren't planning to add new features in the foreseeable future, but I'll keep this issue open as a feature request in case development picks back up.