bcgsc / NanoSim

Nanopore sequence read simulator

solve reported issue #155 - fixed read lengths #157

Closed SaberHQ closed 2 years ago

SaberHQ commented 2 years ago

Thanks to Haoran, who reported the issue of fixed transcriptome read lengths generated by NanoSim when running with a single thread (#155).

The issue arose because the 2-dimensional kernel density estimates (KDEs) were generated all at once, before any reference transcript was selected. As a result, a single-threaded run produced similar read lengths. Following Haoran's suggestion, I adjusted the code so that it generates the 2D KDE every time we select a reference transcript to simulate reads from, ensuring that reads simulated from a given reference transcript no longer end up with similar lengths.
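For readers who want a concrete picture of the difference, here is a toy sketch. It is not NanoSim's code. It assumes the 2D KDE models the joint distribution of reference transcript length and read length, and it uses a small pure-Python Gaussian product-kernel KDE. The training pairs, the bandwidths, and the function names `sample_read_length`, `simulate_up_front`, and `simulate_per_reference` are all hypothetical. The "up front" variant is a deliberately simplified stand-in for the behaviour described above, not a reconstruction of the old code.

```python
import math
import random

# Hypothetical training pairs: (reference transcript length, observed read length).
TRAINING = [(500, 420), (800, 610), (1200, 950),
            (1500, 1100), (2000, 1700), (3000, 2100)]
BW_REF = 200.0   # assumed bandwidth along the reference-length axis
BW_READ = 100.0  # assumed bandwidth along the read-length axis


def sample_read_length(ref_len, rng):
    """Draw one read length from the toy 2D KDE, conditioned on ref_len."""
    # Weight each training point by how close its reference length is to ref_len.
    weights = [math.exp(-0.5 * ((ref_len - r) / BW_REF) ** 2) for r, _ in TRAINING]
    # Pick one kernel, then sample around that kernel's read length.
    _, centre = rng.choices(TRAINING, weights=weights, k=1)[0]
    length = rng.gauss(centre, BW_READ)
    # Keep the result between 1 and the reference length.
    return int(min(max(length, 1), ref_len))


def simulate_up_front(refs, n_reads, rng):
    """Simplified stand-in: sample once before any transcript is selected."""
    _, first_len = refs[0]
    fixed = sample_read_length(first_len, rng)  # drawn a single time
    reads = []
    for _ in range(n_reads):
        name, ref_len = rng.choice(refs)
        reads.append((name, min(fixed, ref_len)))  # the same draw is reused
    return reads


def simulate_per_reference(refs, n_reads, rng):
    """Approach described in this PR: sample again for each selected transcript."""
    reads = []
    for _ in range(n_reads):
        name, ref_len = rng.choice(refs)
        reads.append((name, sample_read_length(ref_len, rng)))
    return reads


if __name__ == "__main__":
    refs = [("tx_a", 900), ("tx_b", 1600), ("tx_c", 2500)]  # hypothetical transcripts
    stale = simulate_up_front(refs, 300, random.Random(1))
    fresh = simulate_per_reference(refs, 300, random.Random(1))
    print("distinct lengths, sampled up front:     ", len({l for _, l in stale}))
    print("distinct lengths, sampled per reference:", len({l for _, l in fresh}))
```

In this sketch the per-reference variant costs one extra conditional draw per read, which is cheap here. Whether that holds for the real KDE models depends on their size and is not something the toy example can show.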

Please note that I analyzed several approaches to solving this issue, and I believe the one suggested by Haoran (KDE generation for each reference transcript) is simple and efficient enough.