First of all, thank you for making ISS. I find it very fast and easy to use, especially because it ships with error models.
When trying it out, I noticed that it uses a lot of RAM, which seemed odd for a read simulator, especially since memory usage grows steadily over time. However, I think I found the reason and a fix.
When generating reads, ISS first accumulates all reads in a Python list in RAM. Only after all reads have been generated does it write them to disk.
It would be much more memory efficient to write each read to disk immediately after it is generated, so that is what I did: I moved the read generation code into a generator function `reads_generator`, which I pass to `to_fastq`.
As a result, the memory usage is now small and stays constant during generation.
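To illustrate, here is a rough sketch of the pattern (the read content and both function bodies are simplified stand-ins for this example, not ISS's actual implementation):

```python
def reads_generator(n_reads):
    """Yield FASTQ records one at a time instead of collecting them in a list."""
    for i in range(n_reads):
        # Placeholder record; ISS would yield real simulated reads here.
        yield f"@read_{i}\nACGT\n+\n!!!!"

def to_fastq(reads, path):
    """Write reads to disk as they are produced, so only one read is in memory."""
    with open(path, "w") as handle:
        for read in reads:
            handle.write(read + "\n")

to_fastq(reads_generator(1000), "reads.fastq")
```

Because the generator is consumed lazily, the full set of reads never exists in memory at once; peak usage stays at roughly one read regardless of how many reads are simulated.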