Closed by alyssafrazee 8 years ago
Idea: automate this in the function call, i.e. if the count matrix is too big, serialize the simulations.
I'm facing the same problem while working with chromosomes that have more transcripts, like chr2. Have you found a solution to this issue?
@shrukane -- thanks for commenting. We haven't had time to implement any of our ideas for solutions yet, but we should be able to address this soon! In the meantime, a workaround is to simulate from smaller sections of the chromosome. (e.g., break the fasta or gtf file for chromosome 2 into smaller sub-files, then run the simulate_experiment() function once for each sub-file).
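The split-and-simulate workaround could be sketched roughly like this. `split_gtf()` is a helper name I made up (not part of polyester), and it naively assumes every line carries a `transcript_id` attribute (real GTFs may have comment/header lines you'd filter first); the `simulate_experiment()` arguments follow polyester's documented interface but the file names are placeholders:

```r
# Made-up helper: group a GTF's lines by transcript_id into chunks of at
# most `per_chunk` transcripts, so each chunk can be simulated separately.
split_gtf <- function(gtf_lines, per_chunk = 500) {
  tx <- sub('.*transcript_id "([^"]+)".*', "\\1", gtf_lines)
  split(gtf_lines, ceiling(match(tx, unique(tx)) / per_chunk))
}

# usage (assumes polyester is installed; paths are placeholders):
# chunks <- split_gtf(readLines("chr2.gtf"))
# for (i in seq_along(chunks)) {
#   subfile <- sprintf("chr2_part%02d.gtf", i)
#   writeLines(chunks[[i]], subfile)
#   polyester::simulate_experiment(gtf = subfile, seqpath = "chr2_seqs",
#                                  outdir = sprintf("sim_part%02d", i))
# }
```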
Bummer, it puts an upper limit on the number of reads you can simulate as well.
Yes, since the number of reads is directly proportional to the number of nucleotides in the simulation. Similar to the solution above, if you need more reads than you can hold in memory, you can run the function multiple times.
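A rough sketch of the multiple-runs idea: pick a per-run read budget that fits in memory, loop over as many runs as you need with a different seed each time, and concatenate the per-sample output files afterwards. `seed` and `outdir` are documented `simulate_experiment()` arguments, but the numbers and paths below are placeholders:

```r
wanted  <- 5e7    # total reads you want across all runs
per_run <- 2e7    # a per-run budget that stays under R's vector limit
n_runs  <- ceiling(wanted / per_run)

# for (i in seq_len(n_runs)) {
#   polyester::simulate_experiment("chr2.fa", gtf = "chr2.gtf",
#                                  seed = i,
#                                  outdir = sprintf("batch%02d", i))
# }
# then concatenate each sample's fasta files across the batch*/ directories
```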
In the add_error function, the call to unlist() means that at most 2^31 - 1 nucleotides total can be simulated in the experiment, which limits the number of reads you can simulate (the exact limit depends on read length). The reason is that R can only store vectors with fewer than 2^31 entries.
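For a concrete sense of the ceiling, the arithmetic works out like this (100 bp is polyester's default `readlen`):

```r
# Total simulated nucleotides must stay below R's vector-length limit.
max_nucleotides <- 2^31 - 1
read_length <- 100                         # polyester's default readlen
max_reads <- floor(max_nucleotides / read_length)
max_reads                                  # roughly 21.5 million reads at 100bp
```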
I think it would be good if we could write the code differently so we don't run into the 2^31 limit quite so quickly.
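One possible shape for that rewrite, as a minimal sketch: instead of unlist()-ing every read into a single vector, process reads in fixed-size chunks so no intermediate vector approaches 2^31 - 1 elements. `add_error_chunked()` is a made-up name, not polyester's actual `add_error`, and the uniform-substitution error model here is a toy stand-in for the real one:

```r
# Toy chunked error model: each chunk's unlist() touches at most
# chunk_size reads' worth of nucleotides, never the whole experiment.
add_error_chunked <- function(reads, error_rate = 0.005, chunk_size = 1e6) {
  bases <- c("A", "C", "G", "T")
  result <- character(length(reads))
  for (start in seq(1, length(reads), by = chunk_size)) {
    idx <- start:min(start + chunk_size - 1, length(reads))
    chars <- strsplit(reads[idx], "")        # nucleotides for this chunk only
    flat <- unlist(chars)                    # bounded by chunk_size * readlen
    err <- runif(length(flat)) < error_rate
    # substitute a random base at error positions (may redraw the same base)
    flat[err] <- sample(bases, sum(err), replace = TRUE)
    result[idx] <- vapply(split(flat, rep(idx, lengths(chars))),
                          paste, character(1), collapse = "")
  }
  result
}
```

The chunk size becomes a knob: it trades peak memory against the overhead of reassembling reads per chunk.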