There is a chunk of code (below) that appears many times. It appears to be one of the longer steps in reproducing the data. Is there any reason why we could not split the features into subsets (1000 at a time?) and parallelize? I am considering doing this for a dataset I have with whole-transcriptome data (upwards of 20k features).
Thank you. I didn't optimize this, but I think you can do it in parallel. You could divide your features into several blocks and then process the blocks simultaneously.
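A minimal sketch of the block-wise idea in Python, assuming the computation for each feature is independent of the other features (the original chunk is not shown here, so that is an assumption to check first). The names `split_into_blocks`, `process_block`, and `run_parallel` are hypothetical, and the body of `process_block` is only a placeholder for the real per-feature step.

```python
from concurrent.futures import ProcessPoolExecutor


def split_into_blocks(n_features, block_size=1000):
    """Return (start, stop) index pairs that cover range(n_features)."""
    return [(start, min(start + block_size, n_features))
            for start in range(0, n_features, block_size)]


def process_block(bounds):
    """Hypothetical stand-in for the slow per-feature step.

    Replace the placeholder arithmetic with the real computation
    applied to features start..stop-1.
    """
    start, stop = bounds
    return [i * i for i in range(start, stop)]  # placeholder work


def run_parallel(n_features, block_size=1000, workers=4):
    """Process all blocks in worker processes and merge results in order."""
    blocks = split_into_blocks(n_features, block_size)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the same order as the input blocks.
        per_block = list(pool.map(process_block, blocks))
    return [value for block in per_block for value in block]


if __name__ == "__main__":
    results = run_parallel(20_000, block_size=1000, workers=4)
    print(len(results))
```

With 20k features and blocks of 1000 this gives 20 blocks. Whether it is actually faster depends on how expensive each block is relative to the cost of sending data to the worker processes, so it is worth timing a couple of block sizes.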