uwmisl / poretitioner

https://misl.cs.washington.edu

Reduce segmenter memory usage - low hanging fruit #30

Closed (kdoroschak closed this issue 4 years ago)

kdoroschak commented 4 years ago

To optimize poretitioner for low-memory machines, we want to avoid reading in the entire signal and voltage datasets, and instead only look at regions where the voltage is sufficiently negative to accept a capture.

The voltage across the flowcell is the same for all channels at all times. So if we find the timepoints of these negative-voltage intervals once, we can read in only the matching segments of current for each channel rather than the entire time series (a ~30% reduction in memory without changing anything else, based on a 15s flip frequency).
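A rough sketch of the idea, not the existing poretitioner code: find the index ranges where the voltage is at or below a threshold, then slice only those ranges out of the current. The names `find_capture_windows` and `read_capture_segments` and the threshold value are hypothetical, and plain NumPy arrays stand in for the on-disk datasets (the assumption being that the real datasets can be sliced the same way without loading everything first).

```python
import numpy as np


def find_capture_windows(voltage, threshold):
    """Return (start, end) index pairs, end exclusive, where voltage <= threshold."""
    mask = np.asarray(voltage) <= threshold
    # Pad with False on both sides so every run has a rising and a falling edge.
    padded = np.concatenate(([False], mask, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    # Even-indexed edges are run starts, odd-indexed edges are run ends.
    return [(int(s), int(e)) for s, e in zip(edges[0::2], edges[1::2])]


def read_capture_segments(signal, windows):
    """Slice only the requested windows out of a channel's current."""
    return [np.asarray(signal[start:end]) for start, end in windows]


# Toy data: the voltage flips between +140 and -180 (made-up values).
voltage = np.array([140, 140, -180, -180, -180, 140, -180, -180])
signal = np.arange(len(voltage), dtype=float)

windows = find_capture_windows(voltage, threshold=-100)  # hypothetical threshold
segments = read_capture_segments(signal, windows)
```

Because the voltage is shared across channels, the windows would only need to be computed once and could then be reused for every channel's current.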

Need to rewrite find_peptides, _find_peptides_helper, and parallel_find_peptides.

It would also be useful to profile the memory used by reading the entire voltage time series into memory, and to break it into sub-parts if it is more than a few hundred MB.
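One possible way to break the voltage into sub-parts, sketched under the same assumptions (hypothetical function names, NumPy arrays standing in for sliceable on-disk datasets): estimate the in-memory size from the sample count and dtype, and scan the voltage chunk by chunk, merging windows that straddle a chunk boundary.

```python
import numpy as np


def estimated_bytes(n_samples, dtype):
    """Approximate size of the fully loaded array, without loading it."""
    return n_samples * np.dtype(dtype).itemsize


def find_capture_windows_chunked(voltage, threshold, chunk_size):
    """Find (start, end) windows where voltage <= threshold, one chunk at a time."""
    windows = []
    for offset in range(0, len(voltage), chunk_size):
        # Only this slice is held in memory at once.
        chunk = np.asarray(voltage[offset:offset + chunk_size])
        mask = chunk <= threshold
        padded = np.concatenate(([False], mask, [False]))
        edges = np.flatnonzero(padded[1:] != padded[:-1])
        for s, e in zip(edges[0::2], edges[1::2]):
            start, end = offset + int(s), offset + int(e)
            if windows and windows[-1][1] == start:
                # The run continues from the previous chunk: extend it.
                windows[-1] = (windows[-1][0], end)
            else:
                windows.append((start, end))
    return windows


voltage = np.array([140, -180, -180, -180, -180, 140, -180, -180], dtype=np.int16)
size = estimated_bytes(len(voltage), voltage.dtype)
chunked = find_capture_windows_chunked(voltage, threshold=-100, chunk_size=3)
```

The chunk size trades peak memory against the number of reads; the result should be the same as scanning the whole series at once.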