Other options to reduce memory use in merge are:
Use a single seq sorted by tid, repeat. Currently, we partition into a Table keyed by tid, repeat and then sort each entry by position. Instead, we could sort by tid, then repeat, then position in one huge seq and send only the relevant slice of that seq to cluster.
Process unplaced reads separately. 70% of reads are unplaced, and these don't help in merge for clustering. Need input from @hdashnow on how to handle this. This is likely the best avenue, as it could reduce the memory footprint by roughly an additional 70%.
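
To make the two ideas concrete, here is a minimal sketch in Python (for illustration only, not the project's actual code). The `Read` record, the helper names `split_unplaced` and `iter_groups`, and the convention that `tid < 0` marks an unplaced read are all assumptions made for this example.

```python
from typing import Iterator, List, NamedTuple, Tuple


class Read(NamedTuple):
    tid: int       # reference index; tid < 0 is ASSUMED to mean "unplaced"
    repeat: str    # repeat unit
    position: int


def split_unplaced(reads: List[Read]) -> Tuple[List[Read], List[Read]]:
    """Separate placed from unplaced reads so only placed reads are sorted and clustered."""
    placed = [r for r in reads if r.tid >= 0]
    unplaced = [r for r in reads if r.tid < 0]
    return placed, unplaced


def iter_groups(placed: List[Read]) -> Iterator[Tuple[Tuple[int, str], int, int]]:
    """Sort one big list in place by (tid, repeat, position), then yield
    ((tid, repeat), start, end) index ranges, one per contiguous group.

    Each range placed[start:end] is what would be handed to the clustering
    step, so no per-group Table has to be built.
    """
    placed.sort(key=lambda r: (r.tid, r.repeat, r.position))
    start = 0
    for i in range(1, len(placed) + 1):
        # Close the current group at the end of the list or when the key changes.
        if i == len(placed) or (placed[i].tid, placed[i].repeat) != (
            placed[start].tid,
            placed[start].repeat,
        ):
            yield (placed[start].tid, placed[start].repeat), start, i
            start = i
```

A caller would loop over `iter_groups(placed)` and pass `placed[start:end]` (or just the index pair, to avoid copying) to the clustering step, while the unplaced reads are handled in whatever separate pass is agreed on.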