Something we brought up in today's discussion:

1. Add a parameter (`-H`, default 1000) that sets the threshold for ignoring seeds.
2. Remove all seeds above the threshold from the sorted seed vector (before the index vector is built) without increasing peak memory.

I guess 2 could be done in one of two ways: either write the vector to a file and drop those seeds as we read it back in, or iterate over the sorted vector once more and assign seeds above the threshold a specific value so that, when the vector is re-sorted, they end up at the end. Then remove those elements from the vector (can this also be done in a way that frees the space of those slots?).
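
For the in-memory idea, here is a minimal sketch of one possible variant. Instead of marking seeds and re-sorting, it compacts the already sorted vector in place with a single two-pointer pass, which keeps the order and needs no extra buffer. The `Seed` struct, the function name `drop_abundant_seeds`, and the reading of "above the threshold" as "a hash that occurs more than `max_count` times" are assumptions for illustration, not the actual code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical seed record; the real layout is not given in this thread.
struct Seed {
    std::uint64_t hash;
    std::uint32_t position;
};

// Assumes `seeds` is sorted by hash. Drops every run of equal hashes that is
// longer than max_count, compacting the survivors toward the front in place.
// Returns the number of seeds removed.
std::size_t drop_abundant_seeds(std::vector<Seed>& seeds, std::size_t max_count) {
    const std::size_t n = seeds.size();
    std::size_t write = 0;      // next slot to fill with a kept seed
    std::size_t run_start = 0;  // first element of the current run
    while (run_start < n) {
        // Find the end of the run of seeds sharing the same hash.
        std::size_t run_end = run_start + 1;
        while (run_end < n && seeds[run_end].hash == seeds[run_start].hash) {
            ++run_end;
        }
        // Keep the run only if it is not too abundant.
        if (run_end - run_start <= max_count) {
            for (std::size_t i = run_start; i < run_end; ++i) {
                seeds[write++] = seeds[i];
            }
        }
        run_start = run_end;
    }
    // Cut off the tail; this changes size() but not capacity().
    seeds.erase(seeds.begin() + static_cast<std::ptrdiff_t>(write), seeds.end());
    // Non-binding request to release the unused capacity. Implementations
    // usually do this by allocating a smaller buffer and moving the elements,
    // so both buffers can exist for a moment.
    seeds.shrink_to_fit();
    return n - write;
}
```

On the question of freeing the slots: `erase` alone only shrinks the size, and `shrink_to_fit` is merely a request that typically reallocates. Whether that short overlap of old and new buffer matters for peak memory would have to be measured, and the file-based route may be the safer option if it does.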