Re-iterate over the input file and write reads to the output file if they are in our subsampled list.
This subsampled list should be something like a set or bitvector to allow constant-time lookup, since we perform this check for every read in the input file, which could number in the tens of millions.
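As a rough illustration of this second pass, here is a minimal Python sketch. It makes several assumptions that are not stated above: the input is an uncompressed FASTQ file with exactly four lines per record, reads are identified by their zero-based position in the file, and the subsampled list is a Python `set` of those positions. The function name `write_subsample` and its parameters are hypothetical.

```python
from itertools import islice


def write_subsample(input_path, output_path, keep):
    """Copy reads whose zero-based index is in `keep`; return how many were written.

    Assumes an uncompressed FASTQ file with exactly four lines per record.
    """
    written = 0
    with open(input_path) as fin, open(output_path, "w") as fout:
        index = 0
        while True:
            # Pull the next four lines, i.e. one record under the assumption above.
            record = list(islice(fin, 4))
            if not record:
                break
            # Set membership is O(1) on average, so this check stays cheap
            # even when it runs once per read across the whole file.
            if index in keep:
                fout.writelines(record)
                written += 1
            index += 1
    return written
```

Calling `write_subsample("in.fq", "out.fq", {0, 2})` would copy the first and third reads. A plain list in place of the set would still work, but each membership check would then scan the list, which adds up quickly over millions of reads.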