mbhall88 / rasusa

Randomly subsample sequencing reads or alignments
https://doi.org/10.21105/joss.03941
MIT License

Iterate over input file and output required reads #4

Closed mbhall88 closed 5 years ago

mbhall88 commented 5 years ago

Iterate over the input file a second time and write each read to the output file if it is in our subsampled list.

The subsampled list should be stored as something like a set or bitvector so that each lookup takes constant time. We do this check for every read in the input file, and there could be tens of millions of them.
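A rough sketch of what that second pass could look like, using only the standard library. This is illustrative and not necessarily what the linked commit does: the function name `write_selected_reads`, the use of zero-based read indices as set keys, and the assumption of strict four-line FASTQ records are all hypothetical choices made for the example.

```rust
use std::collections::HashSet;
use std::io::{self, BufRead, Write};

/// Copies every FASTQ record whose zero-based index is in `keep` from
/// `input` to `output`, and returns the number of records written.
///
/// Assumption: each record is exactly four lines (no wrapped sequences).
fn write_selected_reads<R: BufRead, W: Write>(
    input: R,
    output: &mut W,
    keep: &HashSet<usize>,
) -> io::Result<usize> {
    let mut lines = input.lines();
    let mut read_idx = 0;
    let mut written = 0;

    while let Some(header) = lines.next() {
        // Gather the header plus the remaining three lines of the record.
        let mut record = vec![header?];
        for _ in 0..3 {
            match lines.next() {
                Some(line) => record.push(line?),
                None => {
                    return Err(io::Error::new(
                        io::ErrorKind::UnexpectedEof,
                        "truncated FASTQ record",
                    ))
                }
            }
        }

        // Average constant-time membership check, done once per read.
        if keep.contains(&read_idx) {
            for line in &record {
                writeln!(output, "{}", line)?;
            }
            written += 1;
        }
        read_idx += 1;
    }

    Ok(written)
}
```

A `HashSet<usize>` gives average constant-time lookups. If the total number of reads is known from the first pass, a `Vec<bool>` indexed by read position would act as a simple bitvector and avoid hashing altogether.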

mbhall88 commented 5 years ago

Closed in https://github.com/mbhall88/rasusa/commit/307827e3178002bab8955fe65cecf746b605e91d