bcgsc / biobloom

Create Bloom filters for a given reference and then use them to categorize sequences
http://www.bcgsc.ca/platform/bioinfo/software/biobloomtools
GNU General Public License v3.0

Reproducible outputs #85

Closed: vdechand closed this issue 1 year ago

vdechand commented 1 year ago

Hello,

Is there a way to make the output fastq files reproducible, other than running BBT with only 1 thread?

lcoombe commented 1 year ago

Hi @vdechand,

By "reproducible", do you mean a consistent ordering of the reads within the output fastq files? The categorized reads in the output files should be the same between runs, but yes, the order of those reads may vary when you use multiple threads.

vdechand commented 1 year ago

Hi Lauren,

Thanks! It turns out I had messed up the sorting of the categorized fastq files. After sorting them with NGSUtils/fastqutils sort, I get exactly the same output files, as required.
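
For anyone who wants to check this without an external tool, here is a minimal Python sketch of the same idea: sort the records of each output file, then compare a checksum. It is not part of BBT or NGSUtils. The function names `read_fastq` and `sorted_digest` are made up for illustration, and it assumes plain 4-line fastq records (no wrapped sequence lines).

```python
import hashlib


def read_fastq(lines):
    """Return a list of (header, sequence, plus, quality) tuples.

    Assumes strict 4-line FASTQ records; wrapped sequences are not handled.
    """
    lines = [ln.rstrip("\r\n") for ln in lines]
    # Ignore empty lines at the very end of the input.
    while lines and lines[-1] == "":
        lines.pop()
    if len(lines) % 4 != 0:
        raise ValueError("expected 4 lines per FASTQ record")
    return [tuple(lines[i:i + 4]) for i in range(0, len(lines), 4)]


def sorted_digest(lines):
    """SHA-256 of the records after sorting, so read order does not matter."""
    digest = hashlib.sha256()
    for record in sorted(read_fastq(lines)):
        digest.update(("\n".join(record) + "\n").encode("utf-8"))
    return digest.hexdigest()


if __name__ == "__main__":
    # Two in-memory "runs" holding the same reads in a different order.
    run_a = ["@r1", "ACGT", "+", "IIII", "@r2", "TTTT", "+", "JJJJ"]
    run_b = ["@r2", "TTTT", "+", "JJJJ", "@r1", "ACGT", "+", "IIII"]
    print(sorted_digest(run_a) == sorted_digest(run_b))
```

To use it on real files, pass an open file handle for each run's output to `sorted_digest` and compare the two results. Equal digests mean the runs categorized the same reads, regardless of the order the threads wrote them in.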

lcoombe commented 1 year ago

Glad to hear it, thanks for the update!