CGATOxford / UMI-tools

Tools for handling Unique Molecular Identifiers in NGS data sets
MIT License

Using group/dedup --random-seed and PYTHONHASHSEED=0 #479

Closed. alexander-e-f-smith closed this issue 3 years ago

alexander-e-f-smith commented 3 years ago

Hi. To keep outputs consistent for umi_tools group or dedup, I'm setting both PYTHONHASHSEED=0 and the umi_tools --random-seed option (arbitrarily at 100). To your knowledge, would there be any drawbacks to doing this in regular practice (production NGS pipelines), rather than just in pipeline testing/validation?
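For readers following along, the setup described above could be sketched roughly as below. This is only an illustration, not the poster's actual pipeline code: the helper names (`reproducible_env`, `dedup_command`, `string_hash_in_child`) and the BAM file names are hypothetical, and the `-I`/`-S` options are assumed to be how input and output are passed. The small child-process check just shows that fixing PYTHONHASHSEED makes Python's string hashing repeatable across runs.

```python
import os
import subprocess
import sys


def reproducible_env(hash_seed=0):
    """Copy the current environment and pin PYTHONHASHSEED (hypothetical helper)."""
    env = dict(os.environ)
    env["PYTHONHASHSEED"] = str(hash_seed)
    return env


def dedup_command(in_bam, out_bam, random_seed=100):
    """Build a umi_tools dedup command line with a fixed --random-seed.

    The file names are placeholders; -I/-S are assumed to be the
    input/output options for this illustration.
    """
    return [
        "umi_tools", "dedup",
        "-I", in_bam,
        "-S", out_bam,
        "--random-seed", str(random_seed),
    ]


def string_hash_in_child(env):
    """Return hash() of a fixed string as computed by a fresh Python process."""
    result = subprocess.run(
        [sys.executable, "-c", "print(hash('ACGTACGT'))"],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    env = reproducible_env()
    # With the hash seed pinned, two separate interpreters agree on the hash.
    print(string_hash_in_child(env) == string_hash_in_child(env))
    # Show the command that would be run; uncomment the last line to run it.
    print(" ".join(dedup_command("example.bam", "deduped.bam")))
    # subprocess.run(dedup_command("example.bam", "deduped.bam"), env=env, check=True)
```

Passing the environment explicitly to the child process, rather than exporting the variable globally, keeps the pinned hash seed scoped to the one command that needs it.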

IanSudbery commented 3 years ago

I don't personally see any problem.