bioforensics / yeat

YEAT: Your Everyday Assembly Tool
Other
1 stars 0 forks source link

test_custom_downsample_input() test failing #27

Closed danejo3 closed 1 year ago

danejo3 commented 1 year ago

After the recent PR (#24), the test_custom_downsample_input() has failed because of the recent changes made in the snakemake downsample rule. In the rule, we added randint to randomize the downsample process when running seqtk. Because of the random seed will be a different number every time the rule is ran, the num_contigs and largest_contig can change in the quast report.

danejo3 commented 1 year ago

@standage Do you know a way we can set random.seed() to ensure that the value grabbed by randint in the snakemake rule can always be the same value (ie. 12345)?

I've tried to add random.seed(0) at the beginning of the pytest function but it fails to set the random seed in the snakemake rule. I've also tried to mock/patch randint but I was unable to mock/patch the value. I've tried to google for solutions but wasn't finding anything helpful.

The only solution that I can think of is passing the seed in the config dictionary in cli.py. By default, the value for the seed is random.seed() and when testing, we can override this value with random.seed(0).

When running python code in a snakemake rule, are they in their own untouchable little world?

standage commented 1 year ago

To capture a "paper trail" of an in-person conversation:

The test in question, as written, is pretty strict and its primary value is as a consistency check. The precise number of contigs and length of the largest contig is easy to measure, but these could easily change based on minor differences in the input (such as selecting a slightly different subset of reads) without impacting how the results are interpreted.

So I'd suggest either updating the test in question or, perhaps more preferably, adding a new test / check that asks different questions to verify the result: questions that are more robust to minor changes in the input.

standage commented 1 year ago

Also, it certainly would make sense to allow the user to override the randomly chosen seed.