Closed danejo3 closed 1 year ago
@standage Do you know a way we can set random.seed()
to ensure that the value grabbed by randint
in the snakemake rule can always be the same value (ie. 12345)?
I've tried to add random.seed(0)
at the beginning of the pytest function but it fails to set the random seed in the snakemake rule. I've also tried to mock/patch randint
but I was unable to mock/patch the value. I've tried to google for solutions but wasn't finding anything helpful.
The only solution that I can think of is passing the seed in the config dictionary in cli.py
. By default, the value for the seed is random.seed()
and when testing, we can override this value with random.seed(0)
.
When running python code in a snakemake rule, are they in their own untouchable little world?
To capture a "paper trail" of an in-person conversation:
The test in question, as written, is pretty strict and its primary value is as a consistency check. The precise number of contigs and length of the largest contig is easy to measure, but these could easily change based on minor differences in the input (such as selecting a slightly different subset of reads) without impacting how the results are interpreted.
So I'd suggest either updating the test in question or, perhaps more preferably, adding a new test / check that asks different questions to verify the result: questions that are more robust to minor changes in the input.
Also, it certainly would make sense to allow the user to override the randomly chosen seed.
After the recent PR (#24), the
test_custom_downsample_input()
has failed because of the recent changes made in the snakemakedownsample
rule. In the rule, we addedrandint
to randomize the downsample process when runningseqtk
. Because of the random seed will be a different number every time the rule is ran, thenum_contigs
andlargest_contig
can change in thequast
report.