[nf-tests] Assure reproducibility

nf-core / deepmodeloptim

Stochastic Testing and Input Manipulation for Unbiased Learning Systems

https://nf-co.re/deepmodeloptim

MIT License


[nf-tests] Assure reproducibility #55

Open suzannejin opened 7 months ago

suzannejin commented 7 months ago

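Random sampling

There are many random sampling methods, including random.sample and other lower-level sampling inside libraries. Setting random.seed(0) at the very beginning of a script won't work on its own, because it only seeds Python's global random generator and not the generators that libraries keep internally.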
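One possible way around the global-seed limitation is to create a dedicated generator and pass it explicitly to every function that samples. The sketch below uses only the standard library; `sample_rows` and `run` are hypothetical names for illustration and are not part of the pipeline.

```python
import random


def sample_rows(rows, k, rng):
    """Draw k rows using the generator passed in, not the global one."""
    return rng.sample(rows, k)


def run(seed):
    """Hypothetical pipeline step that owns its own seeded generator."""
    rng = random.Random(seed)  # private generator, independent of random.seed()
    rows = list(range(100))
    return sample_rows(rows, 5, rng)


first = run(0)

# Reseeding the global generator in between does not affect the result,
# because run() never touches the global state.
random.seed(12345)
second = run(0)

print(first == second)  # True
```

The same idea carries over to libraries that expose their own generator objects, although each library has to be seeded through its own API.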

Set operations

Sets are unordered, so anything handled with sets will not follow a fixed order, and this is not controllable. However, set operations are very efficient.

Alternatives?
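For the set ordering problem, one possible alternative is to keep the fast set operation and sort the result before it is used for anything order-sensitive. A minimal sketch, with `shared_ids` as a hypothetical helper:

```python
def shared_ids(a, b):
    """Intersect with fast set operations, then sort for a stable order."""
    return sorted(set(a) & set(b))


print(shared_ids(["s3", "s1", "s2"], ["s2", "s3", "s4"]))  # ['s2', 's3']
```

Sorting adds an O(n log n) step, so this trades a little speed for an order that is the same on every run.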

mathysgrapotte commented 6 months ago

This will be checked in #73. A solution similar to the one for shuffling (#70) can be used: run the pipeline once, save the results, and check that further pipeline runs produce the same results. I believe this can also be done with nf-tests.
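The "save the results of a first run and compare later runs" idea could be sketched outside nf-tests as a checksum over an output directory. This is only an illustration under assumptions: `digest_outputs` is a hypothetical helper, and the file name and contents are made up for the demo.

```python
import hashlib
import tempfile
from pathlib import Path


def digest_outputs(directory):
    """Hash every file under directory, visiting files in sorted order."""
    h = hashlib.sha256()
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file():
            # Include the relative path so renamed files change the digest.
            h.update(path.relative_to(directory).as_posix().encode())
            h.update(path.read_bytes())
    return h.hexdigest()


# Demo with two stand-in "runs" (made-up output file).
with tempfile.TemporaryDirectory() as run1, tempfile.TemporaryDirectory() as run2:
    for d in (run1, run2):
        Path(d, "metrics.txt").write_text("loss=0.125\n")
    same = digest_outputs(run1) == digest_outputs(run2)

    Path(run2, "metrics.txt").write_text("loss=0.126\n")
    different = digest_outputs(run1) != digest_outputs(run2)

print(same, different)  # True True
```

The digest of the first run would be stored and compared against each later run; any byte-level difference makes the check fail.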

alessiovignoli commented 5 months ago

linked to #40

alessiovignoli commented 3 months ago

PR #166 lays the groundwork for testing reproducibility through the debug mode. The point is that this issue is much bigger than just checking whether outputs are identical, because how close to reproducible you are likely depends on the amount of data, the size of the model, how long learning takes to converge, and the complexity of the problem.
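If exact equality turns out to be too strict, one option is to compare run metrics within a tolerance instead. The sketch below is an assumption about how that might look: `metrics_close`, the metric names, and the tolerance values are all hypothetical and would need tuning per model and dataset.

```python
import math


def metrics_close(a, b, rel_tol=1e-3, abs_tol=1e-6):
    """Return True if both runs report the same metrics within tolerance."""
    if a.keys() != b.keys():
        return False
    return all(
        math.isclose(a[k], b[k], rel_tol=rel_tol, abs_tol=abs_tol) for k in a
    )


run_a = {"loss": 0.12500, "accuracy": 0.9100}
run_b = {"loss": 0.12505, "accuracy": 0.9101}

print(metrics_close(run_a, run_b))  # True
print(metrics_close(run_a, {"loss": 0.2, "accuracy": 0.91}))  # False
```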