Closed by dkgaraujo 1 week ago
I looked into this topic further. The most insightful source was scikit-learn's treatment of the topic. In essence, setting a random seed has non-trivial implications when cross-validation is used, or in estimators such as random forests that call a random number generator for every "sub-estimator".
So in order to keep things simple, given that exact reproducibility in the documentation is not a dealbreaker, I am closing this issue.
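For anyone revisiting this later, here is a toy sketch of the subtlety. It uses only Python's standard library, not scikit-learn itself, and `fit_toy_forest` is a hypothetical stand-in for an ensemble in which each "sub-estimator" draws from the generator. The assumption it illustrates: an integer seed rebuilds the same generator on every call, while a shared generator instance keeps advancing, so repeated calls (for example, one per cross-validation fold) give different results.

```python
import random


def fit_toy_forest(n_trees, random_state=None):
    """Hypothetical stand-in for an ensemble: each 'tree' draws one number."""
    if isinstance(random_state, random.Random):
        # A generator instance is consumed in place, so each call moves on.
        rng = random_state
    else:
        # An int seed builds a fresh generator, so each call repeats itself.
        rng = random.Random(random_state)
    return [rng.random() for _ in range(n_trees)]


# Integer seed: two calls give identical "forests".
same_a = fit_toy_forest(3, random_state=0)
same_b = fit_toy_forest(3, random_state=0)

# Shared generator: the second call continues where the first stopped.
shared = random.Random(0)
diff_a = fit_toy_forest(3, random_state=shared)
diff_b = fit_toy_forest(3, random_state=shared)
```

Which behaviour is wanted depends on the use case, which is part of why a blanket seed is less simple than it looks.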
As raised by @stephprobst, all stochastic output should ideally be made deterministic to avoid cluttering git diffs. This can be achieved by explicitly passing a random seed to each function that has a stochastic outcome.
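A minimal sketch of what this could look like, using only Python's standard library. The function name `simulate_returns` and the default seed are hypothetical and not taken from this project:

```python
import random


def simulate_returns(n, seed=42):
    """Hypothetical stochastic function whose output is pinned by `seed`."""
    # A local generator keeps the global random state untouched.
    rng = random.Random(seed)
    return [round(rng.gauss(0.0, 1.0), 4) for _ in range(n)]


first = simulate_returns(5)
second = simulate_returns(5)
print(first == second)  # prints True
```

With the seed fixed, re-rendering the documentation yields the same numbers, so the diff stays clean unless the code itself changes.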