JuliaStats / MLBase.jl

A set of functions to support the development of machine learning algorithms
MIT License

Control randomness of Kfold #56

Closed by ngiann 1 year ago

ngiann commented 1 year ago

Among other reasons, it would be useful for the sake of reproducibility if we could control the randomness in Kfold's behaviour. For instance:

Kfold(10,2) # call first time
Kfold([10, 2, 7, 5, 8, 1, 9, 6, 4, 3], 2, 5.0)

Kfold(10,2) # when called again
Kfold([2, 6, 10, 8, 5, 7, 3, 1, 4, 9], 2, 5.0)

Perhaps something along the lines of Kfold(10,2; seed = 123) would be nice?

Or perhaps there are reasons that speak against controlling the randomness?
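Until such an option exists, one possible workaround is to seed the global RNG right before each call. The sketch below assumes that Kfold draws its permutation from the global RNG in the same way `randperm` does, which the output above suggests but which is not confirmed here. It uses only the Random standard library, with `randperm(10)` standing in for the shuffle inside Kfold.

```julia
using Random

# Assumption: Kfold(n, k) shuffles 1:n using the global RNG, like randperm(n).
Random.seed!(123)
first_perm = randperm(10)

# Re-seeding with the same value resets the global RNG state.
Random.seed!(123)
second_perm = randperm(10)

# Same seed, same permutation.
println(first_perm == second_perm)
```

The drawback is that seeding the global RNG affects every later random call in the session, which is one reason to prefer an explicit RNG argument.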

Many thanks for this wonderful package.

AsafManela commented 1 year ago

Yep. The right way to do it is to add an optional rng::AbstractRNG argument to Kfold. Most of Julia has moved away from the global RNG to this way of doing things. PRs welcome!
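A minimal sketch of what that suggestion might look like, not the package's actual code: `RngKfold` is a hypothetical stand-in for Kfold, and its field names (`permseq`, `k`, `coeff`) are assumptions inferred from the printed output earlier in the thread. The RNG is taken as an optional first positional argument, mirroring `rand`, `randperm`, and `shuffle` in the Random standard library, whose abstract RNG type is `AbstractRNG`.

```julia
using Random

# Hypothetical stand-in for MLBase's Kfold; field names are assumptions.
struct RngKfold
    permseq::Vector{Int}  # shuffled indices 1:n
    k::Int                # number of folds
    coeff::Float64        # n / k, the average fold size
end

# Draw the permutation from the caller-supplied RNG instead of the global one.
function RngKfold(rng::AbstractRNG, n::Int, k::Int)
    2 <= k <= n || throw(ArgumentError("k must satisfy 2 <= k <= n"))
    return RngKfold(randperm(rng, n), k, n / k)
end

# Fall back to the default RNG when none is given, keeping the old call form.
RngKfold(n::Int, k::Int) = RngKfold(Random.default_rng(), n, k)

# Two generators seeded identically yield identical folds.
a = RngKfold(MersenneTwister(123), 10, 2)
b = RngKfold(MersenneTwister(123), 10, 2)
println(a.permseq == b.permseq)
```

Passing the generator explicitly keeps the reproducibility local to the call: other code that uses the global RNG is unaffected, and tests can construct their own seeded generator.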