ablaom closed this issue 2 years ago
@ablaom. Yes, you're right, that's a bug. But a deep copy won't fix it, as each thread would then have the same random pattern. Although if `GLOBAL_RNG` is used, it would become thread-safe.
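The point about deep copies can be seen directly in plain Julia (a minimal illustration, not MLJTuning code): copies of an RNG share the same internal state, so they all replay the same stream.

```julia
using Random

# A deep copy duplicates the RNG's internal state, so every copy
# replays exactly the same "random" pattern:
rng = MersenneTwister(4)
copy1 = deepcopy(rng)
copy2 = deepcopy(rng)
@assert rand(copy1, 3) == rand(copy2, 3)  # identical streams

# Independently seeded RNGs (one per thread, say) give distinct streams:
a = MersenneTwister(1)
b = MersenneTwister(2)
@assert rand(a, 3) != rand(b, 3)
```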
Yes, good observation. However, this behaviour is desirable here. For example, in a grid search with `Holdout(rng=4)`, I want to use the same holdout set for each hyperparameter value I am evaluating; otherwise I'm comparing apples with oranges. This fix won't completely arrange this (unless the number of threads equals the total number of model evaluations) and won't fix a similar issue when `acceleration=CPU()`, but that's essentially a bug in `Resampler` from MLJBase. I'm still thinking about the best option for fixing this and have not posted an issue there yet.
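To illustrate why a fixed `rng` is desirable in a grid search (a sketch, not the MLJ implementation; `holdout_indices` is a hypothetical helper): a seeded split is reproducible, so every hyperparameter value is evaluated on the identical train/test partition.

```julia
using Random

# Hypothetical holdout split driven by a seed. With a fixed seed,
# every evaluation sees the same partition, so models are compared
# on identical data.
function holdout_indices(n, fraction_train, seed)
    perm = randperm(MersenneTwister(seed), n)
    k = floor(Int, fraction_train * n)
    return perm[1:k], perm[k+1:end]
end

train1, test1 = holdout_indices(10, 0.7, 4)
train2, test2 = holdout_indices(10, 0.7, 4)
@assert (train1, test1) == (train2, test2)  # same split every time
```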
Update: The `Resampler` bug is fixed: https://github.com/alan-turing-institute/MLJBase.jl/pull/559
Strictly speaking, as currently implemented, calling `fit!` on a resampling machine `mach` mutates the value of `mach.model.resampling` if `resampling` has an RNG as a field (all of them do, I believe). It may make sense to modify this behaviour by changing the interface point for RNGs in resampling (in MLJBase). However, in the meantime, I suggest we insert a deep copy on the right-hand side of https://github.com/alan-turing-institute/MLJTuning.jl/blob/db5ab433c0570641eade702db0dca55379643dbe/src/tuned_models.jl#L447

cc @OkonSamuel
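The effect of the suggested deep copy can be sketched as follows (struct and field names are hypothetical stand-ins, not MLJ source): drawing from the RNG of a deep copy advances only the copy's state, leaving the user's original resampling object untouched.

```julia
using Random

# Stand-in for a resampling strategy such as Holdout, which carries
# an RNG as a field:
mutable struct FakeHoldout
    rng::AbstractRNG
end

user_resampling = FakeHoldout(MersenneTwister(4))
state_before = copy(user_resampling.rng)   # snapshot of the RNG state

# The suggested fix: deep-copy on the right-hand side, then work
# only with the copy:
working = deepcopy(user_resampling)
rand(working.rng, 100)                     # advances only the copy's state

@assert user_resampling.rng == state_before  # user's object not mutated
```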