kstaats / karoo_gp

A Genetic Programming platform for Python with TensorFlow for wicked-fast CPU and GPU support.

Fix seed initialization. #56

Closed ezio-melotti closed 2 years ago

ezio-melotti commented 2 years ago

This PR fixes seed initialization, which is a prerequisite for test reproducibility.

Even after switching to the new RNG (see #50), some tests still produced inconsistent results. The failures seemed to affect only the classification kernel. After some experimentation I was able to fix the issue by seeding the global np.random state with np.random.seed. Apparently sklearn relies on that global state, and once it was set to the given seed the tests produced consistent results (see also this SO thread).
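To illustrate the effect described above (this is a minimal sketch, not the actual change in this PR): code that draws from NumPy's global random state becomes repeatable once np.random.seed is called first. The `shuffled_indices` helper and the `SEED` value are hypothetical, standing in for library code such as sklearn when no explicit random state is passed.

```python
import numpy as np


def shuffled_indices(n):
    """Hypothetical stand-in for library code that draws from the global np.random state."""
    return np.random.permutation(n)


SEED = 1000  # assumed value for illustration; the real seed comes from the caller

# Seed the global state, then draw.
np.random.seed(SEED)
first = shuffled_indices(10)

# Re-seeding with the same value replays the same sequence.
np.random.seed(SEED)
second = shuffled_indices(10)

same = np.array_equal(first, second)
```

Without the second np.random.seed call, the second draw would continue from wherever the global state was left, which is the kind of run-to-run drift the tests were showing.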

Grant also suggested passing the RNG to sklearn.model_selection.train_test_split. This is probably a better solution, but I still need to run more tests before creating a new PR.
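A rough sketch of what that alternative could look like, assuming the seed is passed through the `random_state` parameter of train_test_split (which accepts an int or a np.random.RandomState instance). The data, the `split` helper, and the `SEED` value are made up for illustration and are not taken from karoo_gp.

```python
import numpy as np
from sklearn.model_selection import train_test_split

SEED = 1000  # assumed value for illustration

# Toy data: 10 samples with 2 features each.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)


def split(seed):
    """Hypothetical helper: split with an explicit seed instead of the global state."""
    return train_test_split(X, y, test_size=0.2, random_state=seed)


a = split(SEED)

# Disturb the global np.random state; an explicit random_state is unaffected by it.
np.random.seed(0)
np.random.rand(5)

b = split(SEED)

identical = all(np.array_equal(p, q) for p, q in zip(a, b))
```

The appeal of this approach is that the split no longer depends on whatever else has touched the global np.random state before it runs.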

Note that this PR also sets the TensorFlow seed. It doesn't appear to be used, but it's good to have all the seeds set to the same value for consistency.