grf-labs / grf

Generalized Random Forests
https://grf-labs.github.io/grf/
GNU General Public License v3.0

Mention reproducibility in seed argument docstring #1368

Open erikcs opened 9 months ago

erikcs commented 9 months ago

For a grf forest to produce the same results on different machines, it needs the same num.threads argument in addition to the same seed argument, as pointed out in the online reference:

Therefore, in order to ensure consistent results, we provide the following recommendations.

- Make sure arguments seed and num.threads are the same across platforms
- Round data to 8 significant digits
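As a rough illustration of the two recommendations, here is a minimal sketch in R. The simulated data, the choice of regression_forest, and the values seed = 42 and num.threads = 4 are assumptions made for the example, not taken from the reference. It only checks that two fits agree on one machine; it does not by itself demonstrate cross-platform agreement.

```r
library(grf)

# Simulated data; sizes and data-generating process are illustrative only.
set.seed(1)
n <- 500
p <- 5
X <- matrix(rnorm(n * p), n, p)
Y <- X[, 1] + rnorm(n)

# Round inputs to 8 significant digits, per the second recommendation.
X <- signif(X, 8)
Y <- signif(Y, 8)

# Set both seed and num.threads explicitly; 42 and 4 are arbitrary choices.
forest_a <- regression_forest(X, Y, seed = 42, num.threads = 4)
forest_b <- regression_forest(X, Y, seed = 42, num.threads = 4)

# Compare the out-of-bag predictions of the two fits.
same_predictions <- identical(
  predict(forest_a)$predictions,
  predict(forest_b)$predictions
)
print(same_predictions)
```

Leaving num.threads unset lets grf pick a machine-dependent default, which is why the sketch passes it explicitly alongside seed.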

Since this sometimes comes up (#1262), it would be good to also state this in the R documentation for the seed argument.

(Making results independent of num.threads would be ideal and straightforward to do, see #1263. Unfortunately, it would be a breaking change that alters results for people who expect the old behavior, so for now it's reasonable to keep it this way.)