Right now, the standard random number generators of NumPy (`rng = np.random.default_rng(seed=)`) do not work when passed to `ProbabilisticMetricSpace` or `RasterMetricSpace`. Only the legacy ones do (the equivalent of `np.random.seed()`, now defined as `np.random.RandomState`), but they are probably not that useful in our case, since we don't need to exactly reproduce random sampling from old scripts. On top of that, the legacy versions use far more memory than needed when making a random choice without replacement, which is exactly what we do: https://github.com/numpy/numpy/issues/14169.
So for instance, if we only want to use 10,000 samples out of 1 billion for the variogram estimation, the legacy version will still create an array of 1 billion points in the background, using tons of RAM :sweat_smile:.
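For reference, here is a minimal sketch of the kind of subsampling we would want once `Generator` objects are supported. `subsample_indices` is a hypothetical helper, not something that exists in the package; it only illustrates drawing distinct indices with `np.random.default_rng`. As far as I understand, `Generator.choice` with `replace=False` should not need to permute the full population when the sample is small relative to it, but that is worth checking on a large case:

```python
import numpy as np


def subsample_indices(n_points, n_samples, seed=None):
    """Draw n_samples distinct indices in [0, n_points).

    Hypothetical helper for illustration only (not part of the package).
    """
    # default_rng returns a np.random.Generator, i.e. the non-legacy API
    rng = np.random.default_rng(seed)
    # Passing an int samples from np.arange(n_points) without building
    # the sample by hand; replace=False guarantees distinct indices
    return rng.choice(n_points, size=n_samples, replace=False)


# Small example: 10,000 indices out of 1,000,000 points
idx = subsample_indices(n_points=1_000_000, n_samples=10_000, seed=42)
```

The returned indices could then be used to select the coordinates and values passed to the variogram estimation.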
Will try to fix this at the same time as #178!