Open wyli opened 11 months ago
Hi, I was about to open an issue about the exact same thing. This should provide a healthy speedup, which is especially important when creating large random values for volumes (e.g. RandGaussianNoise). The numpy docs state:
The normal, exponential and gamma generators use 256-step Ziggurat methods which are 2-10 times faster than NumPy’s default implementation in standard_normal, standard_exponential or standard_gamma.
import numpy.random
rng = numpy.random.default_rng()
%timeit -n 1 rng.standard_normal(100000)
%timeit -n 1 numpy.random.standard_normal(100000)
1.14 ms +- 24.1 us per loop (mean +- std. dev. of 7 runs, 1 loop each)
2.17 ms +- 37.3 us per loop (mean +- std. dev. of 7 runs, 1 loop each)
%timeit -n 1 rng.standard_exponential(100000)
%timeit -n 1 numpy.random.standard_exponential(100000)
894 us +- 23 us per loop (mean +- std. dev. of 7 runs, 1 loop each)
1.55 ms +- 31.9 us per loop (mean +- std. dev. of 7 runs, 1 loop each)
%timeit -n 1 rng.standard_gamma(3.0, 100000)
%timeit -n 1 numpy.random.standard_gamma(3.0, 100000)
2.39 ms +- 94 us per loop (mean +- std. dev. of 7 runs, 1 loop each)
4.55 ms +- 185 us per loop (mean +- std. dev. of 7 runs, 1 loop each)
In general the new Generators are much faster, as can be seen in this table in the numpy docs, showing the relative speedup over the RandomState(), which internally uses a MersenneTwister (i.e. MT19937) :
| | MT19937 | PCG64 | PCG64DXSM | Philox | SFC64 |
| -- | -- | -- | -- | -- | -- |
| 32-bit Unsigned Ints | 96 | 162 | 160 | 96 | 175 |
| 64-bit Unsigned Ints | 97 | 171 | 188 | 113 | 218 |
| Uniforms | 102 | 192 | 206 | 121 | 233 |
| Normals | 409 | 526 | 541 | 471 | 684 |
| Exponentials | 701 | 1071 | 1101 | 784 | 1179 |
| Gammas | 207 | 250 | 266 | 227 | 281 |
| Binomials | 100 | 123 | 122 | 111 | 138 |
| Laplaces | 113 | 114 | 108 | 113 | 114 |
| Poissons | 103 | 111 | 115 | 105 | 127 |
| Overall | 159 | 219 | 225 | 174 | 251 |

Note the large increase for the normal distribution.
I am not sure what the policy is on reproducibility of randomness across versions, but in that regard it would definitely prevent reproducibility of the randomness. But so does any change in the numpy version for example, their docs state that:
If you create a Generator with the same BitGenerator, with the same seed, perform the same sequence of method calls with the same arguments, on the same build of numpy, in the same environment, on the same machine, you should get the same stream of numbers
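As a quick illustration of that guarantee (a minimal standalone check, not MONAI code): two generators built from the same BitGenerator and seed produce an identical stream on the same numpy build.

```python
import numpy as np

# Same BitGenerator, same seed, same sequence of calls -> identical stream
# (on the same build of numpy, per the docs quoted above).
a = np.random.Generator(np.random.PCG64(12345)).standard_normal(5)
b = np.random.Generator(np.random.PCG64(12345)).standard_normal(5)
print(np.array_equal(a, b))  # -> True
```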
I would be happy to submit a PR for this if that approach seems reasonable.
thanks for the comments, I haven't looked into the details of the backward compatibility of the random seeds. For example, for this integration test, will the new generator produce the same deterministic results as RandomState? Any thoughts? https://github.com/Project-MONAI/MONAI/blob/9ae72e4a3218a8619d2093ea2d06a45585c8a2f5/tests/test_integration_determinism.py#L86-L91
I am not sure, but if this value 0.536134 depends on a specific output of np.random.RandomState, then most likely it will not be the same. Multiple executions of the new random generator should, however, produce the same results.
ok, the proposed plan looks good, randomstate is not actively maintained any more, the generator is the way to go. would be great to have your PR to update the code logic, I can trigger tests and verify the relevant integration results if needed https://github.com/Project-MONAI/MONAI/blob/dev/tests/testing_data/integration_answers.py cc @Nic-Ma @ericspod
Ok great, I'll get to work on that! On a similar note, but more general: Has it ever been considered to support the pytorch generator as well? Especially for large random tensors (e.g. Gaussian noise on large tensors) this might make quite a difference. For reference, I created this benchmark comparing the current RandomState, the "new" numpy generator and the torch random on cpu and gpu (note that GPU times are in microseconds):
[------------ np_randomstate -----------]
                     |  standard_normal
1 threads: ------------------------------
  (250, 250, 250)   |  266.2
  (100, 100, 100)   |  16.3

Times are in milliseconds (ms).

[------------- np_PCG64DXSM ------------]
                     |  standard_normal
1 threads: ------------------------------
  (250, 250, 250)   |  167.7
  (100, 100, 100)   |  10.3

Times are in milliseconds (ms).

[-------------- torch_cpu --------------]
                     |  standard_normal
1 threads: ------------------------------
  (250, 250, 250)   |  73.4
  (100, 100, 100)   |  4.4

Times are in milliseconds (ms).

[-------------- torch_gpu --------------]
                     |  standard_normal
1 threads: ------------------------------
  (250, 250, 250)   |  276.6
  (100, 100, 100)   |  13.8

Times are in microseconds (us).
This was created using:
import torch.utils.benchmark as benchmark
import torch
import numpy as np
from functools import partial

shapes = [
    (250, 250, 250),
    (100, 100, 100),
]

np_randomstate = np.random.RandomState(0)
np_PCG64DXSM = np.random.Generator(np.random.PCG64DXSM(seed=0))
torch_gen = torch.Generator(device='cpu').manual_seed(0)
torch_gpu_gen = torch.Generator(device='cuda').manual_seed(0)

gens = {
    'np_randomstate': np_randomstate.standard_normal,
    'np_PCG64DXSM': np_PCG64DXSM.standard_normal,
    'torch_cpu': partial(torch.randn, generator=torch_gen),
    'torch_gpu': partial(torch.randn, generator=torch_gpu_gen, device='cuda'),
}

results = []
for shape in shapes:
    for name, gen in gens.items():
        t = benchmark.Timer(
            stmt='gen(shape)',
            globals={'gen': gen, 'shape': shape},
            label=name,
            sub_label=str(shape),
            description='standard_normal',
            num_threads=1,
        )
        results.append(t.blocked_autorange(min_run_time=20))

compare = benchmark.Compare(results)
compare.print()
My point being that if we wanted to support these as well (not by default, of course, especially since I'm not sure how this would behave with multiprocessing etc.), it might make sense to create a wrapper class around these that exposes a common API. This could then also be used to implement the switch from RandomState to the new Generator. The "device" parameter of the torch implementation could then be set as an instance variable (which of course requires adjusting it manually where needed). This is just an idea, and I wanted to put it out there so that, if something like this is desired, we don't need to do the same work twice when changing the self.R. implementations.
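To make the idea concrete, here is a minimal sketch of what such a wrapper might look like. All names (`RandomBackend`, the `backend` argument) are hypothetical, not existing MONAI or numpy API; it is only meant to show that a common facade over `np.random.Generator` and `torch.Generator` is feasible.

```python
import numpy as np


class RandomBackend:
    """Hypothetical sketch of a common facade over numpy's Generator
    and a torch.Generator; names and API are illustrative only."""

    def __init__(self, backend="numpy", seed=0, device="cpu"):
        self.backend = backend
        self.device = device
        if backend == "numpy":
            self._rng = np.random.default_rng(seed)
        elif backend == "torch":
            import torch  # imported lazily so numpy-only use works without torch
            self._rng = torch.Generator(device=device).manual_seed(seed)
        else:
            raise ValueError(f"unknown backend: {backend}")

    def standard_normal(self, shape):
        # Dispatch to the underlying generator; torch needs the device passed.
        if self.backend == "numpy":
            return self._rng.standard_normal(shape)
        import torch
        return torch.randn(shape, generator=self._rng, device=self.device)
```

A transform could then hold a `RandomBackend` instance instead of `self.R`, and switching between numpy and torch randomness would be a constructor argument rather than a code change.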
And on another note: I noticed that there are a few usages of np.random.* in the code still. These raise some questions:
./monai/transforms/signal/array.py:276:np.random.choice
./monai/transforms/signal/array.py:357:np.random.choice
./monai/transforms/spatial/array.py:1214:np.random.randint
./monai/transforms/spatial/dictionary.py:2347:np.random.randint
./monai/transforms/spatial/dictionary.py:698:np.random.randint
./monai/transforms/utils.py:1220:np.random.random
./monai/transforms/utils.py:246:np.random.randint
./monai/transforms/utils.py:510:np.random.randint
./monai/transforms/utils.py:552:np.random.random
./monai/transforms/utils.py:607:np.random.random
./monai/apps/deepedit/interaction.py:65:np.random.choice
./monai/apps/deepedit/transforms.py:159:np.random.choice
./monai/apps/deepedit/transforms.py:60:np.random.choice
./monai/apps/detection/transforms/dictionary.py:1305:np.random.randint
./monai/apps/nuclick/transforms.py:370:np.random.choice
./monai/apps/nuclick/transforms.py:377:np.random.choice
./monai/auto3dseg/analyzer.py:190:np.random.rand
./monai/auto3dseg/analyzer.py:191:np.random.rand
./monai/auto3dseg/analyzer.py:292:np.random.rand
./monai/auto3dseg/analyzer.py:374:np.random.rand
./monai/auto3dseg/analyzer.py:862:np.random.rand
./monai/auto3dseg/operations.py:119:np.random.rand
./monai/auto3dseg/operations.py:120:np.random.rand
./monai/auto3dseg/operations.py:121:np.random.rand
./monai/auto3dseg/operations.py:122:np.random.rand
./monai/auto3dseg/operations.py:62:np.random.rand
./monai/data/synthetic.py:142:np.random.random
./monai/data/synthetic.py:65:np.random.random
./monai/data/utils.py:125:np.random.randint
./monai/utils/misc.py:353:np.random.seed
./monai/visualize/utils.py:92:np.random.rand
./monai/visualize/utils.py:97:np.random.rand
New code should use the normal method of a Generator instance instead; please see the Quick Start.
So it seems like there is no need to migrate those immediately, but I just wanted to mention it here. Since there are ~390 usages in tests, I am not listing them here.
I found usages using the following bash command: find . -name "*.py" -exec grep -o -H -n -E 'np.random.\w+' {} \; | grep -v "RandomState" | sort | uniq
If this is something that should be changed then we should go ahead and do so, but per the documentation RandomState isn't maintained, though neither is it deprecated, so it should stick around for a good long while. The speedup would certainly be an advantage, but otherwise is there any other pressing need to make this change? From a casual look at the repo, np.random functions are used in a lot of places, and transforms have their own RandomState object. There is a lot to be changed, and inconsistent use of randomness in some places should be cleaned up too.
One other thing in the documentation is that there's no guarantee of portable behaviour between versions of Numpy. This means, for example, that the choices made by Rand* transforms in a sequence will differ between numpy versions, which will break tests expecting a certain order. Would we have to start testing per-version with Numpy as well? I don't know what the implications would be for other transforms or laziness either, or for other reproduction areas. Generative models and metrics may also be impacted. CC @atbenmurray @csudre @marksgraham
- There are some usages of np.random.* in the transforms (see below). Could it be that some of them should be converted to self.R.*, and for those that can't, how should we handle them? Most of them only create single values, so I am not expecting them to be performance-critical, but we should probably agree on a general strategy there.
Hi @john-zielke-snkeos, for this in particular I'd say these should use self.R.*, whatever R ends up being. Elsewhere, calls to np.random.* rely on an assumption of a global random state whose seed has possibly been set, so we could possibly move to using a Generator for these which is passed by argument.
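A minimal sketch of what passing a Generator by argument could look like. The helper name and signature are hypothetical (not existing MONAI API); the point is that the caller, not a global seed, controls reproducibility.

```python
import numpy as np


def random_patch_start(img_shape, patch_shape, rng=None):
    """Hypothetical helper: take an explicit Generator instead of
    calling the global np.random.randint."""
    rng = np.random.default_rng() if rng is None else rng
    # Generator.integers excludes the upper bound, like the legacy randint.
    return tuple(int(rng.integers(0, d - p + 1)) for d, p in zip(img_shape, patch_shape))


# Callers control reproducibility by passing their own seeded generator:
rng = np.random.default_rng(0)
start = random_patch_start((64, 64, 64), (32, 32, 32), rng=rng)
```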
@ericspod Thanks for your reply!
for this in particular I'd say these should use self.R.* whatever R ends up being
Fixing the usage of those would introduce a breaking change regarding randomness/reproducibility, but I guess this is something that is acceptable?
For the other usages, I think both keeping and replacing them are fine. If we end up replacing them, the question is whether we should then have a MONAI-only random generator somewhere. Since I think the speedup would be negligible in the cases where only a small amount of randomness is produced, I would actually be in favor of keeping those for now, and once we have established a new system in the performance-critical areas, worry about whether to replace the other usages.
In regards to my previous comment and the implementation plan, I'd still be interested to know whether you are also in favor of introducing a wrapper class that can either use numpy.Generator or RandomState to allow for backward compatibility as well as providing extensibility later on for pytorch generators (Implementation of those should be a separate PR then).
The downside of the wrapper class is of course that it adds some maintenance overhead, since we need to expose all functions of the underlying generators, and that another level of indirection adds a small runtime cost (although I would expect this to be negligible).
Hi @john-zielke-snkeos, thanks for showing an interest in this! Just a note on consistency. I don't think that we can expect back compat, but if RandomState didn't guarantee it between numpy releases, then I think that could go in the release notes and we'd be fine. @ericspod, @Nic-Ma, @wyli?
import numpy as np
seed = 12345678
r = np.random.RandomState(seed)
g0 = np.random.default_rng(seed)
g1 = np.random.Generator(np.random.PCG64(seed))
g2 = np.random.Generator(np.random.MT19937(seed))
print(r.uniform(0, 1000000))
print(g0.uniform(0, 1000000))
print(g1.uniform(0, 1000000))
print(g2.uniform(0, 1000000))
245804.23382490256
958338.0019665658
958338.0019665658
509569.7044655211
Second note: since we can't just remove RandomState without deprecating it for multiple versions, we might need to have a wrapper as discussed. Since Generator is the supported numpy rng class going forward, I would suggest that Randomizable wrap RandomState in a Generator adaptor rather than the other way around.
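To illustrate the adaptor direction, here is a minimal sketch exposing a Generator-like subset of methods on top of a legacy RandomState. The class name and method set are hypothetical, for illustration only; a real adaptor would need to cover the full surface used by MONAI.

```python
import numpy as np


class RandomStateAdaptor:
    """Hypothetical sketch: expose a Generator-style API on top of a
    legacy RandomState, for backward compatibility."""

    def __init__(self, state=None):
        self.rs = state if state is not None else np.random.RandomState()

    def integers(self, low, high=None, size=None):
        # Generator.integers excludes the upper endpoint by default,
        # matching the legacy randint semantics.
        return self.rs.randint(low, high, size)

    def random(self, size=None):
        return self.rs.random_sample(size)

    def standard_normal(self, size=None):
        return self.rs.standard_normal(size)
```

Code written against the Generator-style names (`integers`, `random`, ...) could then run unchanged on either a real `np.random.Generator` or this adaptor around a seeded `RandomState`.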
@ericspod @KumoLiu can we look at this PR ?
Hi @vikashg we had discussed this change in the past within core meetings and I think we need to return to it and consider what the approach is. I've added it to the backlog project for now but let's bring this up again at the next meeting and work towards some agreement on what the changes will be. It's going to be a large refactor so we had wanted to get 1.4 items done first and finalise the lazy transforms.
ok sounds good. Thanks @ericspod
Is your feature request related to a problem? Please describe. https://numpy.org/doc/stable/reference/random/legacy.html#numpy.random.RandomState
https://numpy.org/doc/stable/reference/random/generator.html
API differences https://numpy.org/doc/stable/reference/random/new-or-different.html#new-or-different