Add pin_memory=True when using a CUDA device to increase performance, as suggested
by the PyTorch documentation.
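A minimal sketch of the idea, assuming a standard DataLoader setup (the dataset here is illustrative): pinning host memory speeds up host-to-device transfers, so it is enabled only when CUDA is available.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset for illustration only.
dataset = TensorDataset(torch.randn(8, 3), torch.zeros(8))

# pin_memory only helps when batches are later moved to a CUDA device,
# so enable it conditionally.
loader = DataLoader(dataset, batch_size=4, pin_memory=torch.cuda.is_available())

batch, _ = next(iter(loader))
print(batch.shape)
```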
Add torch.no_grad() context manager in __call__() to increase performance.
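A sketch of the pattern, with a hypothetical wrapper class standing in for the real one: wrapping inference in torch.no_grad() skips autograd graph construction, saving memory and time.

```python
import torch

class Tagger:
    """Hypothetical inference wrapper; the class name is illustrative."""

    def __init__(self):
        self.linear = torch.nn.Linear(4, 2)

    def __call__(self, x):
        # No gradients are needed at inference time; disabling tracking
        # avoids building the autograd graph.
        with torch.no_grad():
            return self.linear(x)

out = Tagger()(torch.randn(1, 4))
print(out.requires_grad)  # False: no graph was recorded
```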
Reduce memory swap between CPU and GPU by instantiating Tensor directly on the GPU device.
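The difference can be sketched as follows: passing device= at creation allocates the tensor directly on the target device, whereas creating it on the CPU and calling .to() incurs an extra allocation and copy.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Allocated directly on the target device: no intermediate CPU buffer.
direct = torch.zeros(3, 3, device=device)

# Same result, but allocates on the CPU first, then copies to the device.
copied = torch.zeros(3, 3).to(device)

print(direct.device == copied.device)
```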
Improve the clarity of some warnings (i.e., their category and message).
Bug-fix macOS multiprocessing. It was unusable in a multiprocess setting since we were not checking whether the torch
multiprocessing start method was set properly. Now, we set it properly and raise a warning instead of an error.
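A sketch of the fix (the exact message is illustrative): attempt to set the start method, and if one is already set, fall back to a warning rather than letting the error propagate.

```python
import warnings
import torch.multiprocessing as mp

try:
    # "spawn" is the safe default on macOS; force=False means we do not
    # override a method the user has already chosen.
    mp.set_start_method("spawn", force=False)
except RuntimeError:
    # A start method was already set: warn instead of raising.
    warnings.warn(
        "Multiprocessing start method was already set; keeping the existing method.",
        category=RuntimeWarning,
    )

print(mp.get_start_method())
```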