pytorch / csprng

Cryptographically secure pseudorandom number generators for PyTorch
https://github.com/pytorch/csprng
BSD 3-Clause "New" or "Revised" License

Support randperm #71

Closed Darktex closed 4 years ago

Darktex commented 4 years ago

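To provide real differential privacy (DP) guarantees, we need to certify that batches are also shuffled with a CSPRNG. This means supporting passing a torchcsprng generator to the DataLoader. If you try that today, the DataLoader's sampler calls torch.randperm with that generator and fails with a RuntimeError, because randperm is not implemented for the generator's backend.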

To reproduce:

import torchcsprng as prng
import torchvision
import torchvision.transforms as tfms
from torch.utils.data import DataLoader

train_ds = torchvision.datasets.CIFAR10('.', train=True, download=True, transform=tfms.ToTensor())

# Baseline: a DataLoader that shuffles with the default generator.
train_dl = DataLoader(train_ds, batch_size=8, shuffle=True)

# Same DataLoader, but shuffling with a torchcsprng generator.
generator = prng.create_random_device_generator("/dev/urandom")
train_dl = DataLoader(train_ds, batch_size=8, shuffle=True, generator=generator)

# Fetching the first batch triggers randperm and raises the error below.
x, y = next(iter(train_dl))

Error message:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-18-22755f335b09> in <module>()
----> 1 x, y = next(iter(train_dl))
      2 x.shape

4 frames
/usr/local/lib/python3.6/dist-packages/torch/utils/data/sampler.py in __iter__(self)
    108             rand_tensor = torch.randint(high=n, size=(self.num_samples,), dtype=torch.int64, generator=self.generator)
    109             return iter(rand_tensor.tolist())
--> 110         return iter(torch.randperm(n, generator=self.generator).tolist())
    111 
    112     def __len__(self):

RuntimeError: Could not run 'aten::randperm.generator_out' with arguments from the 'UNKNOWN_TENSOR_TYPE_ID' backend. 'aten::randperm.generator_out' is only available for these backends: [CPU, CUDA, Autograd, Profiler, Tracer].

nairbv commented 4 years ago

I believe this is fixed by https://github.com/pytorch/csprng/pull/72
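
For installations that predate that fix, one possible stopgap is to shuffle the indices outside PyTorch and hand them to the DataLoader through its sampler argument. The sketch below is only an illustration: the class name SystemRandomSampler is hypothetical (it is not part of PyTorch or torchcsprng), it uses Python's secrets.SystemRandom, which draws from the operating system's CSPRNG, rather than a torchcsprng generator, and whether that is sufficient for a given DP certification is an assumption you would need to check yourself.

```python
import secrets


class SystemRandomSampler:
    """Yield dataset indices in an order shuffled by the OS CSPRNG.

    Hypothetical helper for illustration; not part of PyTorch or torchcsprng.
    """

    def __init__(self, num_samples):
        self.num_samples = num_samples
        # SystemRandom reads from os.urandom and cannot be seeded.
        self._rng = secrets.SystemRandom()

    def __iter__(self):
        # Build a fresh permutation on every pass (i.e. every epoch).
        indices = list(range(self.num_samples))
        self._rng.shuffle(indices)
        return iter(indices)

    def __len__(self):
        return self.num_samples
```

Assuming the train_ds dataset from the repro above, this would be used as DataLoader(train_ds, batch_size=8, sampler=SystemRandomSampler(len(train_ds))), without shuffle=True, since the DataLoader does not accept both a sampler and shuffle.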