Open mberr opened 2 months ago
I was thinking I could check the element sizes with footprint = x_batch.nelement() * x_batch.element_size(), but this doesn't seem to match up to the error, so I guess I'm counting wrong
torch.cdist calculates the pairwise distances, so its result has shape (x.shape[0], y.shape[0]), i.e. x.shape[0] * y.shape[0] elements, and indeed we have
21_474 * 200_000 = 4_294_800_000
2**32 = 4_294_967_296
21_475 * 200_000 = 4_295_000_000
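To make the arithmetic above easy to re-check, here is a minimal sketch in plain Python (no torch needed). It assumes the crash threshold is the 2**32 output elements observed in this thread, which is an empirical observation and not a documented limit. The helper names are hypothetical. It also shows why the input footprint (x_batch.nelement() * x_batch.element_size()) does not line up with the error: the quantity that matters is the size of the output matrix.

```python
# Assumption: 2**32 output elements is the threshold observed in this thread,
# not a limit documented by PyTorch.
NUMEL_LIMIT = 2**32


def cdist_output_numel(n_x: int, n_y: int) -> int:
    """Number of elements in the (n_x, n_y) pairwise distance matrix."""
    return n_x * n_y


def cdist_output_bytes(n_x: int, n_y: int, element_size: int = 4) -> int:
    """Output size in bytes; element_size=4 assumes float32."""
    return cdist_output_numel(n_x, n_y) * element_size


# The two batch sizes from the comment above, against 200_000 rows in y.
for n_x in (21_474, 21_475):
    numel = cdist_output_numel(n_x, 200_000)
    print(n_x, numel, "over limit" if numel >= NUMEL_LIMIT else "under limit")
```

So the element count of the result, not of x_batch, is what crosses the threshold between 21_474 and 21_475 rows.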
Thanks for debugging!
I'll just update the example to something small enough, although it is a bit unfortunate that this raises a SEGFAULT rather than a catchable exception. I also need to add this somewhere in the documentation so that a user encountering the issue has a starting point.
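Until that is documented, one possible workaround is a size check before the call, so the user gets a regular Python exception instead of a crash. This is only a sketch under the same assumption as above (2**32 output elements as the observed threshold); max_safe_batch_size and check_cdist_size are hypothetical helpers, not part of torch or of this project.

```python
# Assumption: threshold taken from the numbers in this thread, not from the docs.
NUMEL_LIMIT = 2**32


def max_safe_batch_size(n_y: int, limit: int = NUMEL_LIMIT) -> int:
    """Largest n_x such that n_x * n_y stays strictly below the limit."""
    if n_y <= 0:
        raise ValueError("n_y must be positive")
    return (limit - 1) // n_y


def check_cdist_size(n_x: int, n_y: int, limit: int = NUMEL_LIMIT) -> None:
    """Raise a catchable error if the pairwise matrix would reach the limit."""
    numel = n_x * n_y
    if numel >= limit:
        raise ValueError(
            f"cdist output would have {numel} elements (limit {limit}); "
            f"use a batch size of at most {max_safe_batch_size(n_y, limit)}"
        )


# Usage with tensors would be check_cdist_size(x_batch.shape[0], y.shape[0])
# right before calling torch.cdist(x_batch, y).
print(max_safe_batch_size(200_000))
```

Treating the limit as strict (numel < 2**32) is the conservative choice, since the thread only shows one size just below and one just above it.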
Hm, looks like the segfault still occurs.
This discussion might be related: https://discuss.pytorch.org/t/segmentation-fault-with-pytorch-2-3/203381
The segfault is raised inside the cdist function and can be reproduced on my machine with:
Here's the segfault: