Open randerzander opened 1 month ago
I'm trying to run the PII example here.

On CPU, I get memory warnings and eventual worker deaths without producing output. There's a longer trace, but it's just more workers restarting before the cluster shuts down.

In GPU mode, it runs for some time before failing with a PyTorch error.

---

For the GPU case, the error seems to indicate that torch could not change the CUDA allocator. One reason this can happen is when only the CPU flavor of torch is installed, without GPU support. Could you check whether the following command works in the environment:

```python
import torch
torch.cuda.is_available()
```
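As a slightly fuller diagnostic (a sketch, not part of the original thread), you can also print which build of torch is installed: `torch.version.cuda` is `None` for CPU-only wheels, which would explain the allocator error even on a machine with a GPU.

```python
import importlib.util

# Guard the import so the check is safe in environments without torch.
if importlib.util.find_spec("torch") is None:
    print("torch is not installed in this environment")
else:
    import torch

    # torch.version.cuda is None for CPU-only builds of PyTorch.
    print("torch version:", torch.__version__)
    print("built with CUDA:", torch.version.cuda)
    print("CUDA available:", torch.cuda.is_available())
```

If this prints `built with CUDA: None`, reinstalling a CUDA-enabled torch wheel would be the first thing to try.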