Open Vadkoz opened 1 year ago
I get the same error output, with two differences: 1) Training on one GPU does not work either. 2) In my case, the script exits immediately without training, instead of getting "stuck".
Solution: "pip install ." installs PyTorch 1.13. Reinstall PyTorch with these versions instead: torch==1.11.0+cu113, torchvision==0.12.0+cu113
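To check whether an environment actually has the versions pinned above, something like the following may help. It is only a sketch: the helper names `installed_version` and `find_mismatches` are made up for illustration and are not part of the repo, and it reads package metadata through the standard library rather than importing torch.

```python
from importlib import metadata

# Pins taken from the comment above; adjust if your CUDA build differs.
EXPECTED = {"torch": "1.11.0+cu113", "torchvision": "0.12.0+cu113"}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Return {package: (installed, wanted)} for every package that differs from its pin."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (found, wanted)
    return mismatches


if __name__ == "__main__":
    # Prints one line per package whose installed version differs from the pin.
    for package, (found, wanted) in find_mismatches(EXPECTED).items():
        print(f"{package}: installed {found}, expected {wanted}")
```

If a mismatch is reported, the pinned wheels can usually be installed with `pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 --extra-index-url https://download.pytorch.org/whl/cu113`, assuming the cu113 builds suit your driver.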
I'm trying to train on 2 A5000 GPUs, but training doesn't start. It just gets stuck here:
Memory consumption stops changing too.
One GPU works well. I'm using the unchanged repo and ~500 synthdog images. 2x A5000, CUDA 11.7