Closed JanineCHEN closed 4 years ago
BTW, I also tried python ./train.py -d 0 --identifier su3 config/su3.yaml
since I only have one GPU, and got the same error message. Could the number of GPUs have anything to do with the error?
ntrain: 0
normally happens when PyTorch cannot find the data (images) to read. Could you double-check that?
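A quick way to check this before launching training is to count the image files under the dataset directory. The snippet below is only a sketch: the helper name `count_images`, the list of suffixes, and the example path are assumptions for illustration, not part of this repository, so substitute whatever data directory your config points to.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your dataset.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def count_images(root):
    """Return the number of image files found recursively under `root`.

    Returns 0 if `root` does not exist or is not a directory, which is
    the situation that would typically lead to an empty training set.
    """
    root = Path(root)
    if not root.is_dir():
        return 0
    return sum(
        1
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )


if __name__ == "__main__":
    # Hypothetical path; replace with the data directory from your config.
    data_dir = "data/my_dataset"
    print(f"{data_dir}: {count_images(data_dir)} image(s) found")
```

If this prints 0, the path is most likely wrong or the dataset has not been unpacked there.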
Hey, thanks for your prompt response. It was indeed a data path mishap. Sorry for the trouble.
Can I ask one more question before closing the issue? Afterwards I got RuntimeError: CUDA out of memory.
What are the minimum GPU requirements for training and for evaluation, respectively? Thank you.
The default hyperparameters are set for a GTX 1080 Ti or an RTX 2080 Ti with around 11 GB of memory. If your GPUs have less memory, you can try reducing the batch size, but the results may then differ from the reported ones.
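If you are unsure how far to reduce the batch size, one option is to halve it until a single training step no longer runs out of memory. The following is a minimal sketch of that idea, not code from this repository: `largest_batch_size_that_fits` and the `run_step` callback are hypothetical names, and `run_step` is assumed to run one forward/backward pass at the given batch size.

```python
def largest_batch_size_that_fits(run_step, start, minimum=1):
    """Halve the batch size until `run_step(batch_size)` succeeds.

    `run_step` is a caller-supplied function (hypothetical here) that
    runs one training step and raises RuntimeError with an
    "out of memory" message when the GPU cannot hold the batch.
    """
    batch_size = start
    while batch_size >= minimum:
        try:
            run_step(batch_size)
            return batch_size
        except RuntimeError as err:
            # Only swallow out-of-memory errors; re-raise anything else.
            if "out of memory" not in str(err):
                raise
            # With PyTorch you would likely also call
            # torch.cuda.empty_cache() here before retrying.
            batch_size //= 2
    raise RuntimeError(f"no batch size >= {minimum} fits in memory")


if __name__ == "__main__":
    # Stand-in for a real training step: pretend anything above 6 is too big.
    def fake_step(batch_size):
        if batch_size > 6:
            raise RuntimeError("CUDA out of memory.")

    print(largest_batch_size_that_fits(fake_step, start=16))
```

Once you know a batch size that fits, set it in your config. Keep in mind that a smaller batch may also call for a different learning rate, so the numbers may not match the defaults exactly.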
Thanks for the information!
When executing
python ./train.py -d 0,1 --identifier su3 config/su3.yaml
I got the following issue:
Any help with this would be highly appreciated!