lhlong closed this issue 1 year ago
Having the same problem
I have the same problem and also ran into it in the large-scale-curiosity example. It appears to be an MPI problem. My guess is that the GPU driver path is listed only for Linux, so it won't work on Windows.
Edit: I stand corrected. For RND, setting the GPU count to 1 works!
In mpi_util.py, change line 60 to:
available_gpus = 1
and line 70 to:
os.environ['CUDA_VISIBLE_DEVICES'] = str(1)
Seems to work for me, it started to train!
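For anyone who would rather not hard-code the value, the same idea can be written as a small fallback: keep CUDA_VISIBLE_DEVICES if it is already set, otherwise pick a default. This is only a minimal sketch and not the actual contents of mpi_util.py. The helper name resolve_visible_device and its default of "0" are assumptions made for illustration (this thread uses both 0 and 1).

```python
import os


def resolve_visible_device(env, default="0"):
    """Return the value to use for CUDA_VISIBLE_DEVICES.

    Hypothetical helper, not part of mpi_util.py: it keeps whatever the
    caller already exported and only falls back to `default` when the
    variable is not set at all.
    """
    return env.get("CUDA_VISIBLE_DEVICES", default)


if __name__ == "__main__":
    # Set the variable before any CUDA-using library is imported.
    os.environ["CUDA_VISIBLE_DEVICES"] = resolve_visible_device(os.environ)
    print("CUDA_VISIBLE_DEVICES =", os.environ["CUDA_VISIBLE_DEVICES"])
```

Because an already exported value wins, this should behave the same whether the variable is set in the shell or left unset.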
It's not necessary to change the code. Just set the environment variable CUDA_VISIBLE_DEVICES in the shell.
@cuspymd Does this require that you have an NVIDIA GPU?
@Ploppz
It's an environment variable you can define yourself; you can set it even if you don't have an NVIDIA GPU.
So you can change line 59 as above and then run export CUDA_VISIBLE_DEVICES=0 from the command line, and you should be good to go.
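Note that export is a POSIX shell builtin, so it won't work in the Windows command prompt (cmd.exe uses set instead). If you want something that behaves the same on both, one option is a tiny Python launcher that sets the variable for the child process. This is a rough sketch under stated assumptions: build_env and launch are hypothetical names, and the command passed to launch is just a placeholder for whatever training script you actually run.

```python
import os
import subprocess
import sys


def build_env(device="0", base_env=None):
    """Copy the environment and add CUDA_VISIBLE_DEVICES if it is missing.

    Hypothetical helper: an already exported value is left untouched.
    """
    env = dict(os.environ if base_env is None else base_env)
    env.setdefault("CUDA_VISIBLE_DEVICES", device)
    return env


def launch(cmd, device="0"):
    """Run `cmd` with the adjusted environment and return its exit code."""
    return subprocess.run(cmd, env=build_env(device), check=False).returncode


if __name__ == "__main__":
    # Placeholder command: the child just prints the value it received.
    child = "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"
    sys.exit(launch([sys.executable, "-c", child]))
```

In practice you would replace the placeholder command with your own, e.g. the same arguments you would otherwise type after export.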
@lucaslingle @cuspymd I have a similar problem:
Traceback (most recent call last):
File "ParaRetrieval.py", line 18, in
Here I used 'SLURM_ARRAY_TASK_ID' to run the job as an array task. As far as I can see, the server uses SLURM with NHC. Could you tell me how I can fix this problem? Thank you very much!
Need to open mpi_util.py and change line 59 to: