BitCalSaul closed this issue 6 months ago
@BitCalSaul
CUDA_VISIBLE_DEVICES=0 python -u /home/user/Compressor/main.py epochs=50 dividor_value=100000 dgroup_id=0 dgroups=2 model.n_channels=80 model.n_blocks=21 batch_size=6
This is the actual command that is executed.
Perhaps in your code, you manually specified GPUs with non-zero index numbers.
Yeah, it seems I did specify the GPU in my code. When I changed the index from [1] to [0], it ran properly. Thank you.
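For anyone hitting the same error: when a process is launched with CUDA_VISIBLE_DEVICES=0, only one GPU is visible to it, and visible GPUs are renumbered starting from 0. A hard-coded index of 1 therefore points at a device that does not exist inside that process, which is consistent with "invalid device ordinal". Below is a minimal sketch of a guard you could put in your own script. The helper names `visible_gpu_ids` and `local_index` are hypothetical and not part of RunIt or any CUDA library. It assumes the variable holds a plain comma-separated list, and falling back to index 0 is just one possible policy.

```python
import os


def visible_gpu_ids(env=None):
    """Return the entries of CUDA_VISIBLE_DEVICES, or None if it is unset."""
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None  # unset: no restriction is applied by this variable
    return [tok.strip() for tok in raw.split(",") if tok.strip()]


def local_index(requested, env=None):
    """Map a hard-coded device index to one that is valid in this process."""
    ids = visible_gpu_ids(env)
    if ids is None:
        return requested  # nothing to remap
    if not ids:
        raise RuntimeError("CUDA_VISIBLE_DEVICES is empty: no GPU is visible")
    # Visible devices are renumbered from 0, so any index >= len(ids)
    # does not exist here. Assumption: fall back to the first visible device.
    return requested if requested < len(ids) else 0


if __name__ == "__main__":
    # Hard-coded index 1, but the launcher exposes only one GPU.
    print(local_index(1, {"CUDA_VISIBLE_DEVICES": "0"}))
```

With this kind of guard, the script keeps working whether it is started directly or through a launcher that pins each worker to a single GPU. Simply using index 0 everywhere and leaving GPU selection to CUDA_VISIBLE_DEVICES is the simpler option.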
Thanks very much for your efforts. I used RunIt on my two servers and found it really useful. But I encountered an issue when I used the commit from two years ago in your repo. When I run RunIt, it fails with the error "RuntimeError: CUDA error: invalid device ordinal". No matter how I change my script in config.txt, I still get this error. But when I did the same thing on my other server, it ran properly. This is my command:
python /home/user/RunIt/run_it.py --interpreter python --verbose --gpu-pool 0 1 --max-workers 2 --cmd-pool /home/user/RunIt/ProjCompressor/config.txt
This is the output: