cavalleria / cavaface

Face recognition training project (PyTorch)
MIT License

GPU problems #23

Closed ReverseSystem001 closed 4 years ago

ReverseSystem001 commented 4 years ago

I have 4 GPUs. When I set GPU=[0,1,2,3] in config.py, it reports errors, but when I set GPU=[0,1], it works fine. That's strange.

cavalleria commented 4 years ago

Did you set the CUDA_VISIBLE_DEVICES environment variable to make the GPUs available?

ReverseSystem001 commented 4 years ago

You mean I should run train.sh like this: CUDA_VISIBLE_DEVICES=0,1,2,3 sh train.sh?

cavalleria commented 4 years ago

Yes, set CUDA_VISIBLE_DEVICES="0,1,2,3".

ReverseSystem001 commented 4 years ago

It doesn't seem to work. [image: 1]

cavalleria commented 4 years ago

Can you provide your training command?

ReverseSystem001 commented 4 years ago

If I pick any two of the GPUs, it works. When the number of GPUs is 3 or 4, it reports errors like the one above. All of the config settings are the defaults. I run: CUDA_VISIBLE_DEVICES=0,1,2,3 sh train.sh. Training command? Which script do you mean?

cavalleria commented 4 years ago

Please set CUDA_VISIBLE_DEVICES="0,1,2,3" inside train.sh, then run the command sh train.sh.
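For anyone hitting the same error, here is a minimal sketch of a sanity check that could be run before training. It relies on one general CUDA behaviour: the devices listed in CUDA_VISIBLE_DEVICES are renumbered from 0 inside the process, so every index in the GPU list must be smaller than the number of visible devices. The helper names `visible_gpu_count` and `invalid_gpu_ids` are hypothetical and not part of cavaface, and the parsing is simplified (it only counts comma-separated entries).

```python
import os


def visible_gpu_count(env=None):
    """Count the entries in CUDA_VISIBLE_DEVICES, or return None if it is unset."""
    env = os.environ if env is None else env
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return None  # unset: every GPU on the machine is visible
    # An empty string hides all GPUs, so it counts as zero devices.
    return len([entry for entry in value.split(",") if entry.strip()])


def invalid_gpu_ids(gpu_ids, env=None):
    """Return the configured GPU indices that fall outside the visible range."""
    count = visible_gpu_count(env)
    if count is None:
        return []  # nothing to compare against without the variable
    return [i for i in gpu_ids if i < 0 or i >= count]


if __name__ == "__main__":
    gpu_list = [0, 1, 2, 3]  # assumed to mirror the GPU setting in config.py
    bad = invalid_gpu_ids(gpu_list)
    if bad:
        print("GPU ids not covered by CUDA_VISIBLE_DEVICES:", bad)
    else:
        print("GPU list is consistent with CUDA_VISIBLE_DEVICES")
```

If the process only sees CUDA_VISIBLE_DEVICES="0,1", the check flags indices 2 and 3, which would be consistent with two GPUs working and three or four failing.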

ReverseSystem001 commented 4 years ago

Got it, thanks.