Closed ehsanfathi77 closed 6 years ago
You should set the gpus option in your .data file, e.g. gpus = 0,1,2,3
I have that in my .data file, but it still uses only one GPU. Were you able to utilize 4 GPUs with this implementation?
For example, cfg/coco.data includes the following:
train = coco_train.txt
valid = coco_test.txt
names = data/coco.names
backup = backup
gpus = 0,1,2,3
What's your training machine, Linux or Windows? On Linux, the option above is applied successfully and two GPUs are working for me.
It is a Linux machine with 4 GPUs.
I have the same settings, gpus = 0,1,2,3, but it is only using the third GPU, at ~65% utilization.
Moreover, it is using only one CPU for batch preparation. Does your implementation use multiple CPUs too?
@ehsanfathi77 I am sorry, but I have not tested on 4 GPUs or with multiple CPUs; I only ran the model on 2 GPUs. Even when multiple GPUs are used, the load is not balanced, in my opinion, and I don't think that is a problem in my code. When I get access to 3 or more GPUs, I will check this case. Until then, please bear with me. Sorry.
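For what it's worth, the imbalance described above is typical of PyTorch's nn.DataParallel (assuming this repo uses it, which I have not verified): it scatters each batch across the listed devices but gathers all outputs back on the first device, so that GPU always carries extra work. A minimal sketch, with a stand-in model rather than the actual YOLO network:

```python
import torch
import torch.nn as nn

# Stand-in for the YOLO model (hypothetical; not the repo's actual class).
model = nn.Linear(8, 4)

# nn.DataParallel scatters the input batch across device_ids and gathers
# outputs on device_ids[0], which is why the first listed GPU is loaded
# more heavily than the others.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model, device_ids=[0, 1, 2, 3]).cuda()

# On a CPU-only machine this falls through to a plain forward pass.
out = model(torch.randn(2, 8))
```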
We needed to change the batch size in yolo_v3.cfg. It was in testing mode, so uncommenting the training lines and commenting out the testing lines did the trick. It now works pretty well on 4 GPUs and uses several CPUs, but as you mentioned, the load is not balanced across the GPUs.
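For reference, the switch described above happens in the [net] section at the top of the cfg file; the stock Darknet yolov3.cfg ships in testing mode, and the training values shown here are the upstream defaults (your copy may use different batch/subdivisions numbers):

```
[net]
# Testing -- comment these out for training
# batch=1
# subdivisions=1
# Training -- uncomment these
batch=64
subdivisions=16
```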
@andy-yun strangely, even though I specified the GPUs to use in the .data file, training utilized all available GPU cards. Has anyone else experienced this?
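If the implementation ignores the gpus option, one standard workaround (plain CUDA behavior, not specific to this repo) is to restrict which devices the process can see at all. This must happen before the first CUDA context is created, i.e. before any torch.cuda call:

```python
import os

# Only GPUs 0-3 will be visible to this process; the framework sees
# them re-indexed as devices 0..3 regardless of their physical IDs.
# Has no effect if set after the CUDA context already exists.
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1,2,3"
```

The same thing can be done from the shell by exporting the variable before launching the training script.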
I successfully started training, but it is only using one GPU. Is there anything besides the .data file that I need to change?