AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)
http://pjreddie.com/darknet/

Training with multiple GPUs is not faster than 1 GPU??? #8823

Open aidevmin opened 1 year ago

aidevmin commented 1 year ago

I followed the guide to train on my dataset with multiple GPUs, but the training speed is the same as with a single GPU. I used the same config in both cases:

batch=64
subdivisions=32     # 16 OOM
width=512
height=512

I checked GPU usage, and almost all of the GPUs are being used. Does Darknet support multiple GPUs?
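For reference, the config above can be read as follows: Darknet splits each `batch` into `subdivisions` chunks, so only `batch / subdivisions` images are on the GPU at a time. A minimal sketch of that arithmetic (the helper name `images_per_forward_pass` is hypothetical, not part of Darknet):

```python
def images_per_forward_pass(batch: int, subdivisions: int) -> int:
    """Number of images processed at once when `batch` is split into `subdivisions` chunks."""
    if batch % subdivisions != 0:
        raise ValueError("batch should be divisible by subdivisions")
    return batch // subdivisions


# Values from the config posted above: batch=64, subdivisions=32.
mini_batch = images_per_forward_pass(batch=64, subdivisions=32)
print(mini_batch)  # 2
```

With these settings each forward pass handles only 2 images, and one iteration (one weight update) still covers the full 64.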

stephanecharette commented 1 year ago

> Does Darknet support multiple GPUs?

Yes. See the readme which describes what you need to do.

Note this repo is no longer supported. You should be using https://github.com/hank-ai/darknet instead.

aidevmin commented 1 year ago

@stephanecharette I followed this guide https://github.com/AlexeyAB/darknet#how-to-train-with-multi-gpu for training with multiple GPUs. Do I need to set batch = #GPUs x batch_for_1GPU?

With the same config, the multi-GPU run is no faster than the single-GPU run.

aidevmin commented 1 year ago

@AlexeyAB Could you help me? I use the same batch, max_batches, and subdivisions for 1 GPU and for multiple GPUs, but the training time is the same.

I read this issue https://github.com/AlexeyAB/darknet/issues/1165, which @AlexeyAB also commented on. My understanding is that with multiple GPUs we need to reduce max_batches to get a speedup (because more images are processed per iteration with more GPUs), and adjust the learning rate and burn_in if needed, as described at https://github.com/AlexeyAB/darknet/tree/64efa721ede91cd8ccc18257f98eeba43b73a6af#how-to-train-with-multi-gpu. Is that right?
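To make that understanding concrete, here is a rough sketch of the arithmetic. It assumes each iteration processes `batch` images per GPU, so N GPUs see N times as many images per iteration. It also assumes the adjustment I recall from the linked README section for small datasets: divide learning_rate by the number of GPUs and multiply burn_in by the same factor. The function name and the example values for max_batches, learning_rate, and burn_in are hypothetical, not taken from my config.

```python
import math


def scale_for_multi_gpu(batch, max_batches, learning_rate, burn_in, ngpus):
    """Scale single-GPU settings so a multi-GPU run sees the same total number of images."""
    # Assumption: every GPU processes a full `batch` per iteration.
    images_per_iteration = batch * ngpus
    total_images = batch * max_batches
    return {
        "images_per_iteration": images_per_iteration,
        # Fewer iterations are needed to cover the same number of images.
        "max_batches": math.ceil(total_images / images_per_iteration),
        # Assumption: README-style adjustment, mainly relevant for small datasets.
        "learning_rate": learning_rate / ngpus,
        "burn_in": burn_in * ngpus,
    }


# Hypothetical single-GPU settings, scaled to 4 GPUs.
settings = scale_for_multi_gpu(
    batch=64, max_batches=8000, learning_rate=0.001, burn_in=1000, ngpus=4
)
print(settings["images_per_iteration"], settings["max_batches"], settings["burn_in"])
```

If this is right, keeping max_batches unchanged on multiple GPUs would take about the same wall-clock time per iteration while processing more images in total, which would match what I am seeing.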

stephanecharette commented 1 year ago

Like I wrote, AlexeyAB no longer maintains this repo. You should be using the Hank.ai repo.