AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

YOLOv4's inference performance is NOT consistent/steady ... #7329

Open chenzx opened 3 years ago

chenzx commented 3 years ago

I'm using gRPC to wrap YOLOv4 in an AI model-serving Python script, deployed in an nvidia-docker container instance in the cloud with a Tesla GPU backend.

However, I find that the inference time is NOT consistent: detection normally takes ~670 ms (versus ~1.3 s for opencv-python on CPU and 13 s for Darknet on CPU, in the same Docker instance), but if I make RPC calls continuously from a test client, the inference time sometimes improves to ~100 ms.

versavel commented 3 years ago

After you instantiate the model, the first few inferences usually take a bit longer. Is that what you are observing?

(Sent as an email reply to the original post of Feb 4, 2021, 2:36 AM.)
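One way to check whether the slow calls are only a start-up effect is to run a few untimed warm-up inferences and then time the rest separately. The following is a minimal sketch, not code from the thread: `time_inference` and `fake_infer` are hypothetical names, and `infer` stands for whatever callable wraps the detector. It assumes `infer` blocks until the detections are available.

```python
import time
from typing import Callable, List, Sequence


def time_inference(infer: Callable[[object], object],
                   inputs: Sequence[object],
                   warmup: int = 5,
                   runs: int = 20) -> List[float]:
    """Return per-call latencies in milliseconds, excluding warm-up calls."""
    if not inputs:
        raise ValueError("inputs must not be empty")

    # Warm-up phase: call the model but discard results and timings.
    for i in range(warmup):
        infer(inputs[i % len(inputs)])

    # Measured phase: time each call individually.
    latencies_ms: List[float] = []
    for i in range(runs):
        start = time.perf_counter()
        infer(inputs[i % len(inputs)])
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


if __name__ == "__main__":
    # Hypothetical stand-in for the real detector call.
    def fake_infer(image):
        time.sleep(0.001)
        return []

    print(time_inference(fake_infer, ["frame"], warmup=2, runs=5))
```

If the measured latencies still vary widely after the warm-up calls, the variation is not explained by model initialization alone.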

chenzx commented 3 years ago

I observed performance fluctuations in inference time on later calls too, not only the first ones. I suspect they are due to dynamic memory allocation behaviour, but I have not had time to investigate further.
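To describe fluctuations like these more precisely than a single typical figure, the per-call latencies can be summarized as a distribution. This is a small sketch under assumptions: `summarize_latencies` is a hypothetical helper, and the sample values only echo the rough ~100 ms and ~670 ms figures mentioned in this thread, not real measurements.

```python
import math
import statistics
from typing import Dict, Sequence


def summarize_latencies(latencies_ms: Sequence[float]) -> Dict[str, float]:
    """Return min, median, 95th percentile (nearest rank) and max in ms."""
    if not latencies_ms:
        raise ValueError("latencies_ms must not be empty")
    ordered = sorted(latencies_ms)

    def percentile(p: float) -> float:
        # Nearest-rank method: smallest value with at least p% of samples at or below it.
        rank = max(1, math.ceil(p / 100.0 * len(ordered)))
        return ordered[rank - 1]

    return {
        "min": ordered[0],
        "median": statistics.median(ordered),
        "p95": percentile(95),
        "max": ordered[-1],
    }


if __name__ == "__main__":
    # Illustrative values only, loosely based on the figures quoted in the thread.
    sample = [100.0, 670.0, 100.0, 100.0]
    print(summarize_latencies(sample))
```

A large gap between the median and the 95th percentile would point to occasional slow calls, while a high median with a low minimum would suggest that the slow case is the norm.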