Low Performance for Yolo V2 while running detection on ML Suite

Xilinx / ml-suite

Getting Started with Xilinx ML Suite

https://aws.amazon.com/marketplace/pp/B077FM2JNS

Low Performance for Yolo V2 while running detection on ML Suite #62

Closed: MarouanGit closed this issue 5 years ago

MarouanGit commented 5 years ago

Hi,

I'm running ML-Suite with a retrained YoloV2 on an Amazon F1 instance with the Xilinx Alveo AMI. I'm getting an average inference time of 80 ms/image with quite low accuracy (compared to 30 ms/image on a GPU).

My goal is to understand why there is such a large gap in inference time, and also:

Is this a normal, expected inference time on the Alveo200?
If not, what kind of improvements could be made? (Loading images separately into FPGA memory, for example?)

Attached is the output of the detection run on an AWS F1 instance: shell_results_xilinx.txt. Any help would be appreciated!

Many thanks, Cheers.

wilderfield commented 5 years ago

Your latency observations are correct for yolov2 on XDNNv2.

Yolo typically takes a 608x608 input, and hence incurs large activations and heavy data movement penalties. The v2 architecture stalls the systolic array while partial activations are moved off chip.
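To give a rough sense of scale for the activation sizes mentioned above, here is a minimal back-of-the-envelope sketch. It assumes the first convolution produces 32 output channels at full input resolution (as in the standard Darknet-19 backbone) and that activations are stored at 1 byte per element; both are assumptions for illustration, not figures from this thread. The 224x224 case is only a comparison point for a typical classification input.

```python
# Rough activation-size estimate for a single feature map.
# Assumptions (not from the thread): 32 output channels in the first
# convolution, 1 byte per activation element (8-bit quantization).

def activation_bytes(height, width, channels, bytes_per_element=1):
    """Size in bytes of one feature map of shape height x width x channels."""
    return height * width * channels * bytes_per_element

first_conv_608 = activation_bytes(608, 608, 32)   # detection-sized input
first_conv_224 = activation_bytes(224, 224, 32)   # classification-sized input

# Ratio depends only on the spatial size, since channels and precision match.
ratio = first_conv_608 / first_conv_224

print(f"608x608x32: {first_conv_608 / 2**20:.2f} MiB")
print(f"224x224x32: {first_conv_224 / 2**20:.2f} MiB")
print(f"ratio: {ratio:.2f}x")
```

Under these assumptions the 608x608 feature map is roughly seven times larger than the 224x224 one, which is consistent with the point that more data has to move between the chip and external memory.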

XDNNv3 will alleviate this, and we should improve to around 25-30 ms.
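For reference, the latencies quoted in this thread can be converted to approximate throughput. This is only a sketch: it assumes one image is processed at a time with no batching or pipelining, which may not match how the hardware is actually scheduled.

```python
# Convert per-image latency (ms) to approximate images per second.
# Assumption: strictly sequential processing, one image at a time.

def images_per_second(latency_ms):
    """Throughput implied by a per-image latency, with no overlap."""
    return 1000.0 / latency_ms

current_fps = images_per_second(80)        # latency reported on XDNNv2
projected_fps_low = images_per_second(30)  # upper end of the projected range
projected_fps_high = images_per_second(25) # lower end of the projected range

print(f"80 ms/image -> {current_fps:.1f} images/s")
print(f"30 ms/image -> {projected_fps_low:.1f} images/s")
print(f"25 ms/image -> {projected_fps_high:.1f} images/s")
```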

Stay tuned.

wilderfield commented 5 years ago

As for accuracy: the yolov2 model we have, when run on FPGA, achieved 15.4% mAP as opposed to the published 21.4%. This is on MS COCO. We did not train our model for very long, and we removed leaky ReLU.

MarouanGit commented 4 years ago

Hi,

I'm reopening this thread to ask whether XDNNv3 has been released and whether you have had the opportunity to test YoloV3 on it. How is inference performance on this new architecture?