AlexeyAB opened this issue 5 years ago
This'd probably be great for my use-case, actually. I'm looking for the most accurate object detector that can run at around 150+ FPS on a 2080 Ti. Currently looking at using pan2_swish_scale or something along those lines, but maybe ThunderNet is the answer :)
@LukeAI It seems that ThunderNet is fast on CPU, but not on GPU. Just like EfficientNet, as you know.
@LukeAI @AlexeyAB hi guys, just noticed this post. I'm not familiar with ThunderNet, but I believe we've already beaten their results with YOLOv3 on Apple's A12 ASIC.
The iDetection iOS app runs YOLOv3-SPP 320 (52.4 mAP @0.5) inference on vertical 4K video at 24+ FPS on iPhone XS and 30+ FPS on iPad Pro, and anyone can download it for free and test it out themselves in seconds (repeatable results).
BTW, to add some more details: this is the full yolov3-spp.weights model (not the tiny version), trained with darknet and exported via PyTorch > ONNX > CoreML. The screenshots you see on the link are a bit outdated (we need to shoot some new ones) and show the FPS before we optimized the export pipeline for speed. A current screenshot looks like this. The FPS constantly fluctuates of course, so this screenshot happened to capture 18 FPS, but we typically see a steady-state 24 FPS on iPhone XS for at least the first 5-10 minutes, before overheating starts to become an issue and the iPhone begins throttling the Neural Engine. With no external cooling, after one hour of continuous running the FPS drops to about 5-10 FPS.
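For anyone curious about the export path, a minimal sketch of the PyTorch → ONNX step might look like the snippet below. The model class and weight-loading helper names are hypothetical stand-ins (the thread doesn't show the actual code), and the darknet → PyTorch load and the ONNX → CoreML conversion are only indicated in comments:

```python
import torch

# Assumption: a PyTorch re-implementation of YOLOv3-SPP whose layers mirror
# yolov3-spp.cfg and which can ingest the converted darknet weights.
from models import Darknet, load_darknet_weights  # hypothetical helpers

model = Darknet("cfg/yolov3-spp.cfg")
load_darknet_weights(model, "yolov3-spp.weights")  # darknet -> PyTorch
model.eval()

# Export to ONNX at the 320x320 input size mentioned above.
dummy = torch.zeros(1, 3, 320, 320)
torch.onnx.export(model, dummy, "yolov3-spp.onnx",
                  input_names=["image"], output_names=["detections"],
                  opset_version=11)

# The resulting ONNX graph can then be converted to CoreML (e.g. with the
# onnx-coreml / coremltools converters) and dropped into the iOS app.
```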
BUT there's a new A13 ASIC coming out next month in the iPhone 11, so I'm hoping this will finally push inference to 30 FPS and possibly reduce the thermal issue a bit too!! :)
Is there any official implementation of ThunderNet available where we can reproduce the results mentioned in the paper? Thanks
Without bells and whistles, our model runs at 24.1 fps on an ARM-based device.
CEM (Context Enhancement Module) and SAM (Spatial Attention Module) of ThunderNet.
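To make the two modules concrete, here is a rough PyTorch-style sketch of CEM and SAM based on the paper's description (not official code; the channel counts and shapes are assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CEM(nn.Module):
    """Context Enhancement Module: merge C4, C5 and a global-context vector."""
    def __init__(self, c4_ch=120, c5_ch=512, out_ch=245):  # channel counts are assumptions
        super().__init__()
        self.conv_c4 = nn.Conv2d(c4_ch, out_ch, 1)
        self.conv_c5 = nn.Conv2d(c5_ch, out_ch, 1)
        self.conv_glb = nn.Conv2d(c5_ch, out_ch, 1)

    def forward(self, c4, c5):
        # C4 keeps its resolution; C5 is upsampled to match it; the global
        # branch is a pooled 1x1 feature broadcast over the spatial grid.
        f4 = self.conv_c4(c4)
        f5 = F.interpolate(self.conv_c5(c5), size=c4.shape[2:], mode="nearest")
        fg = self.conv_glb(F.adaptive_avg_pool2d(c5, 1))
        return f4 + f5 + fg  # broadcast add of the global feature

class SAM(nn.Module):
    """Spatial Attention Module: reweight CEM features with RPN features."""
    def __init__(self, rpn_ch=256, feat_ch=245):
        super().__init__()
        self.conv = nn.Conv2d(rpn_ch, feat_ch, 1)
        self.bn = nn.BatchNorm2d(feat_ch)

    def forward(self, cem_feat, rpn_feat):
        # Attention map from the RPN feature gates the CEM feature map.
        attn = torch.sigmoid(self.bn(self.conv(rpn_feat)))
        return cem_feat * attn
```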
CEM:
SAM:
Results: