opendatacam / opendatacam-mobile

OpenDataCam mobile app for Android
https://play.google.com/store/apps/details?id=com.opendatacam
MIT License

Implementing Object detection on Android with NCNN #2

Open tdurand opened 3 years ago

tdurand commented 3 years ago

Starting to gather some research / progress notes in GitHub issues.. For future reference + to help me organize the craziness in my head.

After benchmarking plenty of example apps / frameworks etc etc ... I came to the conclusion, at this time of writing / state of my knowledge, that the fastest and most portable way to run YOLO on Android (and on iOS if the Android app is successful) is to use the NCNN framework from Tencent: https://github.com/Tencent/ncnn . Tencent being a company of the scale of Google in China.. The license is MIT.

There is a very nice example app showcasing several neural networks : https://github.com/cmdbug/YOLOv5_NCNN

Think of NCNN as a framework to run a neural network (like Darknet or TensorFlow or PyTorch).. but super optimized for CPU inference on mobile phones..

It is very very very optimized for Android and iPhone CPUs ... I get 17 FPS for YOLOv4-tiny on a Xiaomi Mi 8 (a 200€ phone).. with TensorFlow Lite, for example, I think I get 2 FPS, so this is crazy magic... but the interesting thing is that it also aims to support lots of platforms https://github.com/Tencent/ncnn#supported-platform-matrix , maybe one to watch for the future also to run on Raspberry Pis, Jetsons, the web... Right now it is not very performant on GPUs.. but they are working on it.

Another very impressive demo is that you can ship NCNN on the web via WebAssembly: https://github.com/nihui/ncnn-webassembly-nanodet (here it is running NanoDet, a new lightweight YOLO-like neural network that is almost as accurate but faster: https://github.com/RangiLyu/nanodet ).

The other super great news about being able to run YOLOv4-tiny on mobile is that you can train custom weights the same way and then convert them to NCNN-compatible weights.. + we already know that YOLOv4-tiny is accurate enough.

The code from https://github.com/cmdbug/YOLOv5_NCNN is licensed GPLv3, but I think the actual YOLOv4-tiny C++ code, which is the only part we need, is available as MIT code here: https://github.com/Tencent/ncnn/blob/master/examples/yolov4.cpp .. So this is to be investigated to determine the future license of OpenDataCam mobile.

Here is how to integrate it on Android:
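A rough sketch of what the Kotlin ↔ native bridge could look like, assuming the C++ detection code from the NCNN example is compiled with the NDK into a library called `yolov4ncnn` (the names and signatures here are mine, just to illustrate, not the actual app code):

```kotlin
import android.content.res.AssetManager

// Hypothetical Kotlin-side wrapper around the NCNN YOLOv4-tiny example code,
// compiled as a native library with the Android NDK.
object YoloV4Ncnn {
    init {
        // Load libyolov4ncnn.so (the NDK build of the C++ detection code)
        System.loadLibrary("yolov4ncnn")
    }

    // Load the converted .param/.bin model files from the APK assets
    external fun init(assetManager: AssetManager): Boolean

    // Run inference on an RGBA frame; returns flat tuples of
    // [x, y, width, height, label, confidence] per detection
    external fun detect(pixels: ByteArray, width: Int, height: Int): FloatArray
}
```

The analyzer side then just converts each camera frame to a pixel buffer, calls `detect`, and hands the boxes to whatever draws the overlay.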

This sounds super easy 😁, but I spent the last 2 weeks really understanding how to do this in practice.. by studying the code of https://github.com/cmdbug/YOLOv5_NCNN , Android app development and the camera APIs..

The good news is that I have it mostly figured out 🎉.. I ended up writing my own "glue" code using the latest version of the CameraX API, which simplifies things a bit.. and also supports things that are not supported in the example app.. like camera / device orientation (it was only working in portrait).. It is still a bit buggy though, the coordinates of the boxes are a bit off.. there is still some aspect ratio magic I need to figure out (see the sketch below).
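Roughly, that "aspect ratio magic" should come down to a scale + offset from the analysis frame to the preview view. A sketch of what I think the mapping is, assuming the preview behaves like CameraX's FILL_CENTER (center-crop) mode and that `rotationDegrees` from the frame has already been applied to the frame size; names are illustrative:

```kotlin
import android.graphics.RectF

// Maps a detection box from analysis-frame coordinates back onto the
// on-screen preview, assuming the preview scales the frame uniformly
// until it covers the view and crops the overflow (FILL_CENTER).
fun mapBoxToView(
    box: RectF,                          // box in frame coordinates
    frameWidth: Int, frameHeight: Int,   // analysis frame size (after rotation)
    viewWidth: Int, viewHeight: Int      // preview view size in pixels
): RectF {
    // Uniform scale factor: the larger ratio covers the whole view
    val scale = maxOf(viewWidth.toFloat() / frameWidth,
                      viewHeight.toFloat() / frameHeight)
    // Offsets center the scaled frame; half the overflow is cropped per side
    val dx = (viewWidth - frameWidth * scale) / 2f
    val dy = (viewHeight - frameHeight * scale) / 2f
    return RectF(
        box.left * scale + dx,
        box.top * scale + dy,
        box.right * scale + dx,
        box.bottom * scale + dy
    )
}
```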

I also put together a working webview bridge using Capacitor (https://capacitorjs.com/), and I'm able to render an HTML canvas on top of the camera preview which draws the boxes...
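For reference, the native → webview direction could look roughly like this, as a hypothetical Capacitor plugin (Capacitor 3 annotation style; older versions used `@NativePlugin`; the plugin name and payload shape are mine):

```kotlin
import com.getcapacitor.JSObject
import com.getcapacitor.Plugin
import com.getcapacitor.annotation.CapacitorPlugin

// Hypothetical plugin that pushes each frame's detections to the webview,
// where a JS listener draws them on a canvas over the camera preview.
@CapacitorPlugin(name = "ObjectDetection")
class ObjectDetectionPlugin : Plugin() {

    // Called from the camera analyzer after each inference pass
    fun publishDetections(boxes: FloatArray) {
        val data = JSObject()
        // Serialize the flat [x, y, w, h, label, confidence] tuples
        data.put("detections", boxes.joinToString(","))
        // Fires a "detections" event the web layer subscribes to
        notifyListeners("detections", data)
    }
}
```

On the web side, a listener subscribed to the "detections" event would clear and redraw the canvas each time it fires.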

The whole thing seems as performant as the native Android demo app.. so I guess the "core" proof of concept is mostly under control now... Demo to try soon!

tdurand commented 3 years ago

Leaving this issue as documentation, and also dropping this link here, which discusses some of the existing neural network inference frameworks out there for mobile devices; maybe good to add to the future documentation: https://qengineering.eu/deep-learning-software-for-raspberry-pi-and-alternatives.html

tdurand commented 3 years ago

In order to avoid licensing the mobile app as GPLv3 (by using https://github.com/cmdbug/YOLOv5_NCNN ), I need to rework the YOLO implementation by using:

which are BSD licensed.. compatible with MIT

tdurand commented 3 years ago

Some more notes on this: