How to install Raspberry Pi 3?

kyuuuunmi commented 7 years ago

Hi, I'm a student studying neural networks with your great source code.

Thanks for your project. We got image recognition working on Windows and Linux (in spite of many dependency problems with the GPU).

Now we plan to build an RC car capable of real-time detection.

Is it possible to install darknet on a Raspberry Pi 3? We are seriously worried about the GPU spec: your darknet uses CUDA, but the Raspberry Pi doesn't have an NVIDIA GPU (it uses a VideoCore IV 3D graphics core). Or should we consider another, indirect approach?

I really*3 look forward to your reply :-D

AlexeyAB commented 7 years ago

@kyuuuunmi

Hi, I think you can't use Yolo-darknet for real-time detection on RaspberryPi 3.

Build for Raspberry Pi 3: https://github.com/thomaspark-pkj/darknet-nnpack

The fork at this URL runs about 10 times faster on the CPU than the original Yolo v2 does on the CPU. (On a GPU, though, the original Yolo v2 runs even 100 times faster.)

For detection, Yolo uses:

- all cores of an nVidia GPU - real-time
- all cores of the CPU - not real-time

The RaspberryPi 3 has no nVidia GPU.

But you can look at XNOR-Net (written in Lua and based on Torch), from the same authors as Darknet Yolo. XNOR-Net is 58x faster, much faster than the alternatives, so it can be used on any low-performance device, even a mobile CPU, at the cost of some loss of precision. However, there is no ready-made code for integrating it into your own production projects.

For object detection at ~10 FPS with Yolo-288x288 you should use at least an nVidia Jetson TX2 (~500 GFlops, 10 Watt): https://devblogs.nvidia.com/parallelforall/jetson-tx2-delivers-twice-intelligence-edge/

Also, I think the RaspberryPi 3 can't be used in a production autonomous car, because that requires this level of performance: http://www.nvidia.com/object/drive-px.html

4 Tflops (or 12 integer tera-operations per second) - FOR AUTOCRUISE
8 Tflops (or 24 integer tera-operations per second) - FOR AUTOCHAUFFEUR
16 Tflops or more (or 48 integer tera-operations per second) - FOR FULLY AUTONOMOUS DRIVING

But maybe the RaspberryPi 3 can be used for a case study of an autonomous prototype using XNOR-Net.

AlexeyAB commented 7 years ago

For real-time detection, Yolo v2 needs one of these 2 options:

A modern (mid-range / high-end) (server / desktop / mobile) nVidia GPU. For example, you can achieve 50 FPS:
- on Titan X GM200 (6 Tflops) with Yolo-544x544
- on GeForce GTX 970 (3 Tflops) with Yolo-288x288

nVidia Drive PX 2 for fully autonomous driving, which is used in all new Tesla cars (Model S, X) and will be used this year in the new Model 3: https://blogs.nvidia.com/blog/2016/10/20/tesla-motors-self-driving/

Tesla Motors has announced that all Tesla vehicles — Model S, Model X, and the upcoming Model 3 — will now be equipped with an on-board “supercomputer” that can provide full self-driving capability. The computer delivers more than 40 times the processing power of the previous system. It runs a Tesla-developed neural net for vision, sonar, and radar processing. This in-vehicle supercomputer is powered by the NVIDIA DRIVE PX 2 AI computing platform.

Different tasks use different configurations of Drive PX2: http://www.nvidia.com/object/drive-px.html

half of nVidia Drive PX2 - FOR AUTOCRUISE
full nVidia Drive PX2 - FOR AUTOCHAUFFEUR
multiple full nVidia Drive PX2 - FOR FULLY AUTONOMOUS DRIVING

Each full nVidia Drive PX2 (8 TFlops, 24 int-TOPS, 250 Watt) contains: https://en.wikipedia.org/wiki/Drive_PX-series#Drive_PX_2

2 x ARM-CPU (4x Denver & 8x Cortex A57)
2 x GPU (512 CUDA Pascal cores) integrated to ARM-CPUs
2x GP-106 3.5 Tflops 80-100 Watt - GeForce GTX 1060 (Notebook) on board via MXM PCIe: http://wccftech.com/nvidia-pascal-gpu-drive-px-2/

I.e. for FULLY AUTONOMOUS DRIVING you should use several devices with a combined performance of at least 16 TFlops-SP (48 int-TOPS), which together consume 500 Watt.
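The arithmetic behind these tiers can be checked with a short sketch. It only uses the figures quoted in this thread (8 TFlops and 250 Watt per full Drive PX2; 4, 8 and 16 TFlops per tier). The names `units_needed`, `TIERS` and the assumption that power scales linearly with the number of units (including a half unit) are mine, for illustration only.

```python
import math

# Figures quoted in this thread for one full Drive PX2.
PX2_TFLOPS = 8.0
PX2_WATTS = 250.0

# Required single-precision TFlops per driving tier, as quoted above.
TIERS = {
    "AutoCruise": 4.0,
    "AutoChauffeur": 8.0,
    "Fully autonomous": 16.0,
}

def units_needed(required_tflops, step=0.5):
    """Smallest multiple of `step` Drive PX2 units that meets the requirement.

    step=0.5 is an assumption: the thread mentions a "half" PX2 configuration.
    """
    return math.ceil(required_tflops / PX2_TFLOPS / step) * step

for name, tflops in TIERS.items():
    n = units_needed(tflops)
    # Assumption: power draw scales linearly with the number of units.
    print(f"{name}: {n:g} x Drive PX2, about {n * PX2_WATTS:g} W")
```

Under these assumptions the fully autonomous tier comes out at two full units and 500 Watt, which matches the figure above.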


Now used in Tesla Model S: 1 x MXM GPU (GP106) 3.5 TFlops, i.e. half of an nVidia Drive PX2, FOR AUTOCRUISE. This is a GeForce GTX 1060 (Notebook) or Quadro P3000M/P2000M.

https://teslamotorsclub.com/tmc/threads/inside-the-nvidia-px2-board-on-my-hw2-ap2-0-model-s-with-pics.91076/#post-2113392

AlexeyAB commented 7 years ago

Build for Raspberry Pi 3: https://github.com/thomaspark-pkj/darknet-nnpack

The fork at this URL runs about 10 times faster on the CPU than the original Yolo v2 does on the CPU. (On a GPU, though, the original Yolo v2 runs even 100 times faster.)

NNPACK was used to optimize Darknet without using a GPU. It is useful for embedded devices using ARM CPUs.

Result

Model	Build Options	Prediction Time (seconds)
YOLO	NNPACK=1,ARM_NEON=1	7.7
YOLO	NNPACK=0,ARM_NEON=0	156
Tiny-YOLO	NNPACK=1,ARM_NEON=1	1.8
Tiny-YOLO	NNPACK=0,ARM_NEON=0	38
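
To put the table in terms of speedup and frame rate, here is a minimal sketch that only reuses the four prediction times listed above. The dictionary layout and the `summarize` helper are hypothetical names chosen for this example, and frame rate is taken simply as the inverse of the prediction time.

```python
# Prediction times in seconds, copied from the table above.
times = {
    "YOLO": {"nnpack": 7.7, "baseline": 156.0},
    "Tiny-YOLO": {"nnpack": 1.8, "baseline": 38.0},
}

def summarize(times):
    """Return {model: (speedup_from_nnpack, frames_per_second_with_nnpack)}."""
    out = {}
    for model, t in times.items():
        speedup = t["baseline"] / t["nnpack"]  # how many times faster
        fps = 1.0 / t["nnpack"]                # one prediction per frame
        out[model] = (speedup, fps)
    return out

summary = summarize(times)
for model, (speedup, fps) in summary.items():
    print(f"{model}: {speedup:.1f}x faster with NNPACK, {fps:.2f} FPS")
```

On these numbers NNPACK with NEON gives roughly a 20x speedup for both models, yet even Tiny-YOLO stays below 1 FPS on the Raspberry Pi 3, far from the ~10 FPS discussed earlier in this thread.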

Also, here is how to use Yolo v2 on iOS (using Forge, a neural network toolkit for Metal): http://machinethink.net/blog/object-detection-with-yolo/

AlexeyAB / darknet

How to install Raspberry Pi 3? #38
