aylabs / JetsonNano

Hacking with the nVidia JetsonNano
GNU General Public License v3.0

Hello AI World #3

acs opened 5 years ago

acs commented 5 years ago

Some extra tutorials:

- https://github.com/dusty-nv/jetson-inference (the ones used by Two Days to a Demo)
- https://github.com/elloza/awesome-jetson-nano (collection of links)
- https://devtalk.nvidia.com/default/topic/1048642/jetson-nano/links-to-jetson-nano-resources-amp-wiki/

acs commented 5 years ago

The https://developer.nvidia.com/embedded/twodaystoademo#hello_ai_world is just a landing page. The real tutorials (Hello AI World and Two Days to a Demo) are directly in https://github.com/dusty-nv/jetson-inference. The README.md does not mention the Nano yet, but it is supported. Let's follow it. Time to fork the repository!

acs commented 5 years ago

Oops, to get the full potential of the GPU you must use TensorRT from nVidia, and it is not Open Source. https://devtalk.nvidia.com/default/topic/1029837/gpu-accelerated-libraries/tensorrt-source-code/post/5238639/#5238639

It seems that people train the models directly with TensorFlow and use TensorRT in the inference phase. But I need to explore the scenarios further.

The TensorRT Inference Server is Open Source, but it probably needs the TensorRT engine installed inside it.

acs commented 5 years ago

Follow https://github.com/aylabs/jetson-inference/blob/master/docs/building-repo.md to compile and install the samples.

Once they are installed, we can start using them to check the capabilities of the hardware.

acs commented 5 years ago

But we need to complete the real pipeline: using Keras + TensorFlow + TensorRT to create, train, deploy, and infer.
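As a sketch of the "create" and "train" stages of that pipeline (a toy Keras model on random data, purely illustrative; the real model would then be exported and converted to a TensorRT engine on the Nano for the "deploy" and "infer" stages):

```python
# Toy sketch of the "create" and "train" stages of the pipeline.
# The architecture and data are hypothetical stand-ins, just to show the flow.
import numpy as np
import tensorflow as tf

# Create: a tiny classifier.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Train: random toy data, one epoch, just to exercise the flow.
x = np.random.rand(32, 4).astype("float32")
y = np.random.randint(0, 3, size=32)
model.fit(x, y, epochs=1, verbose=0)

# Infer (on the host, for comparison): per-class probabilities.
probs = model.predict(x[:2], verbose=0)
print(probs.shape)  # (2, 3)
```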

acs commented 5 years ago
acs@nanai:~$ /home/acs/devel/jetson-inference/build/aarch64/bin/imagenet-console flores.jpg flores_dl.jpg
imagenet-console
  args (3):  0 [/home/acs/devel/jetson-inference/build/aarch64/bin/imagenet-console]  1 [flores.jpg]  2 [flores_dl.jpg]  

imageNet -- loading classification network model from:
         -- prototxt     networks/googlenet.prototxt
         -- model        networks/bvlc_googlenet.caffemodel
         -- class_labels networks/ilsvrc12_synset_words.txt
         -- input_blob   'data'
         -- output_blob  'prob'
         -- batch_size   2

[TRT]  TensorRT version 5.0.6
[TRT]  detected model format - caffe  (extension '.caffemodel')
[TRT]  desired precision specified for GPU: FASTEST
[TRT]  requested fasted precision for device GPU without providing valid calibrator, disabling INT8
[TRT]  native precisions detected for GPU:  FP32, FP16
[TRT]  selecting fastest native precision for GPU:  FP16
[TRT]  attempting to open engine cache file /home/acs/devel/jetson-inference/build/aarch64/bin/networks/bvlc_googlenet.caffemodel.2.1.GPU.FP16.engine
[TRT]  loading network profile from engine cache... /home/acs/devel/jetson-inference/build/aarch64/bin/networks/bvlc_googlenet.caffemodel.2.1.GPU.FP16.engine
[TRT]  device GPU, /home/acs/devel/jetson-inference/build/aarch64/bin/networks/bvlc_googlenet.caffemodel loaded
[TRT]  device GPU, CUDA engine context initialized with 2 bindings
[TRT]  binding -- index   0
               -- name    'data'
               -- type    FP32
               -- in/out  INPUT
               -- # dims  3
               -- dim #0  3 (CHANNEL)
               -- dim #1  224 (SPATIAL)
               -- dim #2  224 (SPATIAL)
[TRT]  binding -- index   1
               -- name    'prob'
               -- type    FP32
               -- in/out  OUTPUT
               -- # dims  3
               -- dim #0  1000 (CHANNEL)
               -- dim #1  1 (SPATIAL)
               -- dim #2  1 (SPATIAL)
[TRT]  binding to input 0 data  binding index:  0
[TRT]  binding to input 0 data  dims (b=2 c=3 h=224 w=224) size=1204224
[cuda]  cudaAllocMapped 1204224 bytes, CPU 0x100e30000 GPU 0x100e30000
[TRT]  binding to output 0 prob  binding index:  1
[TRT]  binding to output 0 prob  dims (b=2 c=1000 h=1 w=1) size=8000
[cuda]  cudaAllocMapped 8000 bytes, CPU 0x100f60000 GPU 0x100f60000
device GPU, /home/acs/devel/jetson-inference/build/aarch64/bin/networks/bvlc_googlenet.caffemodel initialized.
[TRT]  networks/bvlc_googlenet.caffemodel loaded
imageNet -- loaded 1000 class info entries
networks/bvlc_googlenet.caffemodel initialized.
loaded image  flores.jpg  (3968 x 2976)  188940288 bytes
[cuda]  cudaAllocMapped 188940288 bytes, CPU 0x101060000 GPU 0x101060000
[TRT]  layer conv1/7x7_s2 + conv1/relu_7x7 - 22.393385 ms
[TRT]  layer pool1/3x3_s2 - 3.889427 ms
[TRT]  layer pool1/norm1 input reformatter 0 - 0.635990 ms
[TRT]  layer pool1/norm1 - 0.127396 ms
[TRT]  layer conv2/3x3_reduce + conv2/relu_3x3_reduce - 0.282448 ms
[TRT]  layer conv2/3x3 + conv2/relu_3x3 - 4.519531 ms
[TRT]  layer conv2/norm2 - 0.348125 ms
[TRT]  layer pool2/3x3_s2 - 0.380104 ms
[TRT]  layer inception_3a/1x1 + inception_3a/relu_1x1 || inception_3a/3x3_reduce + inception_3a/relu_3x3_reduce || inception_3a/5x5_reduce + inception_3a/relu_5x5_reduce - 0.490521 ms
[TRT]  layer inception_3a/3x3 + inception_3a/relu_3x3 - 1.439844 ms
[TRT]  layer inception_3a/5x5 + inception_3a/relu_5x5 - 0.247604 ms
[TRT]  layer inception_3a/pool - 0.239583 ms
[TRT]  layer inception_3a/pool_proj + inception_3a/relu_pool_proj - 0.145313 ms
[TRT]  layer inception_3a/1x1 copy - 0.026666 ms
[TRT]  layer inception_3b/1x1 + inception_3b/relu_1x1 || inception_3b/3x3_reduce + inception_3b/relu_3x3_reduce || inception_3b/5x5_reduce + inception_3b/relu_5x5_reduce - 1.002917 ms
[TRT]  layer inception_3b/3x3 + inception_3b/relu_3x3 - 2.463698 ms
[TRT]  layer inception_3b/5x5 + inception_3b/relu_5x5 - 0.987083 ms
[TRT]  layer inception_3b/pool - 0.311667 ms
[TRT]  layer inception_3b/pool_proj + inception_3b/relu_pool_proj - 0.229479 ms
[TRT]  layer inception_3b/1x1 copy - 0.045208 ms
[TRT]  layer pool3/3x3_s2 - 0.260938 ms
[TRT]  layer inception_4a/1x1 + inception_4a/relu_1x1 || inception_4a/3x3_reduce + inception_4a/relu_3x3_reduce || inception_4a/5x5_reduce + inception_4a/relu_5x5_reduce - 0.781927 ms
[TRT]  layer inception_4a/3x3 + inception_4a/relu_3x3 - 0.750729 ms
[TRT]  layer inception_4a/5x5 + inception_4a/relu_5x5 - 0.134583 ms
[TRT]  layer inception_4a/pool - 0.124063 ms
[TRT]  layer inception_4a/pool_proj + inception_4a/relu_pool_proj - 0.141667 ms
[TRT]  layer inception_4a/1x1 copy - 0.022552 ms
[TRT]  layer inception_4b/1x1 + inception_4b/relu_1x1 || inception_4b/3x3_reduce + inception_4b/relu_3x3_reduce || inception_4b/5x5_reduce + inception_4b/relu_5x5_reduce - 0.817604 ms
[TRT]  layer inception_4b/3x3 + inception_4b/relu_3x3 - 0.867448 ms
[TRT]  layer inception_4b/5x5 + inception_4b/relu_5x5 - 0.180312 ms
[TRT]  layer inception_4b/pool - 0.211875 ms
[TRT]  layer inception_4b/pool_proj + inception_4b/relu_pool_proj - 0.151563 ms
[TRT]  layer inception_4b/1x1 copy - 0.019218 ms
[TRT]  layer inception_4c/1x1 + inception_4c/relu_1x1 || inception_4c/3x3_reduce + inception_4c/relu_3x3_reduce || inception_4c/5x5_reduce + inception_4c/relu_5x5_reduce - 0.581823 ms
[TRT]  layer inception_4c/3x3 + inception_4c/relu_3x3 - 0.939896 ms
[TRT]  layer inception_4c/5x5 + inception_4c/relu_5x5 - 0.180886 ms
[TRT]  layer inception_4c/pool - 0.177812 ms
[TRT]  layer inception_4c/pool_proj + inception_4c/relu_pool_proj - 0.154844 ms
[TRT]  layer inception_4c/1x1 copy - 0.017448 ms
[TRT]  layer inception_4d/1x1 + inception_4d/relu_1x1 || inception_4d/3x3_reduce + inception_4d/relu_3x3_reduce || inception_4d/5x5_reduce + inception_4d/relu_5x5_reduce - 0.562239 ms
[TRT]  layer inception_4d/3x3 + inception_4d/relu_3x3 - 1.396875 ms
[TRT]  layer inception_4d/5x5 + inception_4d/relu_5x5 - 0.215625 ms
[TRT]  layer inception_4d/pool - 0.131302 ms
[TRT]  layer inception_4d/pool_proj + inception_4d/relu_pool_proj - 0.146094 ms
[TRT]  layer inception_4d/1x1 copy - 0.016615 ms
[TRT]  layer inception_4e/1x1 + inception_4e/relu_1x1 || inception_4e/3x3_reduce + inception_4e/relu_3x3_reduce || inception_4e/5x5_reduce + inception_4e/relu_5x5_reduce - 1.376458 ms
[TRT]  layer inception_4e/3x3 + inception_4e/relu_3x3 - 0.862136 ms
[TRT]  layer inception_4e/5x5 + inception_4e/relu_5x5 - 0.407135 ms
[TRT]  layer inception_4e/pool - 0.143698 ms
[TRT]  layer inception_4e/pool_proj + inception_4e/relu_pool_proj - 0.257812 ms
[TRT]  layer inception_4e/1x1 copy - 0.027084 ms
[TRT]  layer pool4/3x3_s2 - 0.115937 ms
[TRT]  layer inception_5a/1x1 + inception_5a/relu_1x1 || inception_5a/3x3_reduce + inception_5a/relu_3x3_reduce || inception_5a/5x5_reduce + inception_5a/relu_5x5_reduce - 0.522500 ms
[TRT]  layer inception_5a/3x3 + inception_5a/relu_3x3 - 0.550625 ms
[TRT]  layer inception_5a/5x5 + inception_5a/relu_5x5 - 0.209323 ms
[TRT]  layer inception_5a/pool - 0.090000 ms
[TRT]  layer inception_5a/pool_proj + inception_5a/relu_pool_proj - 0.209011 ms
[TRT]  layer inception_5a/1x1 copy - 0.012083 ms
[TRT]  layer inception_5b/1x1 + inception_5b/relu_1x1 || inception_5b/3x3_reduce + inception_5b/relu_3x3_reduce || inception_5b/5x5_reduce + inception_5b/relu_5x5_reduce - 0.842448 ms
[TRT]  layer inception_5b/3x3 + inception_5b/relu_3x3 - 0.680260 ms
[TRT]  layer inception_5b/5x5 + inception_5b/relu_5x5 - 0.293386 ms
[TRT]  layer inception_5b/pool - 0.078437 ms
[TRT]  layer inception_5b/pool_proj + inception_5b/relu_pool_proj - 0.216146 ms
[TRT]  layer inception_5b/1x1 copy - 0.017135 ms
[TRT]  layer pool5/7x7_s1 - 0.056563 ms
[TRT]  layer loss3/classifier input reformatter 0 - 0.008542 ms
[TRT]  layer loss3/classifier - 0.299375 ms
[TRT]  layer prob - 0.024322 ms
[TRT]  layer prob output reformatter 0 - 0.012813 ms
[TRT]  layer network time - 56.505150 ms
class 0094 - 0.014771  (hummingbird)
class 0309 - 0.133179  (bee)
class 0310 - 0.030182  (ant, emmet, pismire)
class 0323 - 0.033936  (monarch, monarch butterfly, milkweed butterfly, Danaus plexippus)
class 0324 - 0.015900  (cabbage butterfly)
class 0396 - 0.016663  (lionfish)
class 0580 - 0.010429  (greenhouse, nursery, glasshouse)
class 0584 - 0.031647  (hair slide)
class 0645 - 0.148560  (maypole)
class 0716 - 0.122253  (picket fence, paling)
class 0723 - 0.122253  (pinwheel)
class 0738 - 0.034760  (pot, flowerpot)
class 0946 - 0.014999  (cardoon)
class 0985 - 0.011497  (daisy)
class 0988 - 0.014534  (acorn)
class 0989 - 0.035034  (hip, rose hip, rosehip)
class 0990 - 0.017807  (buckeye, horse chestnut, conker)
class 0991 - 0.012199  (coral fungus)
class 0995 - 0.028809  (earthstar)
imagenet-console:  'flores.jpg' -> 14.85596% class #645 (maypole)
loaded image  fontmapA.png  (256 x 512)  2097152 bytes
[cuda]  cudaAllocMapped 2097152 bytes, CPU 0x10c490000 GPU 0x10c490000
[cuda]  cudaAllocMapped 8192 bytes, CPU 0x100f62000 GPU 0x100f62000
imagenet-console:  attempting to save output image to 'flores_dl.jpg'
imagenet-console:  completed saving 'flores_dl.jpg'

shutting down...

(images: flores.jpg and the annotated flores_dl.jpg)

acs commented 5 years ago

It has detected the flowers as 14.856% "maypole" (a pole painted and decorated with flowers, around which people traditionally dance on May Day, holding long ribbons attached to the top of the pole).
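The reported result is just the arg-max over the per-class probabilities in the log above; a quick check with a few of the logged values (class ids and scores copied from the output):

```python
# Top-1 from a subset of the class probabilities printed by imagenet-console.
probs = {
    94:  0.014771,  # hummingbird
    309: 0.133179,  # bee
    645: 0.148560,  # maypole
    716: 0.122253,  # picket fence, paling
    723: 0.122253,  # pinwheel
    989: 0.035034,  # hip, rose hip, rosehip
}
top = max(probs, key=probs.get)
print(top, probs[top])  # 645 0.14856
```

The final line of the log reports 14.85596% rather than exactly 14.8560%, presumably because the sample averages over its batch of 2.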

(Screenshot from 2019-05-16 23-47-30)

Not so bad :)

acs commented 5 years ago

Playing with the my-recognition sample, my first wow moment.

This is a photo I shot yesterday:

photo5983041223633514489

acs@nanai:~/devel/JetsonNano/examples/my-recognition$ ./my-recognition mina.jpg 
...
image is recognized as 'wire-haired fox terrier' (class #188) with 26.831055% confidence

OMG, it has correctly recognized that there is a dog, and even the breed of the dog (mostly right, because Mina is a mix). The program needed 4 s to complete the recognition.

acs commented 5 years ago

And what about my fish?

photo5983041223633514498

image is recognized as 'goldfish, Carassius auratus' (class #1) with 80.615234% confidence