Not getting GPU support running on Jetson Xavier AGX

tomcatxx commented 4 years ago

Hi, I installed everything as described and followed this guide: https://www.pyimagesearch.com/2020/02/03/how-to-use-opencvs-dnn-module-with-nvidia-gpus-cuda-and-cudnn/ to install OpenCV. All of its examples work with GPU support after this. I could even get older YOLO examples running with GPU support after changing the code as described. The part:...

net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)

print(cv2.getBuildInformation())

General configuration for OpenCV 4.4.0 =====================================
  Version control:               unknown

  Extra modules:
    Location (extra):            /xavier_ssd/opencv_contrib/modules
    Version control (extra):     unknown

  Platform:
    Timestamp:                   2020-08-28T16:39:08Z
    Host:                        Linux 4.9.140-tegra aarch64
    CMake:                       3.10.2
    CMake generator:             Unix Makefiles
    CMake build tool:            /usr/bin/make
    Configuration:               RELEASE

  CPU/HW features:
    Baseline:                    NEON FP16
      required:                  NEON
      disabled:                  VFPV3

  C/C++:
    Built as dynamic libs?:      YES
    C++ standard:                11
    C++ Compiler:                /usr/bin/c++  (ver 7.5.0)
    C++ flags (Release):         -fsigned-char -ffast-math -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Winit-self -Wsuggest-override -Wno-delete-non-virtual-dtor -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections    -fvisibility=hidden -fvisibility-inlines-hidden -O3 -DNDEBUG  -DNDEBUG
    C++ flags (Debug):           -fsigned-char -ffast-math -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Winit-self -Wsuggest-override -Wno-delete-non-virtual-dtor -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections    -fvisibility=hidden -fvisibility-inlines-hidden -g  -O0 -DDEBUG -D_DEBUG
    C Compiler:                  /usr/bin/cc
    C flags (Release):           -fsigned-char -ffast-math -W -Wall -Werror=return-type -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Winit-self -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections    -fvisibility=hidden -O3 -DNDEBUG  -DNDEBUG
    C flags (Debug):             -fsigned-char -ffast-math -W -Wall -Werror=return-type -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wuninitialized -Winit-self -Wno-comment -Wimplicit-fallthrough=3 -Wno-strict-overflow -fdiagnostics-show-option -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections    -fvisibility=hidden -g  -O0 -DDEBUG -D_DEBUG
    Linker flags (Release):      -Wl,--gc-sections -Wl,--as-needed
    Linker flags (Debug):        -Wl,--gc-sections -Wl,--as-needed
    ccache:                      NO
    Precompiled headers:         NO
    Extra dependencies:          m pthread cudart_static -lpthread dl rt nppc nppial nppicc nppicom nppidei nppif nppig nppim nppist nppisu nppitc npps cublas cudnn cufft -L/usr/local/cuda-10.2/lib64 -L/usr/lib/aarch64-linux-gnu
    3rdparty dependencies:

  OpenCV modules:
    To be built:                 alphamat aruco bgsegm bioinspired calib3d ccalib core cudaarithm cudabgsegm cudacodec cudafeatures2d cudafilters cudaimgproc cudalegacy cudaobjdetect cudaoptflow cudastereo cudawarping cudev datasets dnn dnn_objdetect dnn_superres dpm face features2d flann freetype fuzzy gapi hfs highgui img_hash imgcodecs imgproc intensity_transform line_descriptor ml objdetect optflow phase_unwrapping photo plot python3 quality rapid reg rgbd saliency shape stereo stitching structured_light superres surface_matching text tracking ts video videoio videostab xfeatures2d ximgproc xobjdetect xphoto
    Disabled:                    world
    Disabled by dependency:      -
    Unavailable:                 cnn_3dobj cvv hdf java js julia matlab ovis python2 sfm viz
    Applications:                tests perf_tests examples apps
    Documentation:               NO
    Non-free algorithms:         YES

  GUI:
    GTK+:                        YES (ver 3.22.30)
      GThread :                  YES (ver 2.56.4)
      GtkGlExt:                  NO
    VTK support:                 NO

  Media I/O:
    ZLib:                        /usr/lib/aarch64-linux-gnu/libz.so (ver 1.2.11)
    JPEG:                        /usr/lib/aarch64-linux-gnu/libjpeg.so (ver 80)
    WEBP:                        build (ver encoder: 0x020f)
    PNG:                         /usr/lib/aarch64-linux-gnu/libpng.so (ver 1.6.34)
    TIFF:                        /usr/lib/aarch64-linux-gnu/libtiff.so (ver 42 / 4.0.9)
    JPEG 2000:                   build Jasper (ver 1.900.1)
    OpenEXR:                     build (ver 2.3.0)
    HDR:                         YES
    SUNRASTER:                   YES
    PXM:                         YES
    PFM:                         YES

  Video I/O:
    DC1394:                      YES (2.2.5)
    FFMPEG:                      YES
      avcodec:                   YES (57.107.100)
      avformat:                  YES (57.83.100)
      avutil:                    YES (55.78.100)
      swscale:                   YES (4.8.100)
      avresample:                NO
    GStreamer:                   YES (1.14.5)
    v4l/v4l2:                    YES (linux/videodev2.h)

  Parallel framework:            pthreads

  Trace:                         YES (with Intel ITT)

  Other third-party libraries:
    Lapack:                      NO
    Eigen:                       YES (ver 3.3.4)
    Custom HAL:                  YES (carotene (ver 0.0.1))
    Protobuf:                    build (3.5.1)

  NVIDIA CUDA:                   YES (ver 10.2, CUFFT CUBLAS FAST_MATH)
    NVIDIA GPU arch:             53 62 72
    NVIDIA PTX archs:

  cuDNN:                         YES (ver 8.0.0)

  OpenCL:                        YES (no extra features)
    Include path:                /xavier_ssd/opencv/3rdparty/include/opencl/1.2
    Link libraries:              Dynamic load

  Python 3:
    Interpreter:                 /usr/bin/python3 (ver 3.6.9)
    Libraries:                   /usr/lib/aarch64-linux-gnu/libpython3.6m.so (ver 3.6.9)
    numpy:                       /usr/lib/python3/dist-packages/numpy/core/include (ver 1.13.3)
    install path:                lib/python3.6/dist-packages/cv2/python-3.6

  Python (for build):            /usr/bin/python3

  Java:
    ant:                         NO
    JNI:                         NO
    Java wrappers:               NO
    Java tests:                  NO

  Install to:                    /usr/local
-----------------------------------------------------------------
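
Only a few lines of this dump matter for the DNN CUDA backend, and they can be pulled out of the text programmatically. This is a minimal sketch, assuming the layout printed above; `cuda_summary` is a hypothetical helper name, not part of OpenCV.

```python
import re

def cuda_summary(build_info: str) -> dict:
    """Extract the CUDA-related entries from cv2.getBuildInformation() text."""
    wanted = ("NVIDIA CUDA", "NVIDIA GPU arch", "cuDNN")
    summary = {}
    for line in build_info.splitlines():
        # Entries look like "  NVIDIA CUDA:     YES (ver 10.2, ...)".
        match = re.match(r"\s*([^:]+):\s*(.*)$", line)
        if match and match.group(1).strip() in wanted:
            summary[match.group(1).strip()] = match.group(2).strip()
    return summary

# Sample copied from the dump above; on the device you would pass
# cv2.getBuildInformation() instead.
sample = """\
  NVIDIA CUDA:                   YES (ver 10.2, CUFFT CUBLAS FAST_MATH)
    NVIDIA GPU arch:             53 62 72
    NVIDIA PTX archs:

  cuDNN:                         YES (ver 8.0.0)
"""
print(cuda_summary(sample))
```

The build above reports both CUDA and cuDNN as enabled, so the OpenCV build itself does not look like the problem.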

However, mlapi always starts with CPU support only:

Aug 28 2020 20:29:44.544842 [INF] Using simple log output (default)
Aug 28 2020 20:29:44.545016 [DBG 1] Initializing face recognition with model:hog upsample:1, jitters:0
Aug 28 2020 20:29:44.545139 [DBG 1] trained file not found, reading from images and doing training...
Aug 28 2020 20:29:44.545189 [DBG 1] If you are using a GPU and run out of memory, do the training using zm_train_faces.py. In this case, other models like yolo may already take up a lot of GPU memory
Aug 28 2020 20:29:44.545260 [INF] Using simple log output (default)
Aug 28 2020 20:29:44.545339 [DBG 1] Face Recognition library load time took: 0.003 milliseconds
Aug 28 2020 20:29:44.545485 [ERR] No known faces found to train, encoding file not created
Aug 28 2020 20:29:44.545549 [DBG 1] Face Recognition training took: 0.151 milliseconds
Aug 28 2020 20:29:44.545633 [ERR] Error loading KNN model: [Errno 2] No such file or directory: './known_faces/faces.dat'
Aug 28 2020 20:29:44.545722 [INF] Using simple log output (default)
Aug 28 2020 20:29:44.546516 [INF] Using simple log output (default)
Aug 28 2020 20:29:44.546611 [DBG 1] Using CPU for detection
Aug 28 2020 20:29:44.546660 [DBG 1] Initializing Yolo
Aug 28 2020 20:29:44.546704 [DBG 2] config:./models/yolov3/yolov3.cfg, weights:./models/yolov3/yolov3.weights
Aug 28 2020 20:29:44.547058 [DBG 2] Semaphore: max:1, name:pyzm_cpu_lock, timeout:120
Aug 28 2020 20:29:44.547121 [DBG 1] Waiting for cpu lock...
Aug 28 2020 20:29:44.547501 [DBG 1] Got cpu lock for initialization...
Aug 28 2020 20:29:44.721449 [DBG 1] init lock released
Aug 28 2020 20:29:44.722410 [DBG 1] YOLO initialization (loading model from disk) took: 174.803 milliseconds
Aug 28 2020 20:29:44.722548 [INF] Using simple log output (default)
Aug 28 2020 20:29:44.722660 [INF] Using simple log output (default)
Aug 28 2020 20:29:44.722730 [DBG 1] PlateRecognizer ALPR initialized with url: https://api.platerecognizer.com/v1
INFO: --------| mlapi version:2.0.0 |--------
INFO: Starting server with max:1 processes
 * Serving Flask app "mlapi" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)

Any suggestions on what I could do to enable GPU support?

pliablepixels commented 4 years ago

Your logs show mlapi is set to CPU mode. Check your mlapiconfig.ini - what is object_processor set to?
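
A quick way to answer that from a script is to scan the config for the key. This is only a sketch: it assumes `mlapiconfig.ini` is a standard INI file readable by Python's `configparser`, and it searches every section because the section name is not given in this thread. `find_option` is a hypothetical helper, and the `[object]` section in the sample is an assumption.

```python
import configparser

def find_option(ini_text: str, key: str) -> dict:
    """Return {section: value} for every section in which `key` is set."""
    # interpolation=None keeps '%' or template markers in values from raising.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    found = {}
    for section in parser.sections():
        if parser.has_option(section, key):
            found[section] = parser.get(section, key)
    return found

# Hypothetical config fragment; the section name is an assumption.
sample = """
[object]
object_processor = cpu
"""
print(find_option(sample, "object_processor"))
```

For a real file, read its contents first and pass the text in. A value of `cpu` would match the `Using CPU for detection` line in the log above.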

tomcatxx commented 4 years ago

Shame on me, my eyes are fucked up. I have been playing around with object detection and so on for 3 days. After setting object_processor to gpu it works. Thank you very much for the hint.

Here is the performance, for your interest, measured with the example video and stream.py:

Aug 28 2020 23:23:53.725127 [DBG 1] |---------- YOLO (input image: 800w*450h, resized to: 416w*416h) ----------|
Aug 28 2020 23:23:53.725346 [DBG 1] Waiting for gpu detection lock...
Aug 28 2020 23:23:53.726335 [DBG 1] Got gpu lock for detection
Aug 28 2020 23:23:53.811529 [DBG 1] detect lock released
Aug 28 2020 23:23:53.812905 [DBG 1] YOLO detection took: 86.383 milliseconds
Aug 28 2020 23:23:54.202768 [DBG 1] YOLO NMS filtering took: 3.078 milliseconds
Aug 28 2020 23:23:54.205099 [DBG 2] core model detection over, got 8 objects. Now filtering
Aug 28 2020 23:23:54.205332 [DBG 3] Max object size found to be: 100%
Aug 28 2020 23:23:54.205499 [DBG 2] Converted 100% to 360000.0
Aug 28 2020 23:23:54.205657 [DBG 1] Ignoring person [242, 143, 270, 219] as conf. level 0.20213405787944794 is lower than 0.
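
To compare CPU and GPU runs across many frames, the `took: ... milliseconds` entries can be pulled out of the debug log. This is a small sketch, assuming the log format shown above; `extract_timings` is a hypothetical helper, not part of mlapi.

```python
import re

# Matches e.g. "[DBG 1] YOLO detection took: 86.383 milliseconds".
TIMING = re.compile(r"\]\s+(?P<step>.+?) took: (?P<ms>\d+(?:\.\d+)?) milliseconds")

def extract_timings(log_text: str) -> list:
    """Return (step, milliseconds) pairs in the order they appear."""
    return [(m.group("step"), float(m.group("ms"))) for m in TIMING.finditer(log_text)]

# Two lines copied from the log above.
log = """\
Aug 28 2020 23:23:53.812905 [DBG 1] YOLO detection took: 86.383 milliseconds
Aug 28 2020 23:23:54.202768 [DBG 1] YOLO NMS filtering took: 3.078 milliseconds
"""
for step, ms in extract_timings(log):
    print(f"{step}: {ms} ms")
```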

(screenshot) The GPU is nearly sleeping :)

Btw, we should find a way to put the detection in front of ZoneMinder rather than after an event happens. Even the cheap Jetson Nano has great potential for this. For example, push frames for detection directly from ZoneMinder to mlapi and trigger recording based on the result. A dirty workaround: Home Assistant takes a picture every second -> pushes it to mlapi -> triggers recording based on the result.

Or just think about DeepStream -> AMQP broker -> (Home Assistant / some other script ...) trigger ZoneMinder (my favorite idea)
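
The polling workaround (take a picture every second, run detection, trigger recording on a match) can be sketched independently of any particular API. Everything below is hypothetical: `grab_frame`, `detect` and `trigger` are placeholders you would have to implement yourself (for example against a camera snapshot, mlapi, and ZoneMinder), and the label set and interval are illustrative assumptions.

```python
import time
from typing import Callable, Iterable

def watch(grab_frame: Callable[[], bytes],
          detect: Callable[[bytes], Iterable[str]],
          trigger: Callable[[set], None],
          wanted: set,
          interval: float = 1.0,
          max_frames: int = 10) -> int:
    """Poll frames, run detection, and call `trigger` on matching labels.

    Returns how many times `trigger` was called.
    """
    fired = 0
    for i in range(max_frames):
        labels = set(detect(grab_frame()))
        hits = labels & wanted
        if hits:
            trigger(hits)
            fired += 1
        if i < max_frames - 1:
            time.sleep(interval)  # wait before grabbing the next frame
    return fired

# Stub wiring to show the flow; replace each callable with real I/O.
frames = iter([b"empty", b"person", b"empty"])
events = []
count = watch(
    grab_frame=lambda: next(frames),
    detect=lambda frame: ["person"] if frame == b"person" else [],
    trigger=events.append,
    wanted={"person", "car"},
    interval=0.0,
    max_frames=3,
)
print(count, events)
```

The bounded `max_frames` keeps the sketch easy to test; a real service would loop until stopped and handle network errors.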

pliablepixels commented 3 years ago

Closing as the core issue is resolved.

ZoneMinder / mlapi #18