gbr1 / ros_openvino

A ROS package to wrap openvino inference engine and get it working with Myriad and GPU
https://gbr1.github.io
GNU Affero General Public License v3.0

unable to include AlexNet #5

Open robbieKay opened 4 years ago

robbieKay commented 4 years ago

I tried to change the CNN that is being used for object recognition. I am using the Intel RealSense Camera and the VPU NCS2 on Ubuntu 16.04 (LTS). The original, unmodified package works fine.

Steps taken:

  1. Use the unmodified ros_openvino package
  2. Download bvlc_alexnet from the Intel website
  3. Run the Model Optimizer with the given parameters (here) to generate the IR
  4. Place the IR in the models folder
  5. Change the file params so that they match the folder I created
  6. Start the modified package

Issue:

Terminal output as follows:

[object_detection-4] process has died [pid 20105, exit code 255, cmd /home/USERNAME/catkin_ws/devel/lib/ros_openvino/object_detection /object_detection/input_image:=/camera/color/image_raw /object_detection/input_depth:=/camera/aligned_depth_to_color/image_raw /object_detection/camera_info:=/camera/aligned_depth_to_color/camera_info __name:=object_detection __log:=/home/USERNAME/.ros/log/da83b362-b530-11ea-87a9-e454e8a1df6c/object_detection-4.log].

No output image with bounding boxes is visible in RVIZ. However, the depth image (without boxes) is visible in RVIZ.

All the best, Robert

gbr1 commented 4 years ago

It seems related to the size of the image, check here: learnopencv alexnet

robbieKay commented 4 years ago

Sounds interesting! So I checked the terminal output when launching the (untouched) package and found that the images coming from the sensor are 640x480 pixels. As far as I understand, and if I am not mistaken, mobilenet-ssd needs 300x300 pixels. But you are not performing a resize operation in the code, right?

Terminal output:

[ INFO] [1593091443.892518914]: depth stream is enabled - width: 640, height: 480, fps: 30, Format: Z16
[ INFO] [1593091443.893364021]: infra1 stream is enabled - width: 640, height: 480, fps: 30, Format: Y8
[ INFO] [1593091443.894154060]: infra2 stream is enabled - width: 640, height: 480, fps: 30, Format: Y8
25/06 15:24:03,895 WARNING [140117586990848] (backend-v4l2.cpp:1208) Pixel format 36315752-1a66-a242-9065-d01814a likely requires patch for fourcc code RW16!
[ INFO] [1593091443.908045247]: color stream is enabled - width: 640, height: 480, fps: 30, Format: RGB8

robbieKay commented 4 years ago

Small addition: disabling depth_analysis does not affect the outcome, I get the same error message as mentioned in the first entry of this issue

gbr1 commented 4 years ago

Ok, I have probably found the actual issues: 1) alexnet requires a 227x227 input, so a crop and resize is needed; 2) the output format differs between mobilenet and alexnet, so the output handling must be changed.
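To make point 1 concrete, here is a minimal sketch of a center crop followed by a resize to 227x227, written in plain Python so it runs without any dependencies. It is not code from ros_openvino: the function names `center_crop_box` and `crop_and_resize` are hypothetical, the image is modelled as a list of rows, and nearest-neighbour sampling is assumed only to keep the example short. In the node itself an image library would normally do this work.

```python
def center_crop_box(width, height):
    """Return (left, top, side) of the largest centered square in the frame."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, side


def crop_and_resize(image, out_size=227):
    """Center-crop a row-major image (list of rows) to a square, then
    resize it to out_size x out_size with nearest-neighbour sampling."""
    height = len(image)
    width = len(image[0])
    left, top, side = center_crop_box(width, height)
    resized = []
    for y in range(out_size):
        # Map the output row back to a source row inside the crop.
        src_row = image[top + (y * side) // out_size]
        resized.append(
            [src_row[left + (x * side) // out_size] for x in range(out_size)]
        )
    return resized


# Stand-in for a 640x480 camera frame: each "pixel" stores its own (x, y).
frame = [[(x, y) for x in range(640)] for y in range(480)]
patch = crop_and_resize(frame)

print(center_crop_box(640, 480))   # (80, 0, 480)
print(len(patch[0]), len(patch))   # 227 227
```

For a 640x480 frame the crop keeps the central 480x480 region, so 80 columns are discarded on each side before scaling down.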

For object detection and depth analysis, mobilenet is the better choice. I think that creating a new node, simply called classifier (it should also be compatible with ImageNet and GoogleNet), could be the easiest solution. I suppose that depth analysis will not work correctly with a small frame.
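The difference in output handling that such a classifier node would have to deal with can be sketched as follows. This is an illustration under assumptions, not the package's code: it assumes the SSD detector emits records of 7 values (`image_id, label, confidence, x_min, y_min, x_max, y_max`, with a negative `image_id` marking the end of valid records), and that the classifier emits one score per class with no boxes at all. The names `parse_detections` and `top_k`, and the sample numbers, are made up for the example.

```python
def parse_detections(flat, threshold=0.5):
    """Decode an SSD-style output given as a flat list of 7-value records:
    [image_id, label, confidence, x_min, y_min, x_max, y_max]."""
    boxes = []
    for i in range(0, len(flat) - 6, 7):
        image_id, label, conf, x0, y0, x1, y1 = flat[i:i + 7]
        if image_id < 0:  # assumed end-of-detections marker
            break
        if conf >= threshold:
            boxes.append((int(label), conf, (x0, y0, x1, y1)))
    return boxes


def top_k(scores, k=5):
    """Decode a classifier output: one score per class, no bounding boxes."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in ranked[:k]]


# Made-up detector output: one confident box, one weak box, then the end marker.
ssd_out = [
    0, 15, 0.9, 0.1, 0.2, 0.5, 0.8,
    0, 7, 0.3, 0.0, 0.0, 0.1, 0.1,
    -1, 0, 0, 0, 0, 0, 0,
]
# Made-up classifier output: 1000 class scores with a single clear winner.
alex_out = [0.0] * 1000
alex_out[285] = 0.7

print(parse_detections(ssd_out))  # [(15, 0.9, (0.1, 0.2, 0.5, 0.8))]
print(top_k(alex_out, k=1))       # [(285, 0.7)]
```

Because the classifier result carries no coordinates, there is nothing to draw as a bounding box or to look up in the depth image, which is consistent with keeping classification in a separate node.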

gbr1 commented 4 years ago

Here are two examples that show the differences: mobilenet & alexnet