kikass13 opened 5 months ago
I tested the following:
```
ros2 launch depthai_examples tracker_yolov4_spatial_node.launch.py
ros2 launch depthai_examples yolov4_publisher.launch.py spatial_camera:=true
```
and the results look fine as well. It's only with the camera.cpp generic pipeline that the results are bad.
Hi, thanks for the report, could you try testing with the following parameters:
```yaml
nn:
  i_disable_resize: false
rgb:
  i_preview_size: 416
```
@Serafadam thanks for your reply. I'm using this config:
```yaml
/oak:
  ros__parameters:
    nn:
      ### set via launch file (I don't want to hardcode the path here)
      i_nn_config_path: PLACEHOLDER_PATH_TO_CONFIG_JSON_WHICH_WILL_BE_REPLACED_BY_LAUNCH_CONFIG
      i_enable_passthrough: true
      i_enable_passthrough_depth: true
      i_disable_resize: false
    camera:
      i_nn_type: spatial
      i_pipeline_dump: true
      i_enable_ir: true
    rgb:
      i_fps: 10.0
      i_resolution: 720P
      i_preview_size: 416
    stereo:
      i_align_depth: true
      i_height: 320
      i_width: 320
      # i_stereo_conf_threshold: 40
      i_stereo_conf_threshold: 200
      i_subpixel: true
      i_depth_preset: HIGH_DENSITY ### prefers density over accuracy: fewer invalid depth values, but more outliers
      i_lr_check: true ### Left-Right Check (LR-Check) removes disparity pixels computed incorrectly due to occlusions at object borders (the left and right camera views differ slightly)
      i_lrc_threshold: 10
      i_fps: 10.0
      i_align_depth: true
      ### added filters
      i_enable_decimation_filter: true
      i_decimation_filter_decimation_mode: NON_ZERO_MEDIAN ### "PIXEL_SKIPPING", "NON_ZERO_MEDIAN", "NON_ZERO_MEAN"
      i_decimation_filter_decimation_factor: 4 ### default 1, max 4
      i_enable_spatial_filter: true
      i_spatial_filter_hole_filling_radius: 2
      i_spatial_filter_alpha: 0.5
      i_spatial_filter_delta: 20
      i_spatial_filter_iterations: 1
      i_enable_threshold_filter: true
      i_threshold_filter_min_range: 400
      i_threshold_filter_max_range: 10000
      i_enable_speckle_filter: true
      i_speckle_filter_speckle_range: 50
    left:
      i_publish_topic: false
      i_fps: 10.0
    right:
      i_publish_topic: false
      i_fps: 10.0
```
Setting `i_disable_resize: false` and `i_preview_size: 416` did not work; the resulting spatial info is still bad.
I have rewritten one of the examples (depthai_examples/yolov4_spatial_publisher.cpp) and hardcoded nearly all of the YAML parameters from my config above 1:1 into the C++ pipeline. The resulting node works fine (the spatial information is correct). That's why I assume that I have either configured something wrong, or that something in the pipeline is not created correctly inside the camera.cpp driver.
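For reference, the essential wiring of such a manually built spatial pipeline looks roughly like this (a minimal Python sketch in the style of spatial_tiny_yolo.py rather than my actual C++ code; the blob path is a placeholder, and the YOLO decoding values are the ones from the official example):
```python
import depthai as dai

pipeline = dai.Pipeline()

# Color camera: the 416x416 preview feeds the detection network directly,
# so no extra resize happens in between
cam_rgb = pipeline.create(dai.node.ColorCamera)
cam_rgb.setPreviewSize(416, 416)
cam_rgb.setInterleaved(False)

# Mono cameras + stereo node produce the depth used for spatial coordinates
mono_left = pipeline.create(dai.node.MonoCamera)
mono_right = pipeline.create(dai.node.MonoCamera)
mono_left.setBoardSocket(dai.CameraBoardSocket.LEFT)
mono_right.setBoardSocket(dai.CameraBoardSocket.RIGHT)

stereo = pipeline.create(dai.node.StereoDepth)
stereo.setDefaultProfilePreset(dai.node.StereoDepth.PresetMode.HIGH_DENSITY)
# Align depth to the RGB camera so bounding boxes match the depth frame
stereo.setDepthAlign(dai.CameraBoardSocket.RGB)

# Spatial detection network: combines NN results with the aligned depth
nn = pipeline.create(dai.node.YoloSpatialDetectionNetwork)
nn.setBlobPath("path/to/yolov4_tiny.blob")  # placeholder
nn.setConfidenceThreshold(0.5)
nn.setBoundingBoxScaleFactor(0.5)
nn.setDepthLowerThreshold(100)   # mm
nn.setDepthUpperThreshold(5000)  # mm
# YOLO decoding parameters, as in spatial_tiny_yolo.py
nn.setNumClasses(80)
nn.setCoordinateSize(4)
nn.setAnchors([10, 14, 23, 27, 37, 58, 81, 82, 135, 169, 344, 319])
nn.setAnchorMasks({"side26": [1, 2, 3], "side13": [3, 4, 5]})
nn.setIouThreshold(0.5)

mono_left.out.link(stereo.left)
mono_right.out.link(stereo.right)
cam_rgb.preview.link(nn.input)
stereo.depth.link(nn.inputDepth)
```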
Hi, I'm also observing this (with the latest build of this repo, commit 9132443, an Oak-D-Lite, and Humble). I noticed the pose values sometimes look correct, but more often they are all over the place, as @kikass13 describes. I started digging in and noticed that geometry_msgs/Point.msg uses float64 while your Point3f uses float32. I think this might be the culprit and explain the weird behavior, but maybe I'm wrong.
Hi, are the results the same when running through bare C++/Python code?
@Serafadam My mistake, it surely wasn't the float issue. The Python examples work fine (I wrote my own, much simpler Python ROS wrapper to test that properly), and I traced this down to two issues:
Using this config with Oak-D Lite and the default Yolo v4:
```yaml
camera:
  i_enable_imu: True
  i_enable_sync: False
  i_nn_type: spatial
  i_pipeline_type: RGBD
rgb:
  i_synced: False
  i_low_bandwidth: True
  i_low_bandwidth_profile: 1
  i_low_bandwidth_quality: 100
  i_publish_topic: True
  i_publish_compressed: True
  i_enable_preview: True
  i_preview_size: 416
  i_preview_width: 416
  i_fps: 30.0
  i_enable_spatial_nn: True
stereo:
  i_synced: False
  i_subpixel: True
  i_publish_topic: True
  i_publish_compressed: False
  i_enable_preview: False
  i_enable_feature_tracker: False
  i_fps: 30.0
nn:
  i_disable_resize: True # MUST BE TRUE!
  i_nn_config_path: depthai_ros_driver/yolo
```
Launching with:
```
ros2 launch depthai_ros_driver camera.launch.py params_file:=/ws/config/rgb-depth-yolo.yaml camera_model:=OAK-D-LITE cam_pos_z:=1.15 rectify_rgb:=False
```
See https://www.youtube.com/watch?v=MP_fEaKpsgI (the pose is flipped upside down in both cases).
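One thing that might be worth ruling out for the flipped pose is an optical-frame vs. body-frame mix-up: in a camera optical frame x points right and y points down, while the ROS body-frame convention is x forward, y left, z up. A sketch of that standard conversion (my own illustration of REP 103, not necessarily what the driver does internally):
```python
def optical_to_body(x: float, y: float, z: float) -> tuple:
    """Convert a point from a camera optical frame (x right, y down, z forward)
    to the ROS body-frame convention (x forward, y left, z up), per REP 103."""
    return (z, -x, -y)
```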
Hello,
I'm trying to use tiny-YOLOv4 with spatial information via the camera.cpp ROS node (launched via camera.launch.py). The model runs and the inference produces correct classifications, but the spatial information is way off: I get -3.0 to 3.0 meters on all axes (x, y, z) for pose.position while it detects a human (myself) sitting directly in front of the camera.
Position log while I'm sitting ~50 cm in front of the camera:
```
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-0.0, y=0.0, z=0.0)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-0.0, y=0.0, z=0.0)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-4.74098539352417, y=3.4557785987854004, z=8.550938606262207)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-1.1852463483810425, y=0.8710846900939941, z=2.1377346515655518)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-3.7184200286865234, y=2.7328147888183594, z=6.706618309020996)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-0.9341843128204346, y=0.6809415817260742, z=1.684913992881775)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-3.1088430881500244, y=2.2848124504089355, z=5.607172966003418)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-3.0101494789123535, y=2.2122786045074463, z=5.429166793823242)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-1.1424061059951782, y=0.8395997285842896, z=2.060467004776001)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-4.122596263885498, y=3.054694890975952, z=7.435598850250244)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-1.7084633111953735, y=1.2556174993515015, z=3.0814192295074463)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-2.873324155807495, y=2.111720561981201, z=5.182386875152588)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-0.9825876355171204, y=0.72214275598526, z=1.7722152471542358)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-0.0, y=0.0, z=0.0)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-2.370492696762085, y=1.7564494609832764, z=4.2754693031311035)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-5.3529887199401855, y=3.9821012020111084, z=9.772500991821289)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-4.4880242347717285, y=3.318418264389038, z=8.14375114440918)
[gesture_detection_inference_on_oakd-2] geometry_msgs.msg.Point(x=-2.341932535171509, y=1.7278892993927002, z=4.2754693031311035)
```
The resulting pose is also very noisy, so I suspect something is wrong with it.
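The log above comes from a small subscriber on the detection topic; a minimal sketch of such a node (assuming depthai_ros_msgs provides SpatialDetectionArray with a detections[].position field, which is what the driver publishes on /oak/nn/spatial_detections):
```python
import rclpy
from rclpy.node import Node
from depthai_ros_msgs.msg import SpatialDetectionArray


class SpatialLogger(Node):
    def __init__(self):
        super().__init__('spatial_logger')
        self.sub = self.create_subscription(
            SpatialDetectionArray, '/oak/nn/spatial_detections',
            self.callback, 10)

    def callback(self, msg):
        for detection in msg.detections:
            # detection.position is a geometry_msgs/Point (meters)
            self.get_logger().info(str(detection.position))


def main():
    rclpy.init()
    rclpy.spin(SpatialLogger())


if __name__ == '__main__':
    main()
```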
I have a working solution with this example:
Here, the pipeline is created manually (rather than by the generic ROS driver pipeline), and it works rather well. The same model outputs reasonable xyz coordinates (in mm, since they are taken directly from the device output) for the detections.
Minimal Reproducible Example
Start the driver (camera.launch.py with the config above) and watch the output of
```
ros2 topic echo /oak/nn/spatial_detections
```
while detecting something with the camera.

Expected behavior
I would expect outputs like in this example. I ran the example like this:
```
python3 spatial_tiny_yolo.py
```
Position log (x y z) while I'm sitting ~50 cm in front of the camera:
```
0.1027177734375 0.04964692306518555 0.4928494567871094
0.09987322998046876 -0.17121124267578125 0.3734034118652344
-0.10212914276123047 0.017021522521972657 0.4900251159667969
0.1694300994873047 -0.2740165710449219 0.6021787719726562
-0.09702268981933594 0.015319369316101073 0.4900251159667969
0.09479540252685546 -0.16431202697753905 0.36386972045898436
-0.09282048797607421 0.028689970016479494 0.4858487548828125
0.10459590911865234 -0.16762165832519532 0.386046875
-0.04489921951293945 0.01726892852783203 0.4971475830078125
-0.07535162353515625 0.06658980560302734 0.5044801330566406
-0.043298191070556644 0.022515058517456055 0.49859698486328125
-0.04516178512573242 0.017369916915893555 0.5000548706054687
-0.04006796646118164 0.020905023574829103 0.5015213012695312
-0.02803781509399414 0.02979017448425293 0.5044801330566406
-0.03023613929748535 0.037350521087646485 0.5120322265625
-0.0316358642578125 0.03515095520019531 0.5059726867675781
-0.032600372314453126 0.03260036849975586 0.521398681640625
-0.03023613929748535 0.030236135482788085 0.5120322265625
-0.03032693862915039 0.026759061813354492 0.5135698852539062
-0.02767319107055664 0.025828311920166016 0.5311141967773437
-0.025748346328735353 0.025748346328735353 0.5294698486328125
```
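For context on units: the example reads detection.spatialCoordinates, which depthai reports in millimeters, so the meter values above imply a division by 1000. The host-side read loop looks roughly like this (a sketch; the queue name and XLinkOut wiring are assumptions):
```python
import depthai as dai

pipeline = dai.Pipeline()
# ... nodes created and linked as in the sketch earlier in this thread,
# plus an XLinkOut named "detections" linked from nn.out

with dai.Device(pipeline) as device:
    q = device.getOutputQueue(name="detections", maxSize=4, blocking=False)
    while True:
        for d in q.get().detections:
            # spatialCoordinates are reported in millimeters
            print(d.spatialCoordinates.x / 1000.0,
                  d.spatialCoordinates.y / 1000.0,
                  d.spatialCoordinates.z / 1000.0)
```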
Can someone tell me why there is such a difference in output quality? It seems like a bug to me.
I also tried setting the following parameters in the .yaml config (to make the pipeline more similar to the example)