ammar-n-abbas / FoundationPoseROS2

FoundationPoseROS2 is a ROS2-integrated system for 6D object pose estimation and tracking, based on the FoundationPose architecture. It uses RealSense2 with the Segment Anything Model 2 (SAM2) framework for end-to-end, model-based, real-time pose estimation and tracking of novel objects.
MIT License

No detection with custom obj of keyboard #3

Open mrtnbm opened 4 days ago

mrtnbm commented 4 days ago

I set up an .obj model of my keyboard with the right dimensions and similar colors. Still, I can't seem to get a successful pose estimation.

Here is the .obj, zipped only so it can be uploaded here: logi_wo_mtl_vert_col.zip
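For reference, this is a quick check I would run on the model's scale (a sketch using trimesh; the .obj path is a placeholder for the file inside the zip, and I am assuming FoundationPose expects the mesh in meters):

import numpy as np
import trimesh

# Load the keyboard model; force='mesh' collapses a Scene into a single Trimesh.
mesh = trimesh.load("logi_wo_mtl_vert_col.obj", force='mesh', process=False)

# Axis-aligned bounding-box extents in the mesh's native units.
print("extents (x, y, z):", mesh.extents)

# Rough diameter, comparable to the self.diameter value printed by reset_object().
print("approx. diameter:", float(np.linalg.norm(mesh.extents)))

The log below prints self.diameter:0.44, which seems plausible for a keyboard modeled in meters.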

The mask looks like this: [screenshot: segmentation mask output]

The RGB frame looks like this (the lighting, and therefore the image quality, might be poor; could that be the cause?):

[screenshot: RGB frame]

I selected only this object with a mouse click and pressed Enter afterwards. Once the tracking GUI starts, I do not see any pose estimation happening.

I use Ubuntu 22.04 with an RTX 4090 Mobile GPU, a conda env with Python 3.10, and an Intel RealSense D435 on firmware 5.13.0.5 (5.16.0.1 has been released, but I have not tried it yet because Isaac ROS needs this firmware version):

CUDA Toolkit 12.5, Driver 12.6
   Devices:
     "cpu"      : "x86_64"
     "cuda:0"   : "NVIDIA GeForce RTX 4090 Laptop GPU" (16 GiB, sm_89, mempool enabled)

I noticed that I get some errors right after selecting the object in the segmentation mask:

QObject::moveToThread: Current thread (0x26e64200) is not the object's thread (0x2a1cca90).
Cannot move to target thread (0x26e64200)
(the above repeats many times)

After hitting Enter and starting the tracking module, the errors come up again:

[INFO] [1731086589.851515483] [pose_estimation_node]: Object 0 selected.
[reset_object()] self.diameter:0.44027072379730176, vox_size:0.02201353618986509
[reset_object()] self.pts:torch.Size([186, 3])
[reset_object()] reset done
[make_rotation_grid()] cam_in_obs:(42, 4, 4)
[make_rotation_grid()] rot_grid:(252, 4, 4)
num original candidates = 252
num of pose after clustering: 252
[make_rotation_grid()] after cluster, rot_grid:(252, 4, 4)
[make_rotation_grid()] self.rot_grid: torch.Size([252, 4, 4])
[register()] Welcome
Module Utils 64702ec load on device 'cuda:0' took 0.26 ms  (cached)
[register()] poses:(252, 4, 4)
[register()] after viewpoint, add_errs min:-1.0
/home/mb/miniconda3/envs/launcher/lib/python3.10/site-packages/torch/__init__.py:614: UserWarning: torch.set_default_tensor_type() is deprecated as of PyTorch 2.1, please use torch.set_default_dtype() and torch.set_default_device() as alternatives. (Triggered internally at ../torch/csrc/tensor/python_tensor.cpp:451.)
  _C._set_default_tensor_type(t)
[predict()] ob_in_cams:(252, 4, 4)
[predict()] self.cfg.use_normal:False
[predict()] trans_normalizer:[0.019999999552965164, 0.019999999552965164, 0.05000000074505806], rot_normalizer:0.3490658503988659
[predict()] making cropped data
[make_crop_data_batch()] Welcome make_crop_data_batch
[make_crop_data_batch()] make tf_to_crops done
[make_crop_data_batch()] render done
/home/mb/miniconda3/envs/launcher/lib/python3.10/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3526.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
[make_crop_data_batch()] warp done
[make_crop_data_batch()] pose batch data done
[predict()] forward start
[predict()] forward done
[predict()] making cropped data
[make_crop_data_batch()] Welcome make_crop_data_batch
[make_crop_data_batch()] make tf_to_crops done
[make_crop_data_batch()] render done
[make_crop_data_batch()] warp done
[make_crop_data_batch()] pose batch data done
[predict()] forward start
[predict()] forward done
[predict()] making cropped data
[make_crop_data_batch()] Welcome make_crop_data_batch
[make_crop_data_batch()] make tf_to_crops done
[make_crop_data_batch()] render done
[make_crop_data_batch()] warp done
[make_crop_data_batch()] pose batch data done
[predict()] forward start
[predict()] forward done
[predict()] making cropped data
[make_crop_data_batch()] Welcome make_crop_data_batch
[make_crop_data_batch()] make tf_to_crops done
[make_crop_data_batch()] render done
[make_crop_data_batch()] warp done
[make_crop_data_batch()] pose batch data done
[predict()] forward start
[predict()] forward done
[predict()] ob_in_cams:(252, 4, 4)
[predict()] self.cfg.use_normal:False
[predict()] making cropped data
[make_crop_data_batch()] Welcome make_crop_data_batch
[make_crop_data_batch()] make tf_to_crops done
[make_crop_data_batch()] render done
[make_crop_data_batch()] pose batch data done
[find_best_among_pairs()] pose_data.rgbAs.shape[0]: 252
[predict()] forward done
[register()] final, add_errs min:-1.0
[register()] sort ids:tensor([  0,   6,  30,  72,  96, 102, 222, 132, 198, 126, 168,   3,   9,  75,  99, 105, 135, 144,  84, 120, 129, 150, 171, 174,  12,  18,  78,  90, 114, 147, 191, 207, 228,  15,  21,  81,  87,  93, 117, 123, 231,  97, 153, 204, 201,  91, 107, 173,  24,  39,  33, 175, 177, 208, 241,  27, 155, 223, 225,  45, 118, 136,
        166, 230,  36, 170, 236, 239, 111, 188, 203, 244,  42, 108, 141, 161, 184, 226, 218,  70, 181, 152, 138, 237,  58, 215,  16,  17,  19,  79,  82, 192, 221,   2,  10,  11,  20,  67,  68,  74,  80, 106, 121, 163, 176,   1,  73,  76,  77,  83,  88,  89, 214,  98, 127, 128, 156,  56, 115, 122, 172,  28,  55,  65,
         94,  95, 100, 104, 116, 119, 145, 148, 158, 238,  14,  29,  37,  38,  40,  92, 101, 124, 189, 190, 212,   5,  13,  22,  23,  25, 103, 134, 157, 185, 195, 199, 213, 233,   4,   7,   8,  34,  35,  44,  46,  64,  71,  85,  86, 125, 130, 137, 149, 151, 160, 182, 200, 211, 216, 232, 235, 245, 246,  26,  31,  41,
         49,  52, 133, 164, 180, 196, 249,  32,  43,  53,  61,  69, 131, 159, 165, 178, 179, 193, 206, 210, 217, 227, 234, 240,  47,  59,  62, 143, 154, 162, 187, 194, 202, 209, 220, 224, 242, 243, 251,  63, 110, 112, 113, 140, 146, 183, 186, 229, 250, 139, 142, 169, 205, 247,  50, 167,  51,  60,  66, 109, 197, 219,
         48,  54,  57, 248])
[register()] sorted scores:tensor([63.9375, 63.9375, 63.9375, 63.9375, 63.9375, 63.9375, 63.9375, 63.9062, 63.9062, 63.8750, 63.7500, 63.5938, 63.5625, 63.5625, 63.5625, 63.5625, 63.5625, 63.5625, 63.5312, 63.5312, 63.5312, 63.5312, 63.5312, 63.5312, 63.5000, 63.5000, 63.5000, 63.5000, 63.5000, 63.5000, 63.5000, 63.4688, 63.4688, 63.4375,
        63.4375, 63.4375, 63.4375, 63.4375, 63.4375, 63.4375, 63.4375, 63.3750, 63.3438, 63.3438, 63.3125, 63.2812, 63.2812, 63.2812, 63.2500, 63.2500, 63.2188, 63.2188, 63.2188, 63.2188, 63.2188, 63.1875, 63.1875, 63.1875, 63.1875, 63.1562, 63.1562, 63.1562, 63.1562, 63.1562, 63.1250, 63.1250, 63.1250, 63.1250,
        63.0938, 63.0938, 63.0938, 63.0625, 63.0000, 63.0000, 63.0000, 63.0000, 63.0000, 63.0000, 62.8750, 62.8438, 62.8438, 62.7812, 62.7500, 62.7500, 62.7188, 62.6875, 62.6250, 62.5938, 62.5938, 62.5938, 62.5938, 62.5938, 62.5938, 62.5625, 62.5625, 62.5625, 62.5625, 62.5625, 62.5625, 62.5625, 62.5625, 62.5625,
        62.5625, 62.5625, 62.5625, 62.5312, 62.5312, 62.5312, 62.5312, 62.5312, 62.5312, 62.5312, 62.5312, 62.5000, 62.5000, 62.5000, 62.5000, 62.4688, 62.4688, 62.4688, 62.4688, 62.4375, 62.4375, 62.4375, 62.4375, 62.4375, 62.4375, 62.4375, 62.4375, 62.4375, 62.4375, 62.4375, 62.4375, 62.4375, 62.4062, 62.4062,
        62.4062, 62.4062, 62.4062, 62.4062, 62.4062, 62.4062, 62.4062, 62.4062, 62.4062, 62.3750, 62.3750, 62.3750, 62.3750, 62.3750, 62.3750, 62.3750, 62.3750, 62.3750, 62.3750, 62.3750, 62.3750, 62.3750, 62.3438, 62.3438, 62.3438, 62.3438, 62.3438, 62.3438, 62.3438, 62.3438, 62.3438, 62.3438, 62.3438, 62.3438,
        62.3438, 62.3438, 62.3438, 62.3438, 62.3438, 62.3438, 62.3438, 62.3438, 62.3438, 62.3438, 62.3438, 62.3438, 62.3438, 62.3125, 62.3125, 62.3125, 62.3125, 62.3125, 62.3125, 62.3125, 62.3125, 62.3125, 62.3125, 62.2812, 62.2812, 62.2812, 62.2812, 62.2812, 62.2812, 62.2812, 62.2812, 62.2812, 62.2812, 62.2812,
        62.2812, 62.2812, 62.2812, 62.2812, 62.2812, 62.2812, 62.2500, 62.2500, 62.2500, 62.2500, 62.2500, 62.2500, 62.2500, 62.2500, 62.2500, 62.2500, 62.2500, 62.2500, 62.2500, 62.2500, 62.2500, 62.2188, 62.2188, 62.2188, 62.2188, 62.2188, 62.2188, 62.2188, 62.2188, 62.2188, 62.2188, 62.1875, 62.1875, 62.1875,
        62.1875, 62.1875, 62.1562, 62.1562, 62.1250, 62.1250, 62.1250, 62.1250, 62.1250, 62.1250, 62.0938, 62.0938, 62.0938, 62.0625])
QObject::moveToThread: Current thread (0x26e64200) is not the object's thread (0x2a1cca90).
Cannot move to target thread (0x26e64200)
(the same error repeats several more times)

Also, I had to change the trimesh.load() line to add arguments forcing a mesh: trimesh.load(mesh, force='mesh', process=False)

Otherwise I got an AttributeError for the missing vertices attribute on the Scene object (I only have one model in the file, so I do not know why trimesh wrapped it in a Scene). I am not sure whether process=False is really necessary.
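Roughly, the change I made looks like this (a sketch; mesh_file stands for the path variable used in the script, and the Scene-flattening alternative at the end is untested on my side):

import trimesh

# Workaround: force trimesh to return a single Trimesh instead of a Scene.
mesh = trimesh.load(mesh_file, force='mesh', process=False)

# Alternative: load normally and flatten a Scene into one Trimesh if needed.
loaded = trimesh.load(mesh_file)
if isinstance(loaded, trimesh.Scene):
    loaded = trimesh.util.concatenate(list(loaded.geometry.values()))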

ammar-n-abbas commented 4 days ago

There seems to be an error with the segmentation; have you changed the segmentation model?

mrtnbm commented 3 days ago

There seems to be an error with the segmentation; have you changed the segmentation model?

Hey Ammar, thanks for your fast response!

I actually used the default settings with SAM2. I ran the conda script to install everything automatically.

Could the bad lighting in that picture be at fault? I use the RealSense node for ROS2 to get the RealSense topics, like you did.

ammar-n-abbas commented 3 days ago

No, the output mask image does not seem to be the right one.

Have you used the "sam2_b.pt" model? The one you have shown looks like output from the FastSAM ("FastSAM-s.pt") model.

mrtnbm commented 3 days ago

I used the default script foundationpose_ros_multi.py, which loads SAM2:

self.seg_model = SAM("sam2_b.pt")

That's why I don't understand why the results are so weird.
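For what it's worth, a standalone check of what that checkpoint produces on a saved frame would look roughly like this (a sketch assuming the ultralytics SAM API; frame.png is a placeholder for an exported RGB frame):

from ultralytics import SAM
import cv2

# Load the same checkpoint the node uses.
model = SAM("sam2_b.pt")
model.info()  # prints which architecture the weights correspond to

# Segment a saved RGB frame and write out the annotated overlay for comparison.
results = model("frame.png")
cv2.imwrite("frame_masks.png", results[0].plot())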

ammar-n-abbas commented 2 days ago

Can you show the window of "click on objects to track"?

mrtnbm commented 2 days ago

Can you show the window of "click on objects to track"?

Thank you very much for your help. I'm not at home today; I will provide a screenshot tomorrow.

As far as I remember from a few days ago, it does look quite different, though. There are no overlapping segmentations, and compared to the segmentation mask above, the keyboard gets segmented cleanly, except that some individual keys are segmented as well.

mrtnbm commented 1 day ago

Hello @ammar-n-abbas,

here is the screenshot of the GUI to select the object:

[screenshot: object selection GUI]

The mask is:

[screenshot: segmentation mask]

Some keys get segmented individually, but the whole keyboard itself is also segmented quite accurately.
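If the per-key masks are what throws off the selection, one illustrative workaround (just a sketch, not the repo's actual selection logic; it assumes an (N, H, W) mask tensor from SAM and a clicked pixel) would be to keep only the largest mask that contains the click:

import torch

def pick_largest_mask_at_click(masks: torch.Tensor, x: int, y: int):
    """masks: (N, H, W) tensor of SAM masks; (x, y): clicked pixel."""
    # Masks that contain the clicked pixel.
    hits = [i for i in range(masks.shape[0]) if masks[i, y, x] > 0]
    if not hits:
        return None
    # Keep the largest one, i.e. the whole keyboard rather than a single key.
    return max(hits, key=lambda i: masks[i].sum().item())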

ammar-n-abbas commented 1 day ago

Can you try using the back of the keyboard for the first segmentation? Later on you can turn it around.

mrtnbm commented 1 day ago

Can you try using the back of the keyboard for the first segmentation? Later on you can turn it around.

Unfortunately, after some tests using the back of the keyboard, it is still not tracking the object after I select it in the segmentation GUI.

mrtnbm commented 1 day ago

This is the mask and the segmentation in the selection GUI for the back of the keyboard:

[screenshot: segmentation mask (keyboard back)]

[screenshot: selection GUI (keyboard back)]

I also tried different RealSense D435 settings: I changed the resolution to 1280x720 on both the RGB and depth outputs and tried different presets like HighDensity, MidDensity, and HighAccuracy, with no effect.