ros-perception / vision_msgs

Algorithm-agnostic computer vision message types for ROS.
Apache License 2.0

BoundingBox3D description unclear #101

Open ryan-roche opened 1 month ago

ryan-roche commented 1 month ago

I'm having trouble understanding how exactly to use the fields within the BoundingBox3D message.

What direction relative to the box should the orientation pose be pointing? Should it be along one specific axis, and the rest are relative to it by the right-hand rule?

What do the X, Y, and Z values of the size field represent? Are they the full length/height/width values measured face-to-face on the box, or are they the half-values (the distance between the center and the corresponding faces)?

I suggest editing the comments and official documentation to clarify some standard for what those values represent to ensure that different projects using this message type are entering the correct measurements.

mintar commented 1 month ago

> What direction relative to the box should the orientation pose be pointing?

The pose specifies the pose of the bounding box in the header.frame_id frame.

> What do the X, Y, and Z values of the size field represent?

The size represents the full size of the bbox (face to face). The x value is the size of the bbox along the x axis of the local coordinate frame of the bbox as defined by the pose and so on for the other axes.

I hope that clarifies things. If not, tell me and I'll try to draw it tomorrow.
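The face-to-face convention for size described above can be illustrated with a minimal sketch. It uses plain Python numbers rather than the actual message classes, and the function name `local_corners` is hypothetical, not part of vision_msgs. Because size is the full extent, each face sits at plus or minus half the size from the box center along the corresponding local axis:

```python
from itertools import product


def local_corners(size_x, size_y, size_z):
    """Return the 8 box corners expressed in the box's own local frame.

    size_* are full face-to-face extents, so the corners lie at
    +/- half of each extent from the center (the local origin).
    """
    hx, hy, hz = size_x / 2.0, size_y / 2.0, size_z / 2.0
    return [
        (sx * hx, sy * hy, sz * hz)
        for sx, sy, sz in product((-1.0, 1.0), repeat=3)
    ]


# Hypothetical box: 2.0 long in local x, 1.0 in local y, 0.5 in local z.
corners = local_corners(2.0, 1.0, 0.5)
for corner in corners:
    print(corner)
```

These corners are still in the local frame of the box; they only land in the header.frame_id frame once the pose of the box has been applied to them.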

ryan-roche commented 1 month ago

I think I understand. The x/y/z values of the size represent the side lengths along the corresponding axes in the LOCAL coordinate frame of the bounding box that is obtained by applying the quaternion transformation to the header.frame_id origin axes.

mintar commented 1 month ago

Yes, exactly! Not just the quaternion rotation of course, but the full transform including the translation. But I guess that's what you meant.
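To make the "rotation plus translation" point concrete, here is a rough sketch of mapping a point from the box's local frame into the header.frame_id frame. It assumes the quaternion is unit length and given in (x, y, z, w) order, as in geometry_msgs/Quaternion; the helper names `cross`, `rotate` and `local_to_frame` are made up for this example and are not part of any ROS package:

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def rotate(q, v):
    """Rotate vector v by the unit quaternion q = (x, y, z, w)."""
    x, y, z, w = q
    u = (x, y, z)
    t = tuple(2.0 * c for c in cross(u, v))
    ut = cross(u, t)
    return tuple(v[i] + w * t[i] + ut[i] for i in range(3))


def local_to_frame(position, orientation, p_local):
    """Apply the full pose: rotate first, then translate by the position."""
    r = rotate(orientation, p_local)
    return tuple(r[i] + position[i] for i in range(3))


# Hypothetical box centered at (1, 2, 0), yawed 90 degrees about z.
position = (1.0, 2.0, 0.0)
orientation = (0.0, 0.0, math.sin(math.pi / 4), math.cos(math.pi / 4))

# Center of the +x face of a box whose size.x is 2.0 (half extent 1.0).
p = local_to_frame(position, orientation, (1.0, 0.0, 0.0))
print(p)  # approximately (1.0, 3.0, 0.0)
```

With the rotation alone, the face center would end up at roughly (0, 1, 0); adding the translation moves it to where the box actually sits in the parent frame.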

ryan-roche commented 3 weeks ago

Just to confirm, the positive X direction of the bounding box's local coordinate system is the direction the pose "points" to?

Like this?

[image]

mintar commented 3 weeks ago

Yes, you could say that. Although I think it's not good to think of a pose as an arrow like in your image, because you lose the information about how the pose is rotated around the x axis. It's better to think of a pose as a local coordinate system with its own x, y and z axes, like this:

[image]

Some tools (like RViz) allow you to switch the pose visualization mode between "axes" and "arrow", and by convention the arrow always points in the direction of the x axis, so your image is not wrong.

ryan-roche commented 3 weeks ago

I was about to say, isn't the convention that positive-X is the "forwards" direction? That's what I meant by the green arrow on the x-axis of the pose coordinate system.

mintar commented 3 weeks ago

Yes, that's exactly right. All I was saying is that an arrow is ambiguous (it shows the "forwards" direction, but not the "up" direction), not that it's wrong!
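The ambiguity of the arrow can be checked numerically. The sketch below, again assuming unit quaternions in (x, y, z, w) order and using hypothetical helper names, compares two orientations that differ only by a 90 degree roll about the x axis. Both produce the same "forwards" arrow, but their "up" axes point in different directions, so a box with different y and z sizes would occupy different space:

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def rotate(q, v):
    """Rotate vector v by the unit quaternion q = (x, y, z, w)."""
    x, y, z, w = q
    u = (x, y, z)
    t = tuple(2.0 * c for c in cross(u, v))
    ut = cross(u, t)
    return tuple(v[i] + w * t[i] + ut[i] for i in range(3))


q_a = (0.0, 0.0, 0.0, 1.0)  # identity orientation
q_b = (math.sin(math.pi / 4), 0.0, 0.0, math.cos(math.pi / 4))  # 90 degree roll about x

# Local x axis ("forwards"): identical for both orientations.
x_a = rotate(q_a, (1.0, 0.0, 0.0))
x_b = rotate(q_b, (1.0, 0.0, 0.0))

# Local z axis ("up"): differs between the two orientations.
z_a = rotate(q_a, (0.0, 0.0, 1.0))
z_b = rotate(q_b, (0.0, 0.0, 1.0))

print("x axes:", x_a, x_b)
print("z axes:", z_a, z_b)  # z_b is approximately (0.0, -1.0, 0.0)
```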