Closed: MihailV1989 closed this issue 6 months ago
The API of the apriltag library changed and we incremented the version for that. That version has not been released to the official ROS repos yet. You will have to build it from source. The node requires this new version. If you just remove the version check, then you will get compilation errors.
I haven't implemented another pose estimation method as I didn't have the need for this. If you found a better solution, you can send a PR with some comparison results (speed, accuracy) and I will review it.
Thanks for the fast reply! In fact, I'm not getting any compilation errors after I removed the version check, only the wrong parameter type errors on launch. With the fixes I mentioned above the node is working correctly, so I'll just wait for the official ROS repos release.
As for the pose estimation method, I didn't find a new solution; I just implemented the official method described in the AprilTag Wiki: https://github.com/AprilRobotics/apriltag/wiki/AprilTag-User-Guide#pose-estimation
The accuracy gain of the pose estimation is quite obvious, and the exact method is documented in a paper. This is from apriltag_pose.h, which is used for the pose estimation:
```
Estimate pose of the tag. This returns one or two possible poses for the
tag, along with the object-space error of each.

This uses the homography method described in [1] for the initial estimate.
Then Orthogonal Iteration [2] is used to refine this estimate. Then [3] is
used to find a potential second local minima and Orthogonal Iteration is
used to refine this second estimate.

[1]: E. Olson, “Apriltag: A robust and flexible visual fiducial system,” in
     2011 IEEE International Conference on Robotics and Automation,
     May 2011, pp. 3400–3407.
[2]: Lu, G. D. Hager and E. Mjolsness, "Fast and globally convergent pose
     estimation from video images," in IEEE Transactions on Pattern Analysis
     and Machine Intelligence, vol. 22, no. 6, pp. 610-622, June 2000.
     doi: 10.1109/34.862199
[3]: Schweighofer and A. Pinz, "Robust Pose Estimation from a Planar Target,"
     in IEEE Transactions on Pattern Analysis and Machine Intelligence,
     vol. 28, no. 12, pp. 2024-2030, Dec. 2006. doi: 10.1109/TPAMI.2006.252

@outparam err1, pose1, err2, pose2
```
Then, in estimate_tag_pose() in apriltag_pose.c, the pose with the smaller error is returned, and this is the function I've used. It would also be useful to publish the final object-space error, but for this a new msg would be needed.
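To illustrate the selection step (a self-contained sketch with a hypothetical `Pose` stand-in for `apriltag_pose_t`, not the library's actual code), the logic boils down to keeping the candidate with the smaller object-space error, and returning the error as well would make it easy to publish:

```cpp
#include <array>
#include <cassert>
#include <utility>

// Hypothetical stand-in for apriltag_pose_t: rotation + translation
// (the contents are irrelevant for the selection logic).
struct Pose {
    std::array<double, 9> R;
    std::array<double, 3> t;
};

// Mirrors the selection done in estimate_tag_pose(): given the two local
// minima and their object-space errors, keep the candidate with the smaller
// error. Returning the error alongside the pose would allow publishing it.
static std::pair<Pose, double> pick_best_pose(const Pose& pose1, double err1,
                                              const Pose& pose2, double err2) {
    return (err1 <= err2) ? std::make_pair(pose1, err1)
                          : std::make_pair(pose2, err2);
}
```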
> With the fixes I mentioned above the node is working correctly, so I'll just wait for the official ROS repos release.
Alternatively, you can just check out the library in your workspace and compile it together with the node.
> As for the pose estimation method, I didn't find a new solution; I just implemented the official method described in the AprilTag Wiki:
I just meant that if you find that this works better and is more accurate, you can send a PR with that proposed implementation in the node. I can review it, but it would be good to have some comparison in this PR that shows that the alternative implementation works better in speed and/or accuracy.
Will do. In the meantime I think I found an issue with the pose estimation. The node is using the camera intrinsic parameters, specifically the matrix P, without adjusting them to the actual image resolution. The CameraInfo message provides the original calibration resolution and the corresponding intrinsic parameters: http://docs.ros.org/en/melodic/api/sensor_msgs/html/msg/CameraInfo.html
They then have to be scaled proportionally to the ratio of the actual image resolution used for pose estimation to the calibration resolution: https://docs.opencv.org/4.x/d9/d0c/group__calib3d.html#MathJax-Element-78-Frame
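The scaling itself is simple; here is a minimal self-contained sketch (my own illustration with a hypothetical `Intrinsics` struct, assuming plain resizing without cropping): the focal lengths and principal point scale linearly with the image dimensions.

```cpp
#include <cassert>

// Pinhole intrinsics as used for pose estimation
// (fx, fy in pixels; cx, cy is the principal point).
struct Intrinsics {
    double fx, fy, cx, cy;
};

// Scale intrinsics calibrated at (calib_w x calib_h) to an image of
// (img_w x img_h). Assumes a plain resize (no cropping): fx/cx scale
// with the width ratio, fy/cy with the height ratio.
Intrinsics scale_intrinsics(const Intrinsics& calib,
                            int calib_w, int calib_h,
                            int img_w, int img_h) {
    const double sx = static_cast<double>(img_w) / calib_w;
    const double sy = static_cast<double>(img_h) / calib_h;
    return {calib.fx * sx, calib.fy * sy, calib.cx * sx, calib.cy * sy};
}
```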
I've implemented this as well in my fork: https://github.com/MihailV1989/apriltag_ros/commit/a7c4d8b77c0e7152346be93b01b9a39fabee5a88
So far it's working correctly, and I've now tested it on a Raspberry Pi 4 with 1 GB RAM and headless Ubuntu. I'll provide performance test results in the next few days.
> They then have to be scaled proportionally to the ratio of the actual image resolution used for pose estimation to the calibration resolution:
Is this related to `quad_decimate` in the library and `detector.decimate` in the node? The dimensions of the image you get from the camera should match the values in the CameraInfo message. Is there a case where the processed image dimensions differ from the received image?
I would say it's inconvenient to be stuck with the original calibration resolution. If you're using a third-party camera image publisher and you calibrate the camera at, say, full resolution, then you have to manually scale the intrinsic parameters every time you change the resolution. In my case I'm starting a third-party publisher node and I have to give it the path to a .yaml file where the intrinsic camera parameters are saved. I could edit the file according to the set resolution, but this is only possible at startup, and repeated manual rescaling accumulates rounding errors over time. So I don't find this a good solution, as you won't be able to experiment easily with different resolutions.
Then of course there is the question of why not always use full resolution. Normally you want to find the optimal resolution for your application so that you don't overload your system. In my case the main limiting factors are the 1 GB of RAM and the camera bandwidth. But there are probably other use cases where you don't want to spend your resources publishing big images at high FPS only to decimate them later.
Then about the decimate function. If you're interested in the pose estimation capabilities, it does not make sense to reduce the resolution only for the quad detection that feeds the pose estimation while still using the full resolution for decoding the binary payload. I don't know how computationally intensive the decoding is, but when I run it on hardware with limited resources I don't want it to perform better than needed. When a tag moves away from the camera, the pose estimation becomes unusable much sooner than the decoding.
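For context, my understanding of how `quad_decimate` behaves (a rough self-contained sketch with hypothetical names, not the library's code): quads are searched on an image downscaled by the decimate factor, and the found corners are mapped back to full-resolution pixel coordinates, which is why decoding can still sample the full-resolution image and the full-resolution intrinsics remain valid.

```cpp
#include <cassert>

// A detected quad corner in pixel coordinates.
struct Corner {
    double x, y;
};

// Map a corner found on an image downscaled by 'decimate' back to
// full-resolution pixel coordinates. This is a rough model of what
// quad_decimate implies; the real library additionally refines the
// corners on the full-resolution image.
Corner to_full_resolution(const Corner& c, double decimate) {
    return {c.x * decimate, c.y * decimate};
}
```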
Maybe all this is not a big deal, but scaling the intrinsic camera parameters when the actual image resolution differs from the calibration resolution is even less problematic and makes the node more foolproof, isn't it?
> I would say it's inconvenient to be stuck with the original calibration resolution. If you're using a third-party camera image publisher and you calibrate the camera at, say, full resolution, then you have to manually scale the intrinsic parameters every time you change the resolution. In my case I'm starting a third-party publisher node and I have to give it the path to a .yaml file where the intrinsic camera parameters are saved. I could edit the file according to the set resolution, but this is only possible at startup, and repeated manual rescaling accumulates rounding errors over time. So I don't find this a good solution, as you won't be able to experiment easily with different resolutions.
I am still not sure what the exact problem is. What do you mean by being "stuck with the original calibration resolution"? Every camera setting, such as image dimensions, zoom, lens, etc., will have a dedicated set of intrinsic parameters.
When a ROS node publishes images, it also has to publish the corresponding intrinsics. When this node publishes a scaled version of the image, it also has to scale the intrinsics accordingly. You will see this with cameras supporting different image resolutions. Most of them will have dedicated intrinsics for every setting. Otherwise, you will have to calibrate them manually for every setting you use. It's the responsibility of the "sender" (e.g. a camera or generic image publisher) to make sure that image data and intrinsics are correct and match. The "receiver" (e.g. this apriltag node) cannot know how the images are scaled. It just relies on the image data and intrinsics.
> Then about the decimate function. If you're interested in the pose estimation capabilities, it does not make sense to reduce the resolution only for the quad detection that feeds the pose estimation while still using the full resolution for decoding the binary payload.
It depends on your application of course. If there is only a single image consumer/subscriber then publishing at a lower resolution makes sense. But if there are multiple consumers, you force them all to use the lower resolution, even though a higher resolution would be possible. So this is entirely up to your setting and you cannot easily generalise this. Some people will only want to reduce the resolution for the AprilTag detection, some will want to have a lower resolution for their entire setup. In any case, you have to provide the corresponding intrinsics.
> Maybe all this is not a big deal, but scaling the intrinsic camera parameters when the actual image resolution differs from the calibration resolution is even less problematic and makes the node more foolproof, isn't it?
If the image source is publishing scaled images it also has to scale the intrinsics accordingly. It's not the responsibility of the receiver to figure out scaling and adapt the intrinsics. It is much more foolproof when the image data and correct intrinsics are published by the sender.
So I reverted the last changes with the automatic adjustment of the camera intrinsic parameters.
I also managed to do a few simple tests: I ran a preview of the detected tags on the live camera image.
As mentioned in the beginning, I'm using ROS2 Galactic on Ubuntu 20.04.5 LTS in a virtual machine, and for video capture I'm using an OAK-D camera from Luxonis; the performance is rather bad. The host is an HP ZBook 15 G2 with an Intel(R) Core(TM) i7-4810MQ CPU @ 2.80GHz, with 4 of 8 threads and 6 GB RAM assigned to the virtual machine.
The tags are 20 mm wide, with no decimation and no blur set for the AprilTag detector. I tried with and without the more precise pose estimation, with 1 or 4 threads, and at both 1280x720 and 960x540 image resolution. Results:
| # | refine-pose | detector.threads | width | height | detections topic avg. hz |
| --- | --- | --- | --- | --- | --- |
| 1 | TRUE | 1 | 1280 | 720 | 0.9 |
| 2 | FALSE | 1 | 1280 | 720 | 0.9 |
| 3 | TRUE | 4 | 1280 | 720 | 1.0 |
| 4 | FALSE | 4 | 1280 | 720 | 0.9 |
| 5 | TRUE | 1 | 960 | 540 | 1.6 |
| 6 | FALSE | 1 | 960 | 540 | 1.7 |
| 7 | TRUE | 4 | 960 | 540 | 1.5 |
| 8 | FALSE | 4 | 960 | 540 | 1.5 |
First of all, thanks for the ROS2 port!
I pulled the current version today and noticed that apriltag_ros cannot be built any more. I'm using ROS2 Galactic on Ubuntu 20.04.5 LTS in a virtual machine, and I got the following error:
The Galactic central index still doesn't contain apriltag 3.3: https://github.com/ros/rosdistro/blob/master/galactic/distribution.yaml so I wonder where I can get it?
Until then, I found out that I can remove the version requirement from CMakeLists.txt line 25 like this:
```cmake
find_package(apriltag REQUIRED)
```
But then I got other errors on launch. The initialization of the ROS2 parameters "detector.refine" and "detector.debug" fails even when I do not set the parameters, and a wrong parameter type error appears. Here is the one for "detector.refine"; they are both identical:
I managed to fix the error by reverting the data type back to int, as it was before the commits on Aug 28, 2022. Does this have something to do with the fact that I'm not using the newest version 3.3? The parameters are being initialized from the `apriltag_detector_t* const td`.
Then, as a side question, I was wondering why the more precise pose estimation from the apriltag library is not implemented. The pose estimation from the homography has very low precision and is almost unusable, isn't it? I took the time to implement it as an optional setting that is turned off by default, and I think it could be useful to others as well. I can gladly contribute it if you want: https://github.com/MihailV1989/apriltag_ros
I just cannot guarantee that there aren't any hidden bugs and that the code is well optimized, as I have no experience with C++. The more precise pose estimation can be turned on with the "refine-pose" parameter that I saw in a very old launch.py file a few months ago.