Thank you for reporting the issue - I will probably have some time to look at this one next week. I will try to get back to you then.

My initial hunch would be that you perhaps have an older copy of `gisnav_msgs` (`~/ws_sensor_combined/src/gisnav_msgs`) in your colcon workspace? The message definitions used to be in a separate repo but were moved to the main repo a while ago. It could be that the GitHub pages instructions that were published for the old v0.64 tag of gisnav still instruct downloading the message definitions into `src/gisnav_msgs`, which would make it incompatible with the gisnav repo master branch which puts them at `src/gisnav/gisnav_msgs` - I would have to check if that is the case.

If you want to build the newer message definitions you could try removing `~/ws_sensor_combined/src/gisnav_msgs` from the workspace; colcon would then likely find them at `~/ws_sensor_combined/src/gisnav/gisnav_msgs`.
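To see exactly where duplicate package copies live in your workspace you could run something like the quick standard-library sketch below. The `WS_SRC` path is an assumption based on the paths in your error message - adjust it to your own workspace.

```python
# Quick check for duplicate package copies in a colcon workspace.
# Standard library only; WS_SRC is an assumed path - adjust as needed.
import os
import xml.etree.ElementTree as ET
from collections import defaultdict

WS_SRC = os.path.expanduser("~/ws_sensor_combined/src")

packages = defaultdict(list)
for dirpath, _dirnames, filenames in os.walk(WS_SRC):
    if "package.xml" in filenames:
        # Each ROS package declares its name in package.xml
        name = ET.parse(os.path.join(dirpath, "package.xml")).getroot().findtext("name")
        packages[name].append(dirpath)

for name, paths in sorted(packages.items()):
    if len(paths) > 1:
        print(f"duplicate package '{name}':")
        for path in paths:
            print(f"  {path}")
```

Any package that shows up twice (e.g. `gisnav_msgs` under both `src/gisnav_msgs` and `src/gisnav/gisnav_msgs`) is one that rosdep and colcon will complain about.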
I figured the same thing. I tried to delete `src/gisnav/gisnav_msgs` and `src/gisnav_msgs`, and I also tried to replace `src/gisnav/gisnav_msgs` with `src/gisnav_msgs`. All three methods give the same error:
ERROR: the following packages/stacks could not have their rosdep keys resolved
to system dependencies:
gisnav_msgs: Cannot locate rosdep definition for [geographic_msgs]
gisnav: Cannot locate rosdep definition for [tf_transformations]
I tried this in a fresh ROS 2 Humble container (`docker run -it ros:humble bash`) - seems like on a fresh system it is sufficient to do the following (no `sudo` since we are already root in the container):
apt-get update
mkdir -p colcon_ws/src && cd $_
git clone https://github.com/hmakelin/gisnav.git
cd gisnav/gisnav
rosdep update
rosdep install --from-paths . -y -r --ignore-src
Todo:
* [ ] The current instructions for a local installation are missing the `-r` option for `rosdep install` which will make it continue installing the other dependencies even when it finds out `gisnav_msgs` is not in the rosdep index (it will get built from the local workspace sources by `colcon build` later). The option is only missing from the instructions but not from `docker/gisnav/Dockerfile`.
Notes:
* If you are using ROS 2 Foxy you might also need to add the `--include-eol-distros` option to your `rosdep` commands since Foxy recently reached its end of life.
* Some ROS tutorials I have found also do an `apt dist-upgrade` and it's also done in the `docker/gisnav/Dockerfile` but it might not be required here.
Interesting, I had always been trying to set up GISNav on Ubuntu 20.04 (as suggested in the documentation). I saw the choice between Foxy and Humble on the documentation page, and always assumed that if I'm on 20.04, I should be working with Foxy. So just to be clear, should I be setting up a Humble environment on an Ubuntu 20.04 machine?

Edit: I just tried to install Humble on 22.04, I think I have to build from source? Or should I switch to Ubuntu 22.04? I also tried your way of setting up GISNav, which worked, but when I try to keep following the documentation to Download the LoFTR submodule and weights, it's giving me: `error: pathspec 'LoFTR' did not match any file(s) known to git`
Some more context for the Foxy/Humble situation:
The current GitHub pages guide (gisnav.org) for the latest release 0.64 recommends Ubuntu 20.04 and ROS 2 Foxy in line with PX4 recommendations from one year ago when the release was made - that was also my development system back then.
Since then Foxy has reached end of life, but we still need Foxy for the PX4 "Gazebo Classic" SITL simulation. This is because PX4's Gazebo Classic SITL simulation has the Typhoon H480 model with a simulated 3-axis gimbal and video stream that is difficult to replace if you want to make a nice mock GPS demo. PX4 seems to already support Ubuntu 22.04 with their 1.14 release and once the SITL simulation situation is somehow resolved I think GISNav should also go with that version of Ubuntu for the `px4` Docker Compose service.
So while the GISNav ROS 2 package itself already works with Ubuntu 22.04 and ROS 2 Humble, for the PX4 SITL simulation we need an Ubuntu 20.04 / ROS 2 Foxy system.
With Docker containers this should not be a major issue - I think GISNav has made a lot of progress in moving all its supporting services from a monolithic single container into dedicated containers since that last 0.64 release was made. These services are all orchestrated via `docker/docker-compose.yaml` and `docker/Makefile`. For development I've noticed that it's convenient to run all the supporting services inside Docker containers (`make start` spins everything up in just a few seconds on my development machine once you have built, created and run your containers at least once) and then run GISNav locally (on the host machine) e.g. with `ros2 launch gisnav px4.dev.launch.py`.
I'm currently working on a major refactoring on the `gisnav-gpu` branch but plan to do another pass through the documentation at the end of the refactoring - there's probably a lot of these issues hiding there.
> I also tried your way of setting up GISNav which worked but when I try to keep following the documentation to Download the LoFTR submodule and weights, it's giving me: `error: pathspec 'LoFTR' did not match any file(s) known to git`
The 0.64 release had `LoFTR` as a git submodule but it has been removed in the current `master` branch. You might be cloning the `master` branch and not checking out the `0.64.0` tag? You can follow the gisnav.org instructions for the `0.64.0` release, or if you want to grab the latest `master` branch you can try generating the latest docs with `make docs`. You need to have sourced your colcon workspace and built gisnav in it to be able to build the documentation.
On the current `master` branch the deep learning dependency is isolated into its own `torch-serve` Docker Compose service. The idea for this was to decouple the difficult-to-setup deep learning dependency chain into a separate service but it ended up introducing too much latency (serialization of each image via HTTPS), so in the `gisnav-gpu` branch I am removing the `torch-serve` service and bringing the LoFTR model back inside a ROS node so that we can pass the images around via shared memory. There the installation of the model is a bit easier since `LoFTR` now comes with `kornia` and does not have to be included as a submodule.
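For illustration, this is roughly what pulling LoFTR in through kornia looks like - a minimal sketch only, not GISNav code. It assumes `kornia` and `torch` are installed via pip and that the pretrained "outdoor" weights are a reasonable choice:

```python
# Minimal sketch of using LoFTR through kornia instead of a git submodule.
import torch
from kornia.feature import LoFTR

matcher = LoFTR(pretrained="outdoor").eval()  # weights are downloaded on first use

# LoFTR expects normalized grayscale images of shape (B, 1, H, W).
query = torch.rand(1, 1, 480, 640)      # e.g. the drone camera frame
reference = torch.rand(1, 1, 480, 640)  # e.g. the orthoimage rasterized to the same size

with torch.inference_mode():
    matches = matcher({"image0": query, "image1": reference})

# Matched keypoints in each image plus a confidence score per match.
print(matches["keypoints0"].shape, matches["keypoints1"].shape, matches["confidence"].shape)
```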
Thank you for the detailed explanation, looking forward to the updated documentation!
Of course! I forgot to check out the 0.64.0 tag. Thank you!
Another issue came up: when pip installing the requirements, pyproj-3.2.1 always raises errors. I previously dealt with this issue by installing an older version of Cython (0.26.0), but I can't seem to install older versions of Cython on this Docker image (which might be a Python version issue). I hope the revised documentation will also include the recommended version of Python!
I have successfully got Docker running, but in QGC the bottom left window is saying waiting for video and it's giving me:
qgc-1 | VideoReceiverLog: gst_element_factory_make() for data source failed
qgc-1 | VideoReceiverLog: _makeSource() failed
qgc-1 | VideoReceiverLog: Failed
Thank you for finding this issue!
To study this I updated the QGC service to use Ubuntu Jammy instead of Focal and managed to get the video feed displaying in the QGC UI. In this case this was undesirable since with QGC occupying the UDP port, gscam was now unable to bridge that video feed into ROS which is what we actually want. So I might hold off on updating the QGC service to Jammy for now even though it technically fixes this bug.
Generally to troubleshoot these kinds of issues you could check that the video stream is available on your host with gstreamer. Run the following command when the PX4 SITL simulation is running (this is the same launch string that is used by the `gscam` service):
gst-launch-1.0 udpsrc port=5600 caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264" ! rtph264depay ! avdec_h264 ! videoconvert ! autovideosink
You should be seeing the simulated video feed.
Some commands to help you verify the video stream is being published to ROS by gscam:
ros2 topic list | grep camera
ros2 topic echo /camera/image_raw
ros2 topic echo /camera/camera_info
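If you prefer to check this programmatically, a small rclpy sketch along these lines should also work - this is not part of GISNav, and the topic name `/camera/image_raw` is simply taken from the commands above:

```python
# Subscribe to the camera topic for ~5 seconds and report the observed frame rate.
import time

import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Image


class FrameRateCheck(Node):
    def __init__(self):
        super().__init__("frame_rate_check")
        self.count = 0
        self.create_subscription(Image, "/camera/image_raw", self._on_image, 10)

    def _on_image(self, msg: Image) -> None:
        self.count += 1


def main():
    rclpy.init()
    node = FrameRateCheck()
    start = time.time()
    while time.time() - start < 5.0:
        rclpy.spin_once(node, timeout_sec=0.1)
    elapsed = time.time() - start
    node.get_logger().info(
        f"Received {node.count} frames in {elapsed:.1f} s (~{node.count / elapsed:.1f} Hz)"
    )
    node.destroy_node()
    rclpy.shutdown()


if __name__ == "__main__":
    main()
```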
If you feel like some GUIs or windows are not appearing, you might also want to check that your X server is exposed to your containers. You can do that with:
cd docker
make expose-xhost
Or if you are feeling dangerous, you can also just do
xhost +
After running:
gst-launch-1.0 udpsrc port=5600 caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264" ! rtph264depay ! avdec_h264 ! videoconvert ! autovideosink
I can see the video feed.

So to resolve this for now, should I update the QGC service to use Ubuntu Jammy instead of Focal? How should I do this?
I opened another MR to solve this issue here: https://github.com/hmakelin/gisnav/pull/94
To view the video stream in QGC one would just need to configure QGC to use port 5601 instead of 5600. I will next see if I can do that by default in the Docker container.
Sorry but I'm still a little confused, does the bug affect the gisnav demo? I successfully started the containers with "docker compose up....", should I run the demo using "make -C docker demo"?
This bug should not affect the demo, only the visibility of the video feed in the QGC UI. `make -C docker demo` should start all the required Docker Compose services for the demo. I've recently had some issues with ROS shared memory not working with Docker containers so you might also want to check this out if that's also the case for you: https://www.gisnav.org/pages/developer_guide/offboard/troubleshooting.html#disable-sharedmemory-for-fast-dds
Does `sudo docker compose up mapserver px4 micro-ros-agent qgc gisnav` also start the demo? I notice `make -C docker demo` is composing everything again.
After running `sudo docker compose up mapserver px4 micro-ros-agent qgc gisnav` and running `failure gps off` in the MAVLink console, the drone started descending and I got this error from gisnav:
gisnav-1 | [gis_node-1] [WARN] [1706753676.873355326] [gisnav.gis_node]: Unexpected input argument types for _vehicle_geopose: nav_sat_fix (expected <class 'sensor_msgs.msg._nav_sat_fix.NavSatFix'>, got <class 'NoneType'>), pose_stamped (expected <class 'geometry_msgs.msg._pose_stamped.PoseStamped'>, got <class 'NoneType'>)
gisnav-1 | [gis_node-1] [WARN] [1706753676.874889760] [gisnav.gis_node]: Unexpected input argument types for _camera_quaternion: geopose (expected <class 'geographic_msgs.msg._geo_pose_stamped.GeoPoseStamped'>, got <class 'NoneType'>)
gisnav-1 | [gis_node-1] [WARN] [1706753676.875957447] [gisnav.gis_node]: Unexpected input argument types for _fov_and_principal_point_on_ground_plane: camera_quaternion (expected <class 'geometry_msgs.msg._quaternion.Quaternion'>, got <class 'NoneType'>), vehicle_pose (expected <class 'geometry_msgs.msg._pose_stamped.PoseStamped'>, got <class 'NoneType'>), camera_info (expected <class 'sensor_msgs.msg._camera_info.CameraInfo'>, got <class 'NoneType'>)
gisnav-1 | [gis_node-1] [WARN] [1706753676.877023513] [gisnav.gis_node]: Unexpected input argument types for _orthoimage_size: camera_info (expected <class 'sensor_msgs.msg._camera_info.CameraInfo'>, got <class 'NoneType'>)
gisnav-1 | [gis_node-1] [WARN] [1706753676.878332920] [gisnav.gis_node]: Unexpected input argument types for _request_orthoimage_for_bounding_box: bounding_box (expected <class 'geographic_msgs.msg._bounding_box.BoundingBox'>, got <class 'NoneType'>), size (expected typing.Tuple[int, int], got <class 'NoneType'>)
I have managed to successfully run `make -C docker demo`. But now I get this error and the UAV lands right after I enter `failure gps off` in the MAVLink console:
failure gps off
WARN [failure] inject failure unit: gps (4), type: off (1), instance: 0
WARN [simulator_mavlink] CMD_INJECT_FAILURE, GPS off
pxh> WARN [mc_pos_control] invalid setpoints
WARN [mc_pos_control] Failsafe: blind land
WARN [failsafe] Failsafe activated
INFO [gimbal] Configured primary gimbal control sysid/compid from 1/1 to 0/0
INFO [tone_alarm] battery warning (fast)
ERROR [timesync] Time jump detected. Resetting time synchroniser.
and the terminal is outputting:
gisnav-1 | [gis_node-1] [INFO] [1707028616.316356544] [gisnav.gis_node]: Sending GetMap request for bbox: BBox(left=-122.25607107767837, bottom=37.52134352310396, right=-122.25007690232293, top=37.52611766351677), layers: ['osm-buildings-dem'].
gisnav-1 | [cv_node-2] [WARN] [1707028616.911203359] [gisnav.cv_node]: Could not estimate pose, status code 503
gisnav-1 | [cv_node-2] [WARN] [1707028616.912565405] [gisnav.cv_node]: Unexpected input argument types for _post_process_pose: pose (expected typing.Tuple[numpy.ndarray, numpy.ndarray], got <class 'NoneType'>)
gisnav-1 | [cv_node-2] [WARN] [1707028616.913595017] [gisnav.cv_node]: Could not complete post-processing for pose estimation
gisnav-1 | [cv_node-2] [WARN] [1707028617.598098401] [gisnav.cv_node]: Could not estimate pose, status code 503
Also, the RViz window, which I assume should be the window that displays the orthoimage, is just a black stage with a grid.
Nice to hear you got the simulation running!
Unfortunately I currently do not have a solution for the failsafes triggering when you turn off the simulated GPS. Some time after I made the demo video the PX4 EKF was updated so that it no longer worked with the old mock GPS messages - I suspect it's because the EKF now requires a velocity estimate here https://github.com/PX4/PX4-Autopilot/blame/main/src/modules/ekf2/EKF2.cpp#L2375-L2382 which GISNav's MockGPSNode does not compute. I've thought about adding a filter to compute the velocity estimate and variances from the position estimates based on some simple state model in GISNav but no concrete plans yet since these are downstream of what I'm now working on (I'm currently working on setting up Docker bridge networks to isolate related services so that they would not have to share the same host network to have a bit better security and modularity).
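For reference, the kind of computation I have in mind is roughly the following - a rough sketch only, not something GISNav implements yet. The dataclass, the function name and the flat-earth approximation are all just illustrative:

```python
# Finite-difference a NED velocity estimate from two consecutive position
# estimates, as a candidate input for the outgoing mock GPS message.
import math
from dataclasses import dataclass


@dataclass
class Fix:
    stamp: float  # seconds
    lat: float    # degrees
    lon: float    # degrees
    alt: float    # meters AMSL


def ned_velocity(prev: Fix, curr: Fix) -> tuple[float, float, float]:
    """NED velocity in m/s between two position estimates, using a local
    flat-earth approximation (fine for short time steps)."""
    dt = curr.stamp - prev.stamp
    if dt <= 0.0:
        raise ValueError("non-increasing timestamps")

    earth_radius = 6371000.0  # meters
    d_lat = math.radians(curr.lat - prev.lat)
    d_lon = math.radians(curr.lon - prev.lon)
    mean_lat = math.radians(0.5 * (curr.lat + prev.lat))

    vel_n = d_lat * earth_radius / dt
    vel_e = d_lon * earth_radius * math.cos(mean_lat) / dt
    vel_d = -(curr.alt - prev.alt) / dt  # NED: down is positive
    return vel_n, vel_e, vel_d
```

In practice the raw differences would be noisy, so some smoothing and a matching variance estimate would still be needed - which is the part that requires a proper state model.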
On some of the other issues you mentioned:
* For v0.65 the RViz window should display the tf2 transformations tree. There is also an RVizNode that was used earlier to publish the drone flight path in RViz but it will not work for v0.65.
* Those warnings about unexpected input argument types you mentioned earlier are expected and you will probably get a lot of them - they are generated by the runtime type narrowing decorator https://www.gisnav.org/pages/api_documentation/private/decorators.html#gisnav._decorators.narrow_types (see the simplified sketch after this list).
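To illustrate what that decorator is doing, here is a simplified sketch - not the actual `gisnav._decorators.narrow_types` implementation linked above. If an annotated argument is still `None` because an upstream message has not arrived yet, the wrapped method logs exactly that kind of warning and skips the computation for that cycle instead of raising:

```python
# Simplified runtime type-narrowing decorator for methods of a ROS node.
import functools
import inspect
import typing


def narrow_types(method):
    hints = typing.get_type_hints(method)
    signature = inspect.signature(method)

    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        arguments = signature.bind(self, *args, **kwargs).arguments
        mismatches = [
            f"{name} (expected {hints[name]}, got {type(value)})"
            for name, value in arguments.items()
            if name in hints
            and isinstance(hints[name], type)  # this sketch skips typing generics
            and not isinstance(value, hints[name])
        ]
        if mismatches:
            self.get_logger().warn(
                f"Unexpected input argument types for {method.__name__}: "
                + ", ".join(mismatches)
            )
            return None  # skip the computation until all inputs are available
        return method(self, *args, **kwargs)

    return wrapper
```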
I see, so if I run v0.65, should the demo be working? What is your suggestion if I want a working demo to play with?
The solution is probably a combination of modifying the PX4 EKF2 related parameters to try to increase tolerances (responsibly and in moderation) to different kinds of variation and delays in the GPS signal, and updating what goes in the MockGPSNode outgoing message. The latter I think requires more thought and will ultimately depend on what version of PX4 you will be targeting in the future. I am unfortunately not working on this yet since I expect there to be multiple releases of PX4 before GISNav is mature enough to spend more development effort on the integration.
I see, but since the demo used to work, is it possible to roll back to an older version of PX4 to fix this issue for now?
I created https://github.com/hmakelin/gisnav/issues/109 to track this issue since it is not related to installation. Please see my response there.
Release 0.67.0 should fix these installation issues and improve the situation with the failsafes triggering in SITL simulation. So I am closing this issue but of course feel free to open a new issue if similar issues appear again. I also recommend rebuilding any Docker images you might have. Thank you for reporting!
~/ws_sensor_combined/src$ rosdep install --from-paths . -y --ignore-src
ERROR: Rosdep experienced an error: Multiple packages found with the same name "gisnav_msgs":
rosdep version: 0.22.2
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/rosdep2/main.py", line 146, in rosdep_main
    exit_code = _rosdep_main(args)
  File "/usr/lib/python3/dist-packages/rosdep2/main.py", line 446, in _rosdep_main
    return _package_args_handler(command, parser, options, args)
  File "/usr/lib/python3/dist-packages/rosdep2/main.py", line 501, in _package_args_handler
    pkgs = find_catkin_packages_in(path, options.verbose)
  File "/usr/lib/python3/dist-packages/rosdep2/catkin_packages.py", line 35, in find_catkin_packages_in
    packages = find_packages(path)
  File "/usr/lib/python3/dist-packages/catkin_pkg/packages.py", line 103, in find_packages
    raise RuntimeError('\n'.join(duplicates))
RuntimeError: Multiple packages found with the same name "gisnav_msgs":