ros-navigation / navigation2

ROS 2 Navigation Framework and System
https://nav2.org/

Rolling / Docker release #2648

Open · Timple opened this issue 2 years ago

Timple commented 2 years ago

Feature request

Feature description

Bloom-release the master branch to Rolling, since master is now targeted towards Rolling.
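
For reference, the release itself would just be the usual bloom invocation, something along these lines (the track name is an assumption):

# hedged sketch of the usual bloom-release invocation; the track name is assumed
bloom-release --rosdistro rolling --track rolling navigation2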

Implementation considerations

For people targeting Rolling but having no interest (yet) in compiling Nav2 from source, it would save a lot of compilation effort in workspaces and CI if one could apt install ros-rolling-nav2-etc. Of course, one can still build from source by cloning the repository into their workspace.
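
For illustration, assuming the usual ros-<distro>-<package> Debian naming (the exact package names here are hypothetical), that would be something like:

# hypothetical package names, following the usual ros-<distro>-<package> scheme
sudo apt install ros-rolling-navigation2 ros-rolling-nav2-bringup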

Additional considerations

Perhaps this could even be an automated process.

SteveMacenski commented 2 years ago

Historically, I've been against doing this since it adds some additional administrative overhead, but I think I need to take some time and reassess that position since a lot of our process has changed since then.

I wanted to avoid email dumps on build failures for small API changes requiring me to aggressively fix an issue and push an update. Since Nav2 has about two dozen packages, I'd be getting hundreds of emails over a week whenever an API-breaking change was made. Obviously we'd fix it, but it puts that burden 100% onto me, and very immediately (interrupting whatever I was doing). The build farm notification system for build failures makes sense if I mess up in an established distro that I shouldn't expect to randomly start failing, but for Rolling, where changes are expected, it's too aggressive.

I can certainly appreciate the user convenience that would provide, but I'm also trying to balance maintenance with new development with such a small core contribution team.

Timple commented 2 years ago

Well, those are some valid points.

I'd be getting hundreds of emails over a week when a API breaking change was made.

Won't you get lots of GitHub issues anyhow if master is broken in this case? After all, the policy now states that master targets Rolling.

I can certainly appreciate the user convenience that would provide, but I'm also trying to balance maintenance with new development with such a small core contribution team.

I'll leave that decision up to you 🙂

SteveMacenski commented 2 years ago

Won't you get lots of github-issues anyhow since the master is broken in this case?

Far fewer than 24+ a day! And usually someone will actually just make the quick fix and submit a PR, but only after apt updating, so it's less immediate. I don't mind a notification of the error to fix, but I want a single-digit number of notifications, not my entire inbox flooded every few hours.

@nuclearsandwich: I'm curious, would a change to Rolling failure notifications be possible? Rolling is unique among distributions in that notifications about failures due to upstream packages should be less liberal, since we expect failures and expect them to potentially happen often. Perhaps one email per repo (e.g. 1 rosdistro entry with N packages) per day? If the notifications were more reasonable, I'd gladly release to Rolling, as I have done with other mono-repos like robot localization and slam toolbox. For my sanity, I can't wake up to 30 emails in my inbox on each failure due to a namespace change or something silly.

nuclearsandwich commented 2 years ago

@nuclearsandwich: I'm curious, would a change to Rolling failure notifications be possible? Rolling is unique among distributions where notifications about failures due to upstream packages should be less liberal since we expect failures and them to potentially happen often. Perhaps an email regarding a repo (e.g. 1 rosdistro entry with N packages) a day? If the notifications were more reasonable, I'd gladly release to Rolling as I have done with other mono-repos like robot localization and slam toolbox. For my sanity, I can't wake up to 30 emails in my inbox on each failure due to a namespace change or something silly.

As someone who gets emails (pre-sorted into labels and largely not in my inbox) for all build farm failures across all distributions, I completely agree with your point: the deluge is not that helpful. However, the build farm emails come directly from Jenkins, which doesn't support batching or digesting via the existing email plugins I know of. I think you could add source entries for navigation2 without doing a full bloom into Rolling. This would give you a single CI job for navigation2 based on ROS 2 Rolling binaries for its dependencies, and thus one email per build failure. However, it would mean there would be no binary packages in Rolling for navigation2, which wouldn't necessarily meet @Timple's request.

Another option that would work today would be to release into Rolling and then consign any build farm emails referencing Rbin jobs to /dev/null via filter. But that's less than ideal. I've just been grepping through the ros_buildfarm source and conferring with @cottsay, and it does not seem like disabling email notifications for binary jobs on a per-repository basis is currently possible, but it's something we may add based on feedback like this. However, I can't commit to an exact timeline for when we'd be able to implement that and deploy it on the build farm. But the feedback is definitely heard and it's something we'll look at. I do think it's likely to be an all-or-nothing setting, either per-package or per-repository, rather than being able to digest emails without resorting to a custom mailer implementation.

SteveMacenski commented 2 years ago

However, it would mean that there would be no binary packages in Rolling for navigation2

Yeah, that wouldn't help us; we already have CI set up on Rolling, with some heavy caching thanks to @ruffsl, without the ROS build farm. I think @Timple really wants easy-access binaries.

Another option that would work today would be to release into Rolling and then consign any build farm emails referencing Rbin jobs to /dev/null via filter. But that's less than ideal.

Can you expand on that a little more? Where would we set up this filter? It sounds like this may be the best middle-road solution: not really ideal, but it would meet everyone's needs. I could add a field for Rolling to the Nav2 readme's table of build statuses, so that even without emails there would be a place to see it (and fewer emails from people, I assume, than I'd otherwise get if I just didn't notice a failure).

I CCed @nuclearsandwich on whether it would be possible, but it doesn't seem like it is right now. If I were to generalize the request a bit, in case you were going to spend some time and actually build something new for this, it would be:

Then that could be made default ON for Rolling and OFF for actual releases.

If there are failures in Rolling, it's "more OK" than in other situations, since that's kind of the point of Rolling. I think the notification / seriousness of it should be diluted a bit. Or hell, I'd also take the following instead:

Timple commented 2 years ago

emails referencing Rbin jobs to /dev/null via filter

Sounds like a personal email rule which simply trashes the rolling nav2 jobs. That, together with this:

a field for Rolling in the Nav2 readme table of build statuses

would be kind of similar to the current status: upon a build failure, @SteveMacenski won't see any automated mails, but people who notice can put in a GitHub issue or even a PR with a fix.

nuclearsandwich commented 2 years ago

SteveMacenski: Can you expand on that a little more? Where would we setup this filter?

Timple: Sounds like a personal email rule which simply trashes the rolling nav2 jobs. That, together with this:

Yeah, I could help provide filter templates for Google Mail, procmail, or Sieve, which would allow the navigation maintainers to ignore received messages regarding navigation2 packages or any Rolling binary jobs while still receiving other build farm emails. But they would have to be applied and maintained by individual maintainers.

If I were to better generalize the request if you were going to spend some time and actually build something new for this.

Personally, email administration and management is one of my more perverse and arcane hobbies, so building out a more robust, application-specific buildfarm mailer is tempting, but I don't think it's something I can spend Open Robotics time on, and I won't make any public commitments using my own personal time.

What is readily achievable within the current ros_buildfarm Jenkins automation is a config flag, in either the distribution.yaml file in ros/rosdistro or the release-build.yaml file in ros2/ros_buildfarm_config, that overrides the default maintainer_emails on/off setting for each release-build file. The end result would be that personal email filters would not be required for maintainers who don't wish to get Rolling binary failure messages, but it would still be an all-or-nothing option rather than a bulk or digest option.

SteveMacenski commented 2 years ago

Ah ok. I suppose that works as well! Totally understand.

So what's the route forward here? Should I release and then get some rule or how do you want to play this?

nuclearsandwich commented 2 years ago

Should I release and then get some rule or how do you want to play this?

Well, I actually sat down to export one of my existing Gmail filters and write a script that generates a search query for a specific set of packages. Due to query length limits in Gmail, and the fact that partial words are not searchable on that platform, it essentially takes one filter per individual package to capture the jobs for each platform.

Below is a quick script that generates a subject query matching each job name that would need to be filtered, for all packages found under the current working directory (assuming that each package.xml's dirname == the package name).

If I paste the generated string into Gmail for more than one package at a time, it gives me an error.

I can rewrite the script so that it generates a full Gmail mailFilters.xml file with one filter for each package, but before I get all fancy I want to (1) verify that, at least for @SteveMacenski, Gmail is the correct target mail platform, and (2) see how easy it is to suppress maintainer emails from being sent rather than just dropping them on receipt.

#!/usr/bin/env bash
# Generate a Gmail subject search query that matches the Rolling build farm
# job names for every package found under the current working directory.

# Rolling job name templates; PACKAGE is substituted with each package name below.
JOBS=(
    Rbin_rhel_el864__PACKAGE__rhel_8_x86_64__binary
    Rbin_uF64__PACKAGE__ubuntu_focal_amd64__binary
    Rbin_ufv8_uFv8__PACKAGE__ubuntu_focal_arm64__binary
    Rsrc_el8__PACKAGE__rhel_8__source
    Rsrc_uF__PACKAGE__ubuntu_focal__source
)

# Package names, taken from the directory containing each package.xml
# (assumes package.xml dirname == package name).
PKGS=($(find . -name package.xml | awk -F/ '{ print $(NF - 1) }'))

keywords=()

# Expand every job name template for every package.
for pkg in "${PKGS[@]}"; do
    for job in "${JOBS[@]}"; do
        keywords+=("$(printf '%s' "$job" | sed "s:PACKAGE:$pkg:")")
    done
done

# Print a single Gmail query covering all of the job names.
echo "subject:(${keywords[@]})"
SteveMacenski commented 2 years ago

Hi,

  1. GMail is the right platform
  2. Is there anything I can do to help with that?

wep21 commented 2 years ago

@nuclearsandwich @SteveMacenski Any updates on this?

nuclearsandwich commented 2 years ago

I haven't been back to this since Steve's confirmation. I'll provide an updated script as it's not trivial to make this alteration in the build farm configurations.

nuclearsandwich commented 2 years ago

Here's a gist containing a script that generates the full XML format for exported Gmail filters, as well as a sample of the generated XML from my local (and several-months-out-of-date) clone of navigation2.

https://gist.github.com/nuclearsandwich/174a442ccbae8e066af4e05ce1b26138

wep21 commented 2 years ago

@SteveMacenski Do you have any time to handle the rolling release?

SteveMacenski commented 2 years ago

It's on my queue to look into in March, but we're also still waiting for the Rolling -> 22.04 transition to be fully completed so we're not battling multiple issues at once. Right now, Nav2's CI is down for the same reason.

wep21 commented 2 years ago

@SteveMacenski I guess ros-rolling-ompl will be available in the next sync: https://discourse.ros.org/t/preparing-for-rolling-sync-2022-03-03/24521. I will also try to build Nav2 on Ubuntu 22.04 and fix CI.

ruffsl commented 2 years ago

The Docker images for ros:rolling-ros-base-jammy have finally been pushed (there were some upstream issues with rospkg and some unrelated seccomp bumps that slowed things down), so we have the base images ready for building CI images:

https://github.com/docker-library/official-images/pull/11917

I've tried building Nav2 locally by commenting out some unreleased leaf dependencies and building ompl from source in our underlay, but encountered another snag with ament:

https://github.com/ompl/ompl/issues/883
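
For anyone following along, the underlay part of that is roughly the following (workspace paths are assumptions, and the commented-out leaf dependencies are omitted here):

# rough sketch of building ompl from source in an underlay and then building
# Nav2 on top of it; workspace paths are assumptions, not a working recipe
mkdir -p ~/underlay_ws/src && cd ~/underlay_ws/src
git clone https://github.com/ompl/ompl.git
cd ~/underlay_ws && colcon build
source ~/underlay_ws/install/setup.bash
cd ~/nav2_ws && colcon build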

As for Gazebo, it looks like we should try migrating to Ignition as well, given that Gazebo v11 doesn't currently target Jammy:

https://github.com/ignitionrobotics/ros_ign/issues/219#issuecomment-1053946152

For WIP, see:

https://github.com/ros-planning/navigation2/pull/2838

nuclearsandwich commented 2 years ago

As for gazebo, looks like we should try migrating to ignition as well, given Gazebo v11 doesn't currently target Jammy:

My colleagues will glare daggers at me if I discourage a migration to Ignition, so I'm definitely encouraging a migration to Ignition Gazebo as the way forward, but I'll point out that Jammy is still missing Ignition Gazebo as well (although we should have the rest of the packages imported soon :tm:). ROS 2 will, however, be able to use Gazebo 11 from the Ubuntu repositories; https://github.com/ros/rosdistro/pull/31560 is an update to the rosdep keys which should enable this. Note that Gazebo 11 on Ubuntu Jammy isn't an Open Robotics project and is supported by the Debian and Ubuntu communities.

SteveMacenski commented 2 years ago

We're also a bit blocked on a Rolling release by some other issues tangential to our CI build issues with Rolling. This is blocked for the immediate future until we're able to build/test on current Rolling in 22.04.

Flova commented 2 years ago

Ping @SteveMacenski

I heard the CI issues have been resolved for some time now, e.g. over at geometry2.

SteveMacenski commented 2 years ago

This is true; we're still working through some 22.04-related problems though, described in https://discourse.ros.org/t/nav2-issues-with-humble-binaries-due-to-fast-dds-rmw-regression/26128. We're still not at a point where it would be wise to release to Rolling.

This is a ticket I frequently revisit; it is definitely not forgotten :wink:

Flova commented 2 years ago

Nice :+1:

I just gave the linked threads a read, and it helped me find a bug in our code where most message callbacks of a specific node didn't happen! The node was pretty standard, with the default single-threaded executor and so on. I have been on this for a few weeks, and it turns out it is most likely not our fault, as switching to cyclone_dds magically fixed it. The issue is that we cannot use Cyclone because of performance issues with the default executor, so we use an executor from iRobot, and that seems to depend on fastrtps...

Sadly, I am quite frustrated with ROS 2 :(, as we are experiencing severe performance degradation, a lot of boilerplate code, and the overall feeling of it being unfinished (e.g. no callback groups for Python, which messes with sim time and tf quite a bit).

BUT I am looking forward to Nav2, as it seems to be a big improvement compared to move_base. :)

alanxuefei commented 1 year ago

Have any binary versions of Nav2 for Rolling been released yet? According to the discussion history, several issues have blocked the Nav2 Rolling release in the past year. Does that mean the only way to try Nav2 on Rolling is to build it from source?

SteveMacenski commented 1 year ago

That is currently correct

tonynajjar commented 2 months ago

Time for the yearly update on this issue 🙂 What is still blocking us from having rolling binaries?

SteveMacenski commented 2 months ago

Sigh.

To be honest, between the 24.04 migration and the flood of emails I'm getting from all the failures and issues, this seems like a dead end to me if I had my way - Nav2 has too many dependencies

...

But. Sigh.

Let me run a poll on LinkedIn and/or ROS Discourse about who would use it (with a not-so-hidden bias towards "please say no"). Now that I'm running Open Navigation, I suppose I'm willing to be bullied by popular demand if it's not just a really niche group that wants it.

I am somewhat concerned about what happens when we introduce temporary dependencies that aren't available in binary form (i.e. ones we use a repos file to pull in for our CI), since we can't then release Nav2 in binary form with those packages. That could block any further Rolling releases for months while that gets sorted. That's perhaps a problem for another time and I'm making excuses, but it's a workflow that needs to be considered. I'm not sure if we can have the build farm just ignore one package (but I suspect there's a way).
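
For context, the repos-file workflow in question is roughly the following (the file name and repository entry are made up for illustration):

# illustrative only: a temporary source-only dependency is listed in a .repos
# file (the file name and entry below are hypothetical), e.g.
#
#   repositories:
#     some_unreleased_dep:
#       type: git
#       url: https://github.com/example/some_unreleased_dep.git
#       version: main
#
# and pulled into the CI workspace with vcstool before building:
vcs import src < unreleased_deps.repos
colcon build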

tonynajjar commented 2 months ago

For me it's about quickly spinning up a Rolling environment where I can test stuff on Nav2's main branch. Rolling binaries would help but are not a must - they would just speed up the devcontainer process if we didn't have to build the Nav2 packages.

SteveMacenski commented 2 months ago

Docker? That is on my radar as something I want us to do this year: have a hosted Docker image that is intended to be used by other people, with a guide attached to it.

tonynajjar commented 2 months ago

Yep, a Docker image that's built every time a PR is merged would be the best, imo. I do see that something in this direction exists, but I'm not quite sure what. Do you have an issue created for what you have in mind, so I can avoid hijacking this issue? Rolling binaries might still be wanted for other reasons.

SteveMacenski commented 2 months ago

No, I didn't make an issue for it yet. I just renamed this ticket; I generally try not to mix conversations, but this is all kind of a related blend.

I think doing a nightly Docker build would be better than building every time we merge a PR; we could be rebuilding 10x a day on some days. Maybe for established distributions, have those rebuilt as part of the release process (something I can kick off).

@ruffsl @doisyg how do you feel about this? We used to have pretty good Docker containers back before DockerHub changed everything. Worth doing this again using our newer set? I think keeping all the caching and whatnot for our CI is good, but having a simpler base container for release which doesn't need so much complexity would be nice (and it's probably better, if users are deriving from it, that it doesn't do anything too-too fancy).

Or, I mean, hell, we could have the distribution branches be nightly rebuilds too, where all they do is pull in Nav2's binaries? I suppose there's no real reason to build from source...
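
Roughly something like this for an established distro like Humble (the image tag and package names are assumptions, not decisions):

# rough sketch of the "just pull in the binaries" idea; the image tag and
# package names are assumptions, not decisions
docker run -it --rm ros:humble-ros-base
# then, inside the container:
apt-get update
apt-get install -y ros-humble-navigation2 ros-humble-nav2-bringup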

Timple commented 2 months ago

The issue I see with Docker images is that every one of them tries to be the parent image:

Here, we provide an Nvidia Docker base image, just build on top of this!

Here, an awesome Vulcanexus Docker image for your convenience!

Simply use these balena Docker images as a base for easy deployments!

And here are Nav2 Docker images!

How do you combine any of these? As far as I can tell, that works better with an apt-get.

(The quotes are not literal.)

SteveMacenski commented 2 months ago

I think in that case it's a matter of taking the container files and modifying the base to use what you like - you're right that having precompiled images doesn't help much after we hit a certain point of complexity and generality. For any one application it can be managed, but for the breadth of all the things anyone might want to do with it, that's not setting realistic expectations. But the binary distribution method isn't without its own faults either. I don't think any of these are silver bullets for everyone, all the time.

I'm usually a perfectionist, but on distribution I understand many people are going to compile and distribute in their own niche ways, so I try not to over-optimize for it. But I do want to offer good and reasonable options, knowing what the common workflows are, within reason. Offering an approachable Dockerfile and image for each ROS distribution seems easy enough - especially if we're just basing it on a ROS Dockerfile and then installing binaries. Honestly, it's so simple, but it's an ask I get a couple of times a year.

Timple commented 2 months ago

Yes, I agree a lot of folks will already have something in place. That's why I would go with what is in line with other distros: binaries. Doing the same means minimal changes for all systems.

Of course additional dockers don't interfere...

Sigh. To be honest, the 24.04 migration and the flood of emails

I completely get this! Rather no binaries than a Steve who doesn't develop anymore due to email floods 🙂

Perhaps someone will step up. Or redirect the Rolling emails to /dev/null.

doisyg commented 2 months ago

@ruffsl @doisyg how do you feel about this?

Following and trusting Ruffin on that one. If Ruffin needs/wants to contribute and help through Dexory, I am all for it