Closed: christophfroehlich closed this pull request 8 months ago
I'm not sure calling them unstable is the right word. How about rolling compatibility on iron/humble? If needed, we can also add an unsupported keyword to the naming. What do you think?
I was looking for a term not used in the ROS ecosystem (like testing), but unstable is typical in the Debian world. I'm open to other names, but for sure it will be unstable :D
@saikishor any idea why building controller_manager fails here? https://github.com/ros-controls/ros2_control_ci/actions/runs/8133285062/job/22224687946?pr=30
This happened to me earlier on my machine; I believe the cause is an installed version of ros2_control_test_assets. I solved it on my local machine by uninstalling the installed version.
Ah, you are right. It is installed with gazebo_ros2_control (see question in initial post)
executing command [apt-get install -y -qq ros-humble-gazebo-ros2-control]
Setting up ros-humble-ros2-control-test-assets (2.39.1-1jammy.20240217.030243) ...
Setting up ros-humble-control-msgs (4.4.0-1jammy.20240217.061208) ...
Setting up ros-humble-realtime-tools (2.5.0-1jammy.20240217.075900) ...
Setting up ros-humble-controller-manager-msgs (2.39.1-1jammy.20240217.051450) ...
Setting up ros-humble-hardware-interface (2.39.1-1jammy.20240217.081024) ...
Setting up ros-humble-controller-interface (2.39.1-1jammy.20240217.081539) ...
Setting up ros-humble-controller-manager (2.39.1-1jammy.20240217.082030) ...
Setting up ros-humble-gazebo-ros2-control (0.4.6-1jammy.20240217.083418) ...
But do we have an issue with overriding the binary version?
And at least one test fails on iron :(
[ RUN ] TestLoadController.spawner_test_type_in_param
...
terminate called after throwing an instance of 'std::runtime_error'
what(): Can not get command interface configuration until the controller is configured.
> Ah, you are right. It is installed with gazebo_ros2_control (see question in initial post)
If we use the upstream ros2_control, it should ideally work directly; we need to test it. In the worst case, we can skip this as you suggested, since we don't have to add compatibility with it.
> But do we have an issue with overriding the binary version?
Yes, I believe we might have an issue with overriding the binary versions. I think it is better to do it without any installed versions.
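If a similar fix is ever needed in the workflow itself, a cleanup step along these lines could be a starting point. This is only a sketch: the step name and placement are assumptions, whether it would have to run inside the industrial_ci container instead is an open question, and only the package name comes from the apt log above.

```yaml
# Hypothetical pre-build step: remove the binary test assets package that
# gets pulled in with ros-humble-gazebo-ros2-control, so that only the
# source-built ros2_control_test_assets from the workspace is found.
- name: Remove pre-installed ros2_control_test_assets (sketch)
  shell: bash
  run: sudo apt-get remove -y ros-humble-ros2-control-test-assets
```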
> And at least one test fails on iron :(
>
> [ RUN ] TestLoadController.spawner_test_type_in_param ... terminate called after throwing an instance of 'std::runtime_error' what(): Can not get command interface configuration until the controller is configured.
That's weird, it should work. I can take a look later.
@christophfroehlich Can't we use reusable builds in this case for Iron and Humble?
Sorry, what do you mean? Creating workflows in all our repos to build it there? Sure, but we didn't want it to run on PRs or pushes, so why not make it centralized here?
> Sorry, what do you mean? Creating workflows in all our repos to build it there? Sure, but we didn't want it to run on PRs or pushes, so why not make it centralized here?
Sure!
BTW, I've tested building the upstream gazebo_ros2_control version in the workspace; it seems to work directly without any modifications.
OK, but now the humble workflow also fails because of the binary installation of the test assets. I'll have a look in the coming days at how I could fix this.
Sure. Thank you
@saikishor humble tests are failing because of this incompatibility https://github.com/ros-controls/ros2_control/pull/913#issuecomment-1404210606. Can one define conditionals for Python imports as well?
> @saikishor humble tests are failing because of this incompatibility https://github.com/ros-controls/ros2_control/pull/913#issuecomment-1404210606. Can one define conditionals for Python imports as well?
I'm sorry about that. I didn't run the tests locally and didn't think about the Python part. You are right. If I find some time, I'll get to it.
The humble jobs fail because of gz_ros2_control, but this is not included anywhere. I think this is imported from the cache, because the normal stack build and this job share the cache? I'm not sure about the `github.run_id`, but the rest is the same: `target_ws-humble--testing-reusable_industrial_ci_with_cache--8256893443`
https://github.com/ros-controls/ros2_control_ci/blob/7c655929b80445962013564f954b20b2bebdfc3c/.github/workflows/reusable-industrial-ci-with-cache.yml#L67
https://github.com/ros-controls/ros2_control_ci/blob/7c655929b80445962013564f954b20b2bebdfc3c/.github/workflows/reusable-industrial-ci-with-cache.yml#L82
I don't want to test gz/ign_ros2_control on humble here (see the errors we have with the normal stack build). @fmauch, any suggestion how to fix this?
It is weird that it is caching the gz_ros2_control package and running its tests; strangely, it is hard to reproduce locally on a clean humble Docker image. It would be nice to figure it out and fix it before merging this.
> The humble jobs fail because of gz_ros2_control, but this is not included anywhere. I think this is imported from the cache, because the normal stack build and this job share the cache? I'm not sure about the `github.run_id`, but the rest is the same: `target_ws-humble--testing-reusable_industrial_ci_with_cache--8256893443`
> I don't want to test gz/ign_ros2_control on humble here (see the errors we have with the normal stack build). @fmauch, any suggestion how to fix this?
Yes, that's definitely a problem. The target_ws is restored by the key `target_ws-${{ env.CACHE_PREFIX }}-${{ hashFiles('**/CMakeLists.txt', '**/package.xml') }}`, with `${{ env.CACHE_PREFIX }}` being `${{ inputs.ros_distro }}-${{ inputs.upstream_workspace }}-${{ inputs.ros_repo }}-${{ github.job }}`, resulting in `target_ws-humble--main-reusable_industrial_ci_with_cache--8261042993`.

Given the `--`, the `github.job` seems to be empty, so as soon as another cache was built without touching a `CMakeLists.txt` or `package.xml`, that one will be reused. I don't quite understand why it is empty, though; I would expect it to be `stack-build-on-humble`, since that's also the name shown in the GitHub UI.

Edit: Re-reading the documentation makes that clear:

> The job_id of the current job. Note: This context property is set by the Actions runner, and is only available within the execution steps of a job. Otherwise, the value of this property will be null.

Since we're not inside a step when defining the prefix, it is empty.

Edit2: Thinking about this, that might be a larger problem. It means that an action run from a PR might re-use the cache from the master branch (as long as neither a `CMakeLists.txt` nor a `package.xml` is touched). I think `github.ref` should go in there, too.
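For orientation, a simplified sketch of how that key is assembled in the reusable workflow. This is a paraphrase of the two permalinks above, not a verbatim copy: the cache path and action version are illustrative, and the real key appends further parts (such as the run id seen in the key above) that are omitted here.

```yaml
# Simplified sketch of reusable-industrial-ci-with-cache.yml (paraphrased).
env:
  # CACHE_PREFIX is assembled from the reusable workflow's inputs and the github context.
  CACHE_PREFIX: ${{ inputs.ros_distro }}-${{ inputs.upstream_workspace }}-${{ inputs.ros_repo }}-${{ github.job }}

jobs:
  reusable_industrial_ci_with_cache:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/cache@v3
        with:
          # Illustrative path; the real workflow caches the industrial_ci target workspace.
          path: ${{ github.workspace }}/.work/target_ws
          key: target_ws-${{ env.CACHE_PREFIX }}-${{ hashFiles('**/CMakeLists.txt', '**/package.xml') }}
          # A prefix-only restore key is what lets an older cache be reused
          # when no CMakeLists.txt or package.xml changed.
          restore-keys: |
            target_ws-${{ env.CACHE_PREFIX }}-
```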
No, I think the `github.job` is `reusable_industrial_ci_with_cache`: it takes the job from the reusable workflow, not that from the calling workflow.
> Edit2: Thinking about this, that might be a larger problem. It means that an action run from a PR might re-use the cache from the master branch (as long as neither a `CMakeLists.txt` nor a `package.xml` is touched). I think `github.ref` should go in there, too.
But using `github.ref` would mean that PRs don't use the cache from other branches. And if we have a restore-key without `github.ref`, then we would have the same issue as in this PR.
> No, I think the `github.job` is `reusable_industrial_ci_with_cache`: it takes the job from the reusable workflow, not that from the calling workflow.
Not sure, but either way, if the job id isn't available, `github.workflow` could be an alternative.

Maybe we could use a composite action instead of a reusable workflow? Then the job id would be unique, as we would like it.
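For context, a composite action would mean shipping the steps as an `action.yml` that callers run inside their own job. The following is only a hypothetical sketch, not an existing action in this repo; the name, path, and inputs are assumptions.

```yaml
# Hypothetical .github/actions/industrial-ci-with-cache/action.yml
name: industrial-ci-with-cache
inputs:
  ros_distro:
    required: true
runs:
  using: composite
  steps:
    # These steps execute inside the *calling* job, so github.job would
    # resolve to the caller's job id here.
    - shell: bash
      run: echo "industrial_ci build for ${{ inputs.ros_distro }} would run here"
```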
> But using `github.ref` would mean that PRs don't use the cache from other branches.

True, but do we want that? I guess we want that for PR runs, right? We could add `github.base_ref` for those.
> And if we have a restore-key without `github.ref`, then we would have the same issue as in this PR.

The restore-key obviously has to go hand in hand.
> Maybe we could use a composite action instead of a reusable workflow? Then the job id would be unique, as we would like it.

That does indeed not sound too bad...
> But using `github.ref` would mean that PRs don't use the cache from other branches.
>
> True, but do we want that? I guess we want that for PR runs, right? We could add `github.base_ref` for those.
PRs targeting master should use the cache from the master branch, but not the other way round. GitLab does not use caches from PRs; how does that work with GitHub? `base_ref` is what we want for PRs, but how would that work with scheduled builds or push events?

> The base_ref or target branch of the pull request in a workflow run. This property is only available when the event that triggers a workflow run is either pull_request or pull_request_target.
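One possible way around that limitation (just a sketch, not something decided in this PR) would be a fallback expression in the prefix, since `github.base_ref` is empty outside pull_request events:

```yaml
env:
  # Hypothetical: use the PR's target branch when available,
  # otherwise fall back to the ref of the push or scheduled run.
  REF_PREFIX: ${{ github.base_ref || github.ref }}
```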
> Maybe we could use a composite action instead of a reusable workflow? Then the job id would be unique, as we would like it.
>
> That does indeed not sound too bad...

If the jobs are unique for different branches, then would this solve the case from above?
> If the jobs are unique for different branches, then would this solve the case from above?

Not necessarily. Caches from PRs and scheduled builds would still get mixed up. That might be fine in most cases, but it might as well lead to strange errors like the one mentioned above.

I guess the cleanest would be:

* Use a proper job identifier (e.g. `github.workflow`)
* Use three possible restore keys (see the sketch below):
  * `...${{ github.base_ref }}-${{ github.ref }}`
  * `...${{ github.base_ref }}`
  * `...${{ github.ref }}`

This way, on a PR it would search for a cache e.g. `...master-my_feature` and, if it doesn't find that during the first build, use `...master`, while for a scheduled build it would use `...master`, since `...master-` and `...` don't exist. That could work, right?

Changing things to actual actions rather than a reusable workflow could be a separate discussion, I think.
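Translated into an `actions/cache` step, that proposal might look roughly like this. It is a sketch only: the `target_ws-${{ env.CACHE_PREFIX }}` prefix, the cache path, and the action version are placeholders standing in for the existing setup, not the workflow's actual content.

```yaml
# Sketch of the proposed key/restore-keys scheme.
- uses: actions/cache@v3
  with:
    path: ${{ github.workspace }}/.work/target_ws
    key: target_ws-${{ env.CACHE_PREFIX }}-${{ github.base_ref }}-${{ github.ref }}-${{ hashFiles('**/CMakeLists.txt', '**/package.xml') }}
    # Tried in order: base_ref+ref, then base_ref only, then ref only.
    restore-keys: |
      target_ws-${{ env.CACHE_PREFIX }}-${{ github.base_ref }}-${{ github.ref }}-
      target_ws-${{ env.CACHE_PREFIX }}-${{ github.base_ref }}-
      target_ws-${{ env.CACHE_PREFIX }}-${{ github.ref }}-
```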
> If the jobs are unique for different branches, then would this solve the case from above?
>
> Not necessarily. Caches from PRs and scheduled builds would still get mixed up. That might be fine in most cases, but it might as well lead to strange errors like the one mentioned above.
>
> I guess the cleanest would be:
>
> * Use a proper job identifier (e.g. `github.workflow`)
> * Use three possible restore keys:
>   * `...${{ github.base_ref }}-${{ github.ref }}`
>   * `...${{ github.base_ref }}`
>   * `...${{ github.ref }}`
>
> This way, on a PR it would search for a cache e.g. `...master-my_feature` and, if it doesn't find that during the first build, use `...master`, while for a scheduled build it would use `...master`, since `...master-` and `...` don't exist. That could work, right?
Not sure, what will `github.workflow` be from within the reusable workflow?

> The name of the workflow. If the workflow file doesn't specify a name, the value of this property is the full path of the workflow file in the repository.

(from https://docs.github.com/en/actions/learn-github-actions/contexts)
> Changing things to actual actions rather than a reusable workflow could be a separate discussion, I think.

Maybe it just solves the problem above, because `github.job` then names the calling job.
> Not sure, what will `github.workflow` be from within the reusable workflow?

Indeed, that's an interesting question, but it should at least be available, in contrast to the job's name.
> Maybe it just solves the problem above, because `github.job` then names the calling job.

It will still not be available outside of the steps, right? Or will it, because it is part of a step of the calling job?

I guess it is time to prototype something to verify; I probably won't find time for that before the weekend.
I added gz_ros2_control now for the humble build, and it succeeds as well. I'd merge this PR if that's OK for you @fmauch @saikishor, or do you have other naming suggestions? I'll open an issue for the cache topic.
Nice, great job guys. CI is passing now!
Accompanying
Any better ideas for the names of the workflows?
I don't think we want to fight the releases of ignition/gz on humble. Therefore, I excluded ign/gz_ros2_control