gazebo-tooling / release-tools


Uncontrolled triggering of builds in -ci-pr_any- when being called manually #242

Closed j-rivero closed 11 months ago

j-rivero commented 4 years ago

There is a problem with jobs that use the GitHub pull request integration: when called manually, they can potentially trigger hundreds of builds, depending on the branch passed as a parameter. Example: https://build.osrfoundation.org/view/main/view/CI%20ABI%20jobs/job/gazebo-abichecker-any_to_any-ubuntu_auto-amd64/230

Parameters

DEST_BRANCH: gazebo11
SRC_BRANCH: respect_fps_gz11new

Log

Started by user Mabel Zhang
Running as SYSTEM
Building remotely on drogon-aws.nv.xenial (gpu-reliable gpu-nvidia-docker2 large-memory docker) in workspace /var/lib/jenkins/workspace/gazebo-abichecker-any_to_any-ubuntu_auto-amd64
No credentials specified
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/osrf/gazebo.git # timeout=10
Fetching upstream changes from https://github.com/osrf/gazebo.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/osrf/gazebo.git +refs/pull/*:refs/remotes/origin/pr/* # timeout=10
Seen branch in repository origin/2.1_abi_compat
Seen branch in repository origin/2014_copyright
Seen branch in repository origin/90_windows_patch_test
Seen branch in repository origin/DoNotLaunchMarkerManagerInServer
Seen branch in repository origin/Fix_cmake
Seen branch in repository origin/JointCreatorBug
Seen branch in repository origin/SetSceneNodeEmissive
Seen branch in repository origin/accuracy_island_threads
Seen branch in repository origin/accuracy_log
Seen branch in repository origin/actor_animation
Seen branch in repository origin/add_boost_format
Seen branch in repository origin/add_gazebo_libraries
Seen branch in repository origin/add_sensor_altimeter
Seen branch in repository origin/add_sensor_orientation
Seen branch in repository origin/add_set_position_pid
Seen branch in repository origin/aero_apm_irlock
Seen branch in repository origin/aero_apm_irlock_john
Seen branch in repository origin/ahcorde/gazebo9_reflections
Seen branch in repository origin/alert_system
Seen branch in repository origin/anim
Seen branch in repository origin/animation
Seen branch in repository origin/application_dir
Seen branch in repository origin/ardupilot
Seen branch in repository origin/ardupilot_merge_gazebo8_khancyr
Seen branch in repository origin/ardupilot_nate
... hundreds of branches here

Seen 812 remote branches
 > git show-ref --tags -d # timeout=10
Multiple candidate revisions
Scheduling another build to catch up with gazebo-abichecker-any_to_any-ubuntu_auto-amd64
Checking out Revision ab2bbfdd5c8c81cf6bea0db6607e6b390fd38988 (origin/qt_gui_hack)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f ab2bbfdd5c8c81cf6bea0db6607e6b390fd38988 # timeout=10
Commit message: "Update my gui"
First time build. Skipping changelog.

These lines seem to be the origin of the disaster:

Multiple candidate revisions
Scheduling another build to catch up with gazebo-abichecker-any_to_any-ubuntu_auto-amd64

j-rivero commented 4 years ago

As a first measure, I think we would need https://github.com/ignition-tooling/release-tools/pull/237 in order to get refspecs from branches (not only PRs). Edited: I don't think this is the root cause of the problem.

j-rivero commented 4 years ago

A hypothesis: that job takes several input parameters: SRC_BRANCH, but also sha1, which is hidden from the user when the job is called manually.

The workaround at the moment is to always use the GitHub integration and avoid manual calls.

j-rivero commented 3 years ago

I'm seeing the same problem today after using gazebo-ci-manual_any.

scpeters commented 3 years ago

I just cancelled a gazebo-ci-pr_any-homebrew-amd64 job, since those jobs have been running excessively.

scpeters commented 3 years ago

https://build.osrfoundation.org/job/ignition_gazebo-ci-pr_any-ubuntu_auto-amd64/ is currently running excessively and I believe it's self-triggering

jacobperron commented 3 years ago

The problem is that when we call it manually, before the mapping happens in the shell, the Git configuration tries to read the sha1 variable, which is empty at that moment, and that could trigger the storm of new builds.

My knowledge of the Groovy scripts is limited, but maybe we can guard against triggering builds for empty sha1 values? E.g.:

-          branch('${sha1}')
+          if (sha1?.trim()) {
+            branch('${sha1}')
+          }

scpeters commented 3 years ago

I think a solution for this would be to create separate jobs that can be started manually and tell people not to manually trigger ci-pr_any jobs anymore. I've opened #354 to track this proposed approach.

scpeters commented 3 years ago

the following job is running away today: https://build.osrfoundation.org/job/ign_gui-pr-win/

j-rivero commented 3 years ago

> the following job is running away today: https://build.osrfoundation.org/job/ign_gui-pr-win/

Looking into the root cause: there was an empty sha1 parameter, so the git plugin triggers new builds to try to cover the multiple candidate revisions that it finds:

No credentials specified
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/ignitionrobotics/ign-gui.git # timeout=10
Fetching upstream changes from https://github.com/ignitionrobotics/ign-gui.git
 > git --version # timeout=10
 > git fetch --tags --progress -- https://github.com/ignitionrobotics/ign-gui.git +refs/pull/*:refs/remotes/origin/pr/* +refs/heads/*:refs/remotes/origin/* # timeout=10
Seen branch in repository origin/HiDPI_scaling
Seen branch in repository origin/add_config_extension
Seen branch in repository origin/add_depth_visualization
Seen branch in repository origin/adlarkin/cpplint_fixes
Seen branch in repository origin/ahcorde-patch-1
...
Seen branch in repository origin/update_dome_versions
Seen 246 remote branches
 > git show-ref --tags -d # timeout=10
Multiple candidate revisions
Scheduling another build to catch up with ign_gui-pr-win

The build was not triggered by a user but by:

[Started by an SCM change](https://build.osrfoundation.org/job/ign_gui-pr-win/578/pollingLog/)
Revision: 48d596c37c1421446fc39b5298a9a73323fc1223 (origin/mingfei-sun/fix-name-inconsistency-in-hello_plugin-e-1559537202850)

That branch has not been touched in the last two years, and it seems like there is a polling operation going on, involving the ghprb plugin, which is not configured in any way in the DSL as far as I can tell.

I've noticed a weird thing: the crontab configuration in the ghprb plugin is set to H/5 0 0 0 0 0 when I open the job's configuration GUI manually, while the DSL sets the spec to empty. This weird/misconfigured behavior could be behind the polling operations somehow. I've discovered that, together with spec, there is another XML parameter; I will try to fix it in a PR.
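
The mechanism described at the top of this comment (an empty sha1 leading to "Multiple candidate revisions" and catch-up builds) can be illustrated with a toy model. This is only a sketch of the behavior the log suggests, not the Git plugin's actual implementation: the function name `count_candidate_builds` is hypothetical, an empty specifier is assumed to match every remote branch, and each branch is assumed to point at a distinct, not yet built revision.

```shell
#!/bin/sh
# Toy model (hypothetical helper, not part of release-tools or the Git plugin):
# count how many builds would be needed to cover every candidate revision
# for a given branch specifier.
count_candidate_builds() {
    spec="$1"
    shift
    builds=0
    for branch in "$@"; do
        # Assumption: an empty specifier behaves like "match any branch",
        # so every remote branch becomes a candidate.
        if [ -z "$spec" ] || [ "$branch" = "$spec" ]; then
            builds=$((builds + 1))
        fi
    done
    printf '%s\n' "$builds"
}

# Example calls (branch names taken from the log above):
#   count_candidate_builds "" origin/HiDPI_scaling origin/add_config_extension
#   count_candidate_builds "origin/HiDPI_scaling" origin/HiDPI_scaling origin/add_config_extension
```

Under these assumptions, an empty specifier against the 246 remote branches seen in the log would leave up to 246 candidates, which is consistent with the job repeatedly scheduling another build to catch up.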

j-rivero commented 3 years ago

> My knowledge of the groovy scripts is limited, but maybe we can guard against triggering builds for empty sha1 values? e.g.

Sorry, I forgot to comment: the Groovy scripts are used to create the Jenkins jobs, so evaluating sha1 in the Groovy DSL would happen during job generation, not during job execution. But the idea is good; I will implement something similar as a workaround.
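
For illustration, a check evaluated at job execution time could look roughly like the sketch below. It is not the workaround that was actually implemented: the helper name `resolve_build_ref` is hypothetical, and it assumes a shell build step that receives the sha1 and SRC_BRANCH job parameters as arguments. As noted earlier in the thread, the Git configuration reads sha1 before the shell mapping runs, so a check like this can only make the build fail early; it does not by itself change what the Git plugin does.

```shell
#!/bin/sh
# Hypothetical helper: choose the ref to build from the job parameters.
# Prints the chosen ref on stdout and returns non-zero if none is usable.
resolve_build_ref() {
    # Strip all whitespace so that a blank parameter counts as empty
    sha1_value=$(printf '%s' "$1" | tr -d '[:space:]')
    src_branch=$(printf '%s' "$2" | tr -d '[:space:]')

    if [ -n "$sha1_value" ]; then
        # GitHub integration path: sha1 was provided by the trigger
        printf '%s\n' "$sha1_value"
    elif [ -n "$src_branch" ]; then
        # Manual call: sha1 is empty, fall back to the branch parameter
        printf '%s\n' "$src_branch"
    else
        echo "ERROR: sha1 and SRC_BRANCH are both empty; refusing to build" >&2
        return 1
    fi
}

# Possible use inside a build step (assumes Jenkins exports both parameters):
#   ref=$(resolve_build_ref "${sha1:-}" "${SRC_BRANCH:-}") || exit 1
```

With the parameters from the first example in this issue (empty sha1, SRC_BRANCH set to respect_fps_gz11new), the helper would print respect_fps_gz11new instead of leaving the ref empty.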

j-rivero commented 11 months ago

I have not seen the problem in the last 3 years. Closing it; please reopen if it appears again.