ros-industrial / industrial_ci

Easy continuous integration repository for ROS repositories
Apache License 2.0

Bitbucket "authorization denied" --> API call limitation? #882

Closed AndyZe closed 2 months ago

AndyZe commented 2 months ago

On Sep 17 I started experiencing this issue on my bitbucket pipelines:

time="2024-09-18T19:00:50.149245108Z" level=error msg="AuthZRequest for HEAD /_ping returned error: authorization denied by plugin pipelines: "

I guess it's probably not a coincidence that bitbucket deprecated legacy pipelines on Sep 17.

There are some solutions described here

Basically it sounds like this is deprecated: DOCKER_BUILDKIT=0

Maybe the new way to do it is: export PATH=/usr/bin:$PATH

Here's my bitbucket-pipelines.yml with the fix I attempted. It didn't work.

image: docker:git

pipelines:
  default:
     - step:
         services:
           - docker
         script:
           # Attempted fix
           # See https://community.atlassian.com/t5/Bitbucket-questions/Docker-build-failing-for-buildkit-with-error-authorization/qaq-p/2377667#M94515
           - export PATH=/usr/bin:$PATH
           - apk add --update bash coreutils tar
           - git clone --quiet --depth 1 https://github.com/ros-industrial/industrial_ci .industrial_ci -b master
           - .industrial_ci/bitbucket.sh ROS_DISTRO=humble

definitions:
  services:
    docker:
      memory: 2048

mathias-luedtke commented 2 months ago

Thanks for reporting!

time="2024-09-18T19:00:50.149245108Z" level=error msg="AuthZRequest for HEAD /_ping returned error: authorization denied by plugin pipelines: "

Could you give me a little bit more context, please? This looks like the "docker pull" is failing or getting blocked. You could also set TRACE=true, which prints all the internal industrial_ci commands and should help to narrow the scope.

Basically it sounds like this is deprecated: DOCKER_BUILDKIT=0

This has been deprecated (by docker) for a while now. And it does not matter, because industrial_ci does not build any docker images. It just runs them.

Maybe the new way to do it is: export PATH=/usr/bin:$PATH

I would not mess with $PATH unless there is a good reason to do that. To debug this further, could you add an env call to your script and check what the default PATH is? And does bitbucket set any DOCKER_* environment variables?

mathias-luedtke commented 2 months ago

According to https://support.atlassian.com/bitbucket-cloud/docs/run-docker-commands-in-bitbucket-pipelines/ running docker containers should be a supported use case. image: docker:git is not listed in any of the examples, though.

And it looks like bitbucket restricts the locations that can be mounted via volumes. Please try setting DOCKER_CREDENTIALS="". This should disable the extra mounts, but it also disables the option to clone other repos.

If Docker does not work anymore, there is also the ISOLATION=shell option: https://github.com/ros-industrial/industrial_ci/blob/master/.gitlab-ci.yml#L39
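For reference, a rough and untested sketch of what the ISOLATION=shell variant could look like on Bitbucket. With shell isolation industrial_ci runs directly in the step's own container, so the docker service is dropped and the step image has to be an Ubuntu-based one. The image name ros:humble and the apt-get line are assumptions, not something confirmed in this thread.

```yaml
# Sketch only: ISOLATION=shell, no docker service.
# Assumption: an Ubuntu-based image such as ros:humble is suitable here.
image: ros:humble

pipelines:
  default:
    - step:
        script:
          # Assumption: make sure git is available for the clone below
          - apt-get update && apt-get install -y git
          - git clone --quiet --depth 1 https://github.com/ros-industrial/industrial_ci .industrial_ci -b master
          # Variables are passed as arguments, as in the original config
          - .industrial_ci/bitbucket.sh ROS_DISTRO=humble ISOLATION=shell
```

The linked .gitlab-ci.yml is the reference for how this option is actually exercised in industrial_ci's own CI.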

AndyZe commented 2 months ago

I added this stuff to bitbucket-pipelines.yml. Didn't seem to change anything, though:

image: docker:git
pipelines:
  default:
     - step:
         size: 2x # More memory
         services:
           - docker
         script:
           - export TRACE=true
           - env
           - export DOCKER_CREDENTIALS=""
           - git clone git@bitbucket.org:apptronik/elevate_robotics.git
           - apk add --update bash coreutils tar
           - git clone --quiet --depth 1 https://github.com/ros-industrial/industrial_ci .industrial_ci -b master
           - .industrial_ci/bitbucket.sh ROS_DISTRO=humble
definitions:
  services:
    docker:
      memory: 4096

-->

Build tab

...
Setting up ros-humble-ackermann-steering-controller (2.37.3-1jammy.20240911.165226) ...
Setting up ros-humble-tricycle-steering-controller (2.37.3-1jammy.20240911.165229) ...
Setting up ros-humble-bicycle-steering-controller (2.37.3-1jammy.20240911.165228) ...
Setting up ros-humble-ros2-controllers (2.37.3-1jammy.20240911.165933) ...
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
'install_target_dependencies' returned with code '0' after 1 min 49 sec
TRACE:util.sh:208 ici_hook after_install_target_dependencies

docker tab

...
time="2024-09-19T15:17:33.492890622Z" level=info msg="loading plugin \"io.containerd.internal.v1.shutdown\"..." runtime=io.containerd.runc.v2 type=io.containerd.internal.v1
time="2024-09-19T15:17:33.492911474Z" level=info msg="loading plugin \"io.containerd.ttrpc.v1.task\"..." runtime=io.containerd.runc.v2 type=io.containerd.ttrpc.v1
time="2024-09-19T15:17:33.493167953Z" level=info msg="loading plugin \"io.containerd.ttrpc.v1.pause\"..." runtime=io.containerd.runc.v2 type=io.containerd.ttrpc.v1
time="2024-09-19T15:17:33.650782870Z" level=info msg="ignoring event" container=5d9f648cf4e8cbd631df2e89534d58550cc785d2a7316d5f65c72b7b627f948e module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
time="2024-09-19T15:17:33.650808975Z" level=info msg="shim disconnected" id=5d9f648cf4e8cbd631df2e89534d58550cc785d2a7316d5f65c72b7b627f948e namespace=moby
time="2024-09-19T15:17:33.650862345Z" level=warning msg="cleaning up after shim disconnected" id=5d9f648cf4e8cbd631df2e89534d58550cc785d2a7316d5f65c72b7b627f948e namespace=moby
time="2024-09-19T15:17:33.650873086Z" level=info msg="cleaning up dead shim" namespace=moby
time="2024-09-19T15:17:33Z" level=info msg="Pipelines plugin request authorization." allowed=false method=HEAD plugin=pipelines uri=/_ping
time="2024-09-19T15:17:33.857952559Z" level=error msg="AuthZRequest for HEAD /_ping returned error: authorization denied by plugin pipelines: "
time="2024-09-19T15:17:33Z" level=info msg="Pipelines plugin request authorization." allowed=true method=GET plugin=pipelines uri=/_ping
time="2024-09-19T15:17:33Z" level=info msg="Pipelines plugin request authorization." allowed=true method=GET plugin=pipelines uri=/v1.44/containers/9319249432c6f9b7f2ed4e10cdead000a947c5b49a4c9347aa5266f19836c03b/json
time="2024-09-19T15:17:33Z" level=info msg="Pipelines plugin request authorization." allowed=true method=POST plugin=pipelines uri="/v1.44/containers/9319249432c6f9b7f2ed4e10cdead000a947c5b49a4c9347aa5266f19836c03b/attach?stderr=1&stdout=1&stream=1"
time="2024-09-19T15:17:33Z" level=info msg="Pipelines plugin request authorization." allowed=true method=POST plugin=pipelines uri="/v1.44/containers/9319249432c6f9b7f2ed4e10cdead000a947c5b49a4c9347aa5266f19836c03b/wait?condition=removed"
time="2024-09-19T15:17:33Z" level=info msg="Pipelines plugin request authorization." allowed=true method=POST plugin=pipelines uri=/v1.44/containers/9319249432c6f9b7f2ed4e10cdead000a947c5b49a4c9347aa5266f19836c03b/start
time="2024-09-19T15:17:33.905484049Z" level=info msg="loading plugin \"io.containerd.event.v1.publisher\"..." runtime=io.containerd.runc.v2 type=io.containerd.event.v1
time="2024-09-19T15:17:33.905795247Z" level=info msg="loading plugin \"io.containerd.internal.v1.shutdown\"..." runtime=io.containerd.runc.v2 type=io.containerd.internal.v1
time="2024-09-19T15:17:33.908529910Z" level=info msg="loading plugin \"io.containerd.ttrpc.v1.task\"..." runtime=io.containerd.runc.v2 type=io.containerd.ttrpc.v1
time="2024-09-19T15:17:33.908904756Z" level=info msg="loading plugin \"io.containerd.ttrpc.v1.pause\"..." runtime=io.containerd.runc.v2 type=io.containerd.ttrpc.v1

AndyZe commented 2 months ago

Here's the env you asked for:

+ env
BITBUCKET_SSH_KEY_FILE=/opt/atlassian/pipelines/agent/ssh/id_rsa
BITBUCKET_REPO_UUID={413cc327-3443-45a1-bf0d-5a8f446b71c9}
KUBERNETES_SERVICE_PORT=443
KUBERNETES_PORT=tcp://10.34.224.1:443
CI=true
HOSTNAME=a70749b6-7bc4-4417-bf35-c6680ce44179-fk95q
BITBUCKET_STEP_TRIGGERER_UUID={d5ee4493-ec50-4736-919e-2b0c386ec051}
BITBUCKET_REPO_SLUG=aladdin_monitors
SHLVL=3
HOME=/root
BITBUCKET_GIT_SSH_ORIGIN=git@bitbucket.org:apptronik/aladdin_monitors.git
BITBUCKET_REPO_OWNER=apptronik
BITBUCKET_STEP_UUID={a70749b6-7bc4-4417-bf35-c6680ce44179}
BITBUCKET_BUILD_NUMBER=13
BITBUCKET_WORKSPACE=apptronik
BITBUCKET_CLONE_DIR=/opt/atlassian/pipelines/agent/build
BITBUCKET_GIT_HTTP_ORIGIN=http://bitbucket.org/apptronik/aladdin_monitors
BITBUCKET_REPO_IS_PRIVATE=true
PIPELINES_JWT_TOKEN=$PIPELINES_JWT_TOKEN
NPM_CONFIG_USER=65534
BITBUCKET_COMMIT=ebdf04f32a398291df1f1899931aaf35236962e8
DIND_COMMIT=65cfcc28ab37cb75e1560e4b4738719c07c6618e
BITBUCKET_PIPELINE_UUID={1316958a-3471-4095-b122-7802c08085d5}
BITBUCKET_DOCKER_HOST_INTERNAL=10.38.175.11
KUBERNETES_PORT_443_TCP_ADDR=10.34.224.1
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
KUBERNETES_PORT_443_TCP_PORT=443
BITBUCKET_REPO_OWNER_UUID={d5a60e11-1f8d-4fc5-adbf-f55f315a1124}
KUBERNETES_PORT_443_TCP_PROTO=tcp
BITBUCKET_STEP_RUN_NUMBER=1
BITBUCKET_BRANCH=main
DOCKER_VERSION=25.0.5
DOCKER_TLS_CERTDIR=/certs
BITBUCKET_REPO_FULL_NAME=apptronik/aladdin_monitors
DOCKER_HOST=tcp://localhost:2375
BITBUCKET_PROJECT_UUID={3462a40f-cc88-4642-86e5-cb294a61ebf7}
BITBUCKET_PROJECT_KEY=SCOR
KUBERNETES_SERVICE_PORT_HTTPS=443
KUBERNETES_PORT_443_TCP=tcp://10.34.224.1:443
DOCKER_BUILDX_VERSION=0.13.1
DOCKER_COMPOSE_VERSION=2.25.0
KUBERNETES_SERVICE_HOST=10.34.224.1
PWD=/opt/atlassian/pipelines/agent/build
TRACE=true

AndyZe commented 2 months ago

This may all be a red herring. I have other bitbucket pipelines that are still succeeding, same bitbucket-pipelines.yml. Very confused...

mathias-luedtke commented 2 months ago

According to your logs, industrial_ci is running and even managed to start the docker container and install things. What happens if you remove "image: docker:git"?

AndyZe commented 2 months ago

Same symptoms unfortunately. The build tab is stuck at...

install_target_dependencies
...
Setting up ros-humble-moveit-resources-panda-moveit-config (2.0.7-1jammy.20240830.212942) ...
Setting up adwaita-icon-theme (41.0-1ubuntu1) ...
Setting up humanity-icon-theme (0.6.16) ...
Setting up ubuntu-mono (20.10-0ubuntu2) ...
Setting up libgtk-3-0:amd64 (3.24.33-1ubuntu2.2) ...
Setting up libgtk-3-bin (3.24.33-1ubuntu2.2) ...
Setting up qt5-gtk-platformtheme:amd64 (5.15.3+dfsg-2ubuntu0.2) ...

The docker tab is stuck at... (don't ask me why there's still a docker tab)

time="2024-09-21T16:47:04Z" level=info msg="Pipelines plugin request authorization." allowed=true method=POST plugin=pipelines uri=/v1.41/containers/2ca37a1066c1860ebd3ca85ec48b165e2943bfb1f3db83ee87ae710ec7a7dd84/start
time="2024-09-21T16:47:04.711954506Z" level=info msg="loading plugin \"io.containerd.event.v1.publisher\"..." runtime=io.containerd.runc.v2 type=io.containerd.event.v1
time="2024-09-21T16:47:04.712371367Z" level=info msg="loading plugin \"io.containerd.internal.v1.shutdown\"..." runtime=io.containerd.runc.v2 type=io.containerd.internal.v1
time="2024-09-21T16:47:04.712393985Z" level=info msg="loading plugin \"io.containerd.ttrpc.v1.task\"..." runtime=io.containerd.runc.v2 type=io.containerd.ttrpc.v1
time="2024-09-21T16:47:04.712543646Z" level=info msg="loading plugin \"io.containerd.ttrpc.v1.pause\"..." runtime=io.containerd.runc.v2 type=io.containerd.ttrpc.v1

stuck

AndyZe commented 2 months ago

Maybe relevant:

I guess there is some sort of rate limiting mechanism in Bitbucket pipelines causing only the first few pulls to succeed from ecr. I solved the issue by starting SAM with --skip-pull-image option.

https://stackoverflow.com/q/78424795 https://support.atlassian.com/bitbucket-cloud/docs/api-request-limits/

Is the docker image that industrial_ci uses based on ros-desktop-full? Maybe that would help?

AndyZe commented 2 months ago

OK, if I remove some deps, I can at least get to the point where the tests are run. They don't pass because controller_manager package is missing but that verifies that it's a bitbucket API call limitation.

mathias-luedtke commented 2 months ago

OK, if I remove some deps, I can at least get to the point where the tests are run. They don't pass because controller_manager package is missing but that verifies that it's a bitbucket API call limitation.

To me it looks like the 2h time limit for the pipeline (https://community.atlassian.com/t5/Bitbucket-questions/Increasing-pipeline-runner-time-limits/qaq-p/2013093). industrial_ci should only pull the image once and then start it. How long does it take locally? (https://github.com/ros-industrial/industrial_ci/blob/master/doc/index.rst#run-industrial-ci-on-local-host)
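If the time limit is what ends the job, a shorter step timeout would at least make a hung step fail sooner instead of spinning for the full two hours. A sketch using Bitbucket's step-level max-time option (in minutes); the value 30 is an arbitrary assumption and not a number from this thread, and the rest of the step is copied from the configs posted above.

```yaml
image: docker:git

pipelines:
  default:
    - step:
        # Assumption: 30 minutes is an illustrative value; pick one that
        # comfortably covers a normal run of your own build.
        max-time: 30
        services:
          - docker
        script:
          - apk add --update bash coreutils tar
          - git clone --quiet --depth 1 https://github.com/ros-industrial/industrial_ci .industrial_ci -b master
          - .industrial_ci/bitbucket.sh ROS_DISTRO=humble
```

This does not fix the underlying stall; it only shortens the feedback loop while debugging.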

Is the docker image that industrial_ci uses based on ros-desktop-full? Maybe that would help?

No, industrial_ci uses ubuntu:$codename by default. You can override it with DOCKER_IMAGE.

AndyZe commented 2 months ago

It only takes about 30s to build and another 30s to run tests locally. It's not a huge pkg at all but it does have a lot of dependencies (moveit2 & ros2_control).

OK, this got me to the point where CI quickly runs tests. I'm happy with that! Closing the issue, thanks for the hints.

         script:
           - export DOCKER_IMAGE="ros:humble"

I think what's going on is it hits the bitbucket API limit then just spins until it times out.
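For anyone landing here later, this is roughly what the complete file would look like with that change applied. It is a sketch assembled from the snippets in this thread: it assumes the original bitbucket-pipelines.yml from the first comment, minus the PATH workaround that did not help, plus the DOCKER_IMAGE line. The exact file that ended up working was not posted.

```yaml
image: docker:git

pipelines:
  default:
    - step:
        services:
          - docker
        script:
          # Start from a ROS image instead of the default ubuntu:$codename,
          # so far fewer packages have to be installed during the run
          - export DOCKER_IMAGE="ros:humble"
          - apk add --update bash coreutils tar
          - git clone --quiet --depth 1 https://github.com/ros-industrial/industrial_ci .industrial_ci -b master
          - .industrial_ci/bitbucket.sh ROS_DISTRO=humble

definitions:
  services:
    docker:
      memory: 2048
```

The memory value is the one from the first comment; the later attempt used size: 2x with memory: 4096, and either may be needed depending on the package.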