ros-industrial / industrial_ci

Easy continuous integration repository for ROS repositories
Apache License 2.0

Minimal configuration GitLab CI (and gitlab-runner) #561

Open dave992 opened 4 years ago

dave992 commented 4 years ago

I had some problems getting Industrial CI to run on our GitLab instance using a shared runner installed from scratch. Below I describe the configuration, the warnings and errors I received, and finally the steps I took to resolve these issues.

I am writing this issue mostly since the README.rst, the documentation, and the .gitlab-ci.yml example seem to indicate that the minimal configuration shown there will work in most, if not all, cases. I do not know if our configuration is special or very specific, but since I needed to make changes to the .gitlab-ci.yml, I did not experience this as a minimal working configuration.

Would it be an idea to update the minimal configuration or instructions in some way to prevent future people from having the same issues?

EDIT: I just found these instructions on GitLab when using Docker in Docker (dind) with the GitLab CI. They describe how to correctly set up the gitlab-runner and the changes needed in the .gitlab-ci.yml.

The used configuration:

I installed the gitlab-runner following the instructions in the GitLab documentation (here). This included installing Docker following the instructions (here) and the optional post-installation steps (here).

I registered the runner as a shared runner for a GitLab group following the instructions (here).

I used the following .gitlab-ci.yml:

image: docker:git

variables:
  TMPDIR: "${CI_PROJECT_DIR}.tmp"
  CACHE_DIR: ${CI_PROJECT_DIR}/ccache

cache:
  key: "${CI_JOB_NAME}" # https://docs.gitlab.com/ee/ci/caching/#sharing-caches-across-different-branches
  paths: 
  - ccache

services:
  - docker:19.03.5-dind

before_script:
  - apk add --update bash coreutils tar
  - git clone --quiet --depth 1 https://github.com/ros-industrial/industrial_ci .industrial_ci -b master

melodic:
  script: .industrial_ci/gitlab.sh ROS_DISTRO=melodic

The CI job failed with the following output:

- During the function 'prepare_docker_image' I got the following error:

time="2020-09-23T12:30:13Z" level=error msg="failed to dial gRPC: cannot connect to the Docker daemon. Is 'docker daemon' running on this host?: dial tcp: lookup docker on 131.180.0.25:53: no such host"
error during connect: Post http://docker:2375/v1.40/build?buildargs=%7B%7D&cachefrom=%5B%5D&cgroupparent=&cpuperiod=0&cpuquota=0&cpusetcpus=&cpusetmems=&cpushares=0&dockerfile=Dockerfile&labels=%7B%7D&memory=0&memswap=0&networkmode=default&pull=1&rm=1&session=kp138scp1e1xdmhcis6u8rxgq&shmsize=0&t=industrial-ci%2Fmelodic%2Fubuntu%3Abionic&target=&ulimits=null&version=1: context canceled
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
Function 'prepare_docker_image' returned with code '1' after 0 min 0 sec
ERROR: Job failed: exit code 1



**My actions to resolve the issues:**
I suspected the error was caused by the first warning and tried to resolve this issue first. I was able to stop the warning by changing the `gitlab-runner` configuration (`/etc/gitlab-runner/config.toml`):
`privileged = false` > `privileged = true`

I was finally able to solve the error by following the instructions [(here)](https://about.gitlab.com/releases/2019/07/31/docker-in-docker-with-docker-19-dot-03/). This included the following changes:
- Adding `/certs/client` to the `volumes` in the `gitlab-runner` configuration:
`volumes = ["/cache"]` > `volumes = ["/certs/client", "/cache"]`
- Adding `DOCKER_TLS_CERTDIR: "/certs"` under `variables` in the `.gitlab-ci.yml`

After this, the CI job ran as expected and was able to finish the `prepare_docker_image` step. 
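
For reference, a sketch of how the pipeline file might look after these changes. It is the configuration posted above with only the `DOCKER_TLS_CERTDIR` variable added, and it assumes the runner-side changes (`privileged = true` and the `/certs/client` volume in `/etc/gitlab-runner/config.toml`) are already in place. Treat it as an illustration of the described fix, not as a tested reference configuration.

```yaml
# Sketch only: the original configuration plus the one added variable.
# Assumes [runners.docker] in /etc/gitlab-runner/config.toml contains:
#   privileged = true
#   volumes = ["/certs/client", "/cache"]
image: docker:git

variables:
  TMPDIR: "${CI_PROJECT_DIR}.tmp"
  CACHE_DIR: ${CI_PROJECT_DIR}/ccache
  # Added: directory where the dind service creates its TLS certificates;
  # the client certificates end up in /certs/client, which the runner
  # shares between the service and the job container via the volume above.
  DOCKER_TLS_CERTDIR: "/certs"

cache:
  key: "${CI_JOB_NAME}"
  paths:
    - ccache

services:
  - docker:19.03.5-dind

before_script:
  - apk add --update bash coreutils tar
  - git clone --quiet --depth 1 https://github.com/ros-industrial/industrial_ci .industrial_ci -b master

melodic:
  script: .industrial_ci/gitlab.sh ROS_DISTRO=melodic
```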

mathias-luedtke commented 3 years ago

> Would it be an idea to update the minimal configuration or instructions in some way to prevent future people from having the same issues?

Definitely!

> since the README.rst, the documentation, and the .gitlab-ci.yml example seem to indicate that the minimal configuration shown there will work in most, if not all, cases.

These examples are only meant to work with the GitLab.com shared runners.

> I just found these instructions on GitLab when using Docker in Docker (dind) with the GitLab CI.

On a local runner I would not use Docker-in-Docker, but bind-mount the Docker socket. With #558 docker builds will not be needed anymore :)

gurbain commented 3 years ago

Wonderful, this post saved me hours of trying to understand the problem!

RaphvK commented 3 years ago

For security reasons, I would also prefer not to let the GitLab runner start privileged Docker containers, and to use Docker socket binding instead of Docker-in-Docker.

I am using this configuration in /etc/gitlab-runner/config.toml for the GitLab runner:

[[runners]]
  name = "dell"
  url = "https://gitlab-server/"
  token = "*****"
  executor = "docker"
  [runners.custom_build_dir]
  [runners.cache]
    [runners.cache.s3]
    [runners.cache.gcs]
    [runners.cache.azure]
  [runners.docker]
    tls_verify = false
    image = "ubuntu:20.04"
    privileged = false
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["/var/run/docker.sock:/var/run/docker.sock", "/cache"]

And this CI pipeline in .gitlab-ci.yml:

image: docker:git

variables:
  TMPDIR: "${CI_PROJECT_DIR}.tmp"
  CCACHE_DIR: ${CI_PROJECT_DIR}/ccache

cache:
  key: "${CI_JOB_NAME}" # https://docs.gitlab.com/ee/ci/caching/#sharing-caches-across-different-branches
  paths:
    - ccache

# setup ros-industrial/industrial_ci

before_script:
  - apk add --update bash coreutils tar grep
  - git clone --quiet --depth 1 https://github.com/ros-industrial/industrial_ci .industrial_ci -b master

# setup the actual tests

noetic:
  script: .industrial_ci/gitlab.sh
  variables:  # https://github.com/ros-industrial/industrial_ci/blob/master/doc/index.rst#variables-you-can-configure
    ROS_DISTRO: noetic
    BUILDER: catkin_tools
    TARGET_WORKSPACE: . .repos
    ADDITIONAL_DEBS: git
    AFTER_INIT_EMBED: git config --global url.${CI_SERVER_PROTOCOL}://gitlab-ci-token:${CI_JOB_TOKEN}@${CI_SERVER_HOST}:${CI_SERVER_PORT}.insteadOf ${CI_SERVER_URL}  # https://github.com/ros-industrial/industrial_ci/pull/594#issue-561205512

But the pipeline fails with this error:

pull_docker_image 00:02
'pull_docker_image' returned with code '0' after 0 min 1 sec
Copy credentials: /root/.docker
/bin/bash: /builds/automated-driving/deep_lidar_grid_mapping/.industrial_ci/industrial_ci/src/run.sh: No such file or directory
Cleaning up file based variables 00:01
ERROR: Job failed: exit code 127

Does anybody have a working solution using a GitLab runner with Docker socket binding instead of Docker-in-Docker?