openpolicyagent/opa docker container fails to start on both github actions and circleci

jsteinberg-rbi commented 2 years ago

Short description

openpolicyagent/opa does not run on github actions or circleci.

Examples:

# from github actions

Checking docker version
Clean up resources from previous jobs
Create local container network
Starting job container
  /usr/bin/docker pull openpolicyagent/opa:0.45.0
  0.45.0: Pulling from openpolicyagent/opa
  79e0d8860fad: Pulling fs layer
  bf75762436b0: Pulling fs layer
  a1f1879bb7de: Pulling fs layer
  915d36c1c2c4: Pulling fs layer
  915d36c1c2c4: Waiting
  a1f1879bb7de: Download complete
  79e0d8860fad: Verifying Checksum
  79e0d8860fad: Download complete
  bf75762436b0: Verifying Checksum
  bf75762436b0: Download complete
  79e0d8860fad: Pull complete
  915d36c1c2c4: Verifying Checksum
  915d36c1c2c4: Download complete
  bf75762436b0: Pull complete
  a1f1879bb7de: Pull complete
  915d36c1c2c4: Pull complete
  Digest: sha256:385979a156dbe413fd894ffe8511ccd3f81a81858275c119d3984e68f604c53e
  Status: Downloaded newer image for openpolicyagent/opa:0.45.0
  docker.io/openpolicyagent/opa:0.45.0
  /usr/bin/docker create --name df5ad0956a2c43ea8948927db6e83883_openpolicyagentopa0450_86b83a --label 8d5581 --workdir /__w/cbaseline-observability-poc/baseline-observability-poc --network github_network_ef24d51fb6684ed6b0d9723b45597059  -e "HOME=/github/home" -e GITHUB_ACTIONS=true -e CI=true -v "/var/run/docker.sock":"/var/run/docker.sock" -v "/home/runner/work":"/__w" -v "/home/runner/runners/2.298.2/externals":"/__e":ro -v "/home/runner/work/_temp":"/__w/_temp" -v "/home/runner/work/_actions":"/__w/_actions" -v "/opt/hostedtoolcache":"/__t" -v "/home/runner/work/_temp/_github_home":"/github/home" -v "/home/runner/work/_temp/_github_workflow":"/github/workflow" --entrypoint "tail" openpolicyagent/opa:0.45.0 "-f" "/dev/null"
  2217fd32076eaa43d415abe3f6353f6e36e2b116184991c8b7ac21deb1f49053
  /usr/bin/docker start 2217fd32076eaa43d415abe3f6353f6e36e2b116184991c8b7ac21deb1f49053
  Error response from daemon: failed to create shim: OCI runtime create failed: runc create failed: unable to start container process: exec: "tail": executable file not found in $PATH: unknown
  Error: failed to start containers: 2217fd32076eaa43d415abe3f6353f6e36e2b116184991c8b7ac21deb1f49053
  Error: Docker start fail with exit code 1

Unfortunately, CircleCI does not provide any debug output. I have attached a screenshot of what CircleCI does show, to make the case that the opa image is pulled successfully but, as on GitHub, fails to start.

Fixing the GitHub Actions issue is, in a sense, straightforward: the runner is looking for tail, and running opa:$tag-debug solves that, although I'm not sure running the debug container is what the group wants. Regardless, running the debug image does not resolve the issue with CircleCI.
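
For reference, the -debug workaround described above might look like the following in a workflow file. This is an untested sketch: it assumes the -debug variant ships tail and a shell on its PATH, and it calls the binary by the absolute path /opa (an assumption about the image layout) rather than relying on PATH.

```yaml
name: container
on: push

jobs:
  opa:
    runs-on: ubuntu-latest
    container:
      # Assumption: the -debug variant includes tail and sh, which the runner needs.
      image: openpolicyagent/opa:0.45.0-debug
    steps:
      - name: Test opa version
        # Assumption: the binary lives at /opa inside the image.
        run: /opa version
```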

Fwiw openpolicyagent/conftest does run out of the box, as I'm sure most other openpolicyagent products do.

Steps To Reproduce

# in a github repo
# mkdir -p .github/workflows
# touch .github/workflows/opa.yml
# inside opa.yml write

name: container
on: push

jobs:
  opa:
    runs-on: ubuntu-latest
    container:
      image: openpolicyagent/opa:0.45.0
    steps:
      - name: Test opa version
        run: |
          opa version

Also on circleci

# create dummy circleci account
# inside repo:
# mkdir -p .circleci && touch .circleci/config.yml

version: 2.1

jobs:
  opa:
    docker:
      - image: openpolicyagent/opa
    steps:
      - run:
          command: opa version

workflows:
  opa-test:
    jobs:
      - opa

Expected behavior

The container starts and the opa binary is found.

Additional Thoughts

I'm raising this because, on the one hand, I appreciate that an OSS group can't cater to every system; I'd imagine that's utterly impossible. Still, GitHub Actions is probably going to be the world's most popular simple CI workflow tool, if it isn't already, and CircleCI is comparatively a giant in the industry. So at the very least I think updating the documentation to say, "hey, fyi, these containers don't work out of the box on these systems" would be very helpful. And maybe you can reach out to Circle and find a stable path forward, because I'm sure the opa container not running there without some hack is not what the group wants.

anderseknert commented 2 years ago

Good evening, Jonas 🙂

--entrypoint "tail" openpolicyagent/opa:0.45.0 "-f" "/dev/null"

Yeah, that's not going to work... and I don't think we'll want to "fix" that in OPA :(

This looks relevant.

I haven't worked with CircleCI yet, but for GitHub Actions, I think the common CI workflow wrt OPA uses the setup-opa task. Are there any benefits of running OPA as a container in this context?
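
A minimal sketch of the setup-opa approach mentioned above, for comparison with the container-based job. The action reference open-policy-agent/setup-opa@v2 and its version input are written from memory of the action's README and should be checked against the current release before use.

```yaml
name: opa
on: push

jobs:
  opa:
    runs-on: ubuntu-latest
    steps:
      # Assumption: action name, tag, and "version" input as documented in the setup-opa README.
      - name: Set up OPA
        uses: open-policy-agent/setup-opa@v2
        with:
          version: latest
      - name: Test opa version
        run: opa version
```

Here the job runs directly on the hosted runner, so no shell or tail is needed inside an OPA image.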

jsteinberg-rbi commented 2 years ago

thanks as usual @anderseknert, you're an amazing steward of this community.

I think using a container is generally a best practice now regardless of the system, because most people run all their workloads on scheduled container systems, whether serverless or Kubernetes. Everything ends up containerized, and I don't expect OPA to be any different, but I'll leave that to y'all.

One caveat: GitHub Actions was actually not my concern; I was merely using it as a proving ground against CircleCI.

CircleCI support did get back to me and gave a really excellent response. Personally, I think this is critical information that will trip up most people who aren't advanced users of container runtimes. Here's Circle's response:

Hi there,

Thank you for reaching out to CircleCI Support.
I am sorry to hear that you were unable to run jobs using the openpolicyagent/opa Docker images on CircleCI.

As you have observed, we would want to use the -debug Docker image variant (e.g., openpolicyagent/opa:0.45.0-debug).
This is needed when running a Docker executor job on CircleCI, since we need to execute the shell commands declared in a job's steps.

However, there is a nuance to using the openpolicyagent/opa:0.45.0-debug Docker image on CircleCI.
In particular, the entrypoint of the image points to the /opa binary, though a shell is available under sh.
We can confirm this by inspecting the image:

$ docker image inspect openpolicyagent/opa:0.45.0-debug | jq '.[0].Config | with_entries(select(.key | in({"Cmd":1, "Entrypoint":1, "Shell":1})))'
{
  "Cmd": [
    "run"
  ],
  "Entrypoint": [
    "/opa"
  ]
}
As such, to use this image, we would need to explicitly "tell" CircleCI to use sh as the entrypoint instead.

You can thus declare your job like this:

jobs:
  an-example:
    docker:
      - image: openpolicyagent/opa:0.45.0-debug
        # *-debug images from opa are distroless,
        # but offer shell at sh
        entrypoint: sh
    steps:
      - run: /opa version

I have made a sample project here for your reference: https://github.com/kelvintaywl-cci/openpolicyagent

We can also see the successful build here:
https://app.circleci.com/pipelines/github/kelvintaywl-cci/openpolicyagent/2/workflows/bcd1a0c6-505f-4993-981e-cb762780e8af/jobs/2

I hope this explains how Docker executor jobs are run on CircleCI.

For additional reference, I think our documentation on custom-built Docker images sheds some light on this too:
E.g., https://circleci.com/docs/custom-images/#adding-an-entrypoint

I think running OPA on Circle as a Docker workload, like practically every other workload that runs on Circle or any other CI, is a common thing, and some people have probably cracked this nut already. Were it me, I would add the excellent response above to the official OPA documentation. It could even be generalized into a best-guess document on why a distroless (non-debug) container might not run on a CI system, because I think this is going to be a problem on all of them, just as it is on GitHub Actions, GitLab and CircleCI.

But if this is such common knowledge that it's not worth documenting, that's fine with me too; whatever y'all think is best. Naturally, I'll leave closing the issue to you.
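
As a starting point for such a document, the job from Circle's response can be combined with the version and workflows keys from the original report into one complete .circleci/config.yml. This is a sketch assembled from the pieces in this thread, not an officially recommended configuration; the job and workflow names are just the ones used earlier.

```yaml
version: 2.1

jobs:
  opa:
    docker:
      - image: openpolicyagent/opa:0.45.0-debug
        # The image's entrypoint is /opa, so override it with sh
        # to let CircleCI execute the shell commands in the steps.
        entrypoint: sh
    steps:
      - run:
          name: Test opa version
          command: /opa version

workflows:
  opa-test:
    jobs:
      - opa
```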

anderseknert commented 2 years ago

Thank you for the kind words, @jsteinberg-rbi! ❤️

I'm not sure I agree about the claim of "any other CI" standardizing on using containerized workloads to run simple steps like e.g. eval or tests, but I have no evidence other than the projects from my own little bubble, and of course that this has not been reported as a problem in the past... but, FWIW, I agree that documentation is probably the best way to help ensure that OPA works seamlessly in CI/CD contexts for most of the build systems out there. There's an old issue (filed by me, even!) about adding a dedicated section/tutorial for this to our docs.

If I start by writing one for GitHub Actions, maybe you could amend it with a CircleCI section? 🙂

srenatus commented 2 years ago

One tangential thought, we could also tag the '-debug' images with '-ci' to indicate that they work for that specific use case (when the standard images don't)

jsteinberg-rbi commented 2 years ago

@anderseknert absolutely -- just @me in this issue and I'll submit my PR.

@srenatus that's a great idea.

stale[bot] commented 1 year ago

This issue has been automatically marked as inactive because it has not had any activity in the last 30 days.