martinthomson / i-d-template

A template for IETF internet draft git repositories

Possible problem with CircleCI in current docker image? #293

Closed SpencerDawkins closed 3 years ago

SpencerDawkins commented 3 years ago

The Cellar working group is suddenly seeing a problem with martinthomson/i-d-template in CircleCI that has popped up just in the past week or 10 days.

Our repo is https://github.com/ietf-wg-mops/draft-ietf-mops-streaming-opcons.

The actual errors reported by CircleCi were as follows:

#!/bin/bash -eo pipefail

make 'CLONE_ARGS=--reference ~/git-reference'
make: /bin/sh: Operation not permitted
git clone -q --depth 10 --reference ~/git-reference \
    -b main https://github.com/martinthomson/i-d-template lib
make: /bin/sh: Operation not permitted
Makefile:2: lib/main.mk: No such file or directory
make: *** [Makefile:9: lib/main.mk] Error 127

Exited with code exit status 2
CircleCI received exit code 2

I was seeing this in https://github.com/ietf-wg-mops/draft-ietf-mops-streaming-opcons/pull/75, since merged and closed, but @squarooticus created https://github.com/ietf-wg-mops/draft-ietf-mops-streaming-opcons/pull/76, which reverted the docker image and seems to have fixed the problem.

I know that @squarooticus was also talking to @GrumpyOldTroll, so tagging him here as well, to close the loop.

I just wanted to make sure someone knew :-)

martinthomson commented 3 years ago

cc @laurencelundblade

I've no idea what is going on here. The message seems to indicate that 'make' can't execute the shell. Circle is able to execute the shell itself, but make can't do the same, which is extremely bizarre. I'm also getting the same for the CircleCI tasks on this repo (which uses a similar but not identical image), so something is clearly busted.

I rebuilt the image myself and it was fine. I also downloaded the version that was failing and that was also fine.

Anyway, I've kicked off a task to rebuild the images (the first failed, likely thanks to flakiness on the IETF servers) and there should now be a new image active. Retriggering a build might work (after reverting the hash pin). Or you could move to GitHub Actions, where no issues are evident.

larseggert commented 3 years ago

Is there any benefit to even supporting Circle anymore? I routinely remove the config from my I-D repos now.

SpencerDawkins commented 3 years ago

@martinthomson - thanks for taking a look at this - and I appreciate the suggestion about GitHub Actions.

SpencerDawkins commented 3 years ago

@larseggert - you guys are the pros about this, but in my limited experience, the MOPS WG repo has had three "failed checks" episodes that I'm aware of, and two were legit (something that humans needed to fix with our draft, not CircleCI weirdness).

I'm only asking this here because I'm not sure where else to ask it - is there a right place for me to ask questions that might be of interest to other WG repo users/owners? I'd love to at least be aware of discussions about CircleCI, GitHub Actions, and continuous integration more broadly, and this issue in Martin's repo seems an odd place to monitor that discussion 😁 .

squarooticus commented 3 years ago

> Or you could move to GitHub Actions

I didn't realize this was now the recommended approach. I haven't kept up with developments in i-d CI, but just kept setting up repos the same way I always have been. I'll look into transitioning to GH Actions. Thanks.

chucklever commented 3 years ago

I'm hitting the same problem in https://github.com/chucklever/i-d-rpcrdma-version-two/ : The CircleCI build fails during the Build Drafts step with "make: /bin/sh: Operation not permitted". I am happy to attempt transitioning to GitHub Actions, but I know nothing about CircleCI or GH Actions. Is there a recipe somewhere I can follow to get back on the road?
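For anyone trying to tell this container failure apart from a genuine error in a draft, one option is to search a saved copy of the build log for the exact message quoted above. The following is a minimal sketch only: the function name `is_shell_exec_failure` and the idea of saving the CircleCI log to a local file are assumptions for illustration, not part of i-d-template.

```shell
# Hypothetical helper (not part of i-d-template): succeed if a saved CI log
# contains the "make cannot run /bin/sh" message reported in this issue.
# Usage: is_shell_exec_failure path/to/build.log
is_shell_exec_failure() {
    # -q: no output, exit status only; -F: treat the pattern as a fixed string.
    grep -qF 'make: /bin/sh: Operation not permitted' "$1"
}
```

A call such as `is_shell_exec_failure build.log && echo "container problem, not the draft"` would then flag the failure described here, while other build errors fall through for a human to look at.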

chucklever commented 3 years ago

> I am happy to attempt transitioning to GitHub Actions, but I know nothing about CircleCI or GH Actions. Is there a recipe somewhere I can follow to get back on the road?

Replying to leave breadcrumbs for others to follow...

In another i-d repo that I had created recently, I found a set of .github/workflows that @martinthomson already created for this purpose. Copying them to i-d-rpcrdma-version-two and disabling the CircleCI webhook (via the repo settings) seems to have straightened everything out.

mcr commented 3 years ago

> I am happy to attempt transitioning to GitHub Actions, but I know nothing about CircleCI or GH Actions. Is there a recipe somewhere I can follow to get back on the road?
>
> Replying to leave breadcrumbs for others to follow...
>
> In another i-d repo that I had created recently, I found a set of .github/workflows that @martinthomson already created for this purpose. Copying them to i-d-rpcrdma-version-two and disabling the CircleCI webhook (via the repo settings) seems to have straightened everything out.

I get this (operation not permitted) on a newly created repo (well, it's not really new, but we are doing 8366bis). So what is it that you are copying from/to?

martinthomson commented 3 years ago

The (twice-monthly) regular image build ran and that seems to have corrupted the image again. I still have no idea what is going on here, so I've re-run the build (which might paper over the problem for a couple of weeks) and I'll try to find time to work through this in more detail.

My current thinking is that this has something to do with operating as a non-root user, which is something CircleCI used to do. It might pay to just remove the user-related lines from the Dockerfile.

chucklever commented 3 years ago

> I get this (operation not permitted) on a newly created repo (well, it's not really new, but we are doing 8366bis). So what is it that you are copying from/to?

If you don't have access to an i-d repo that was recently instantiated, try something like this from your repo's top-level directory:

  1. Update your repo's local copy of the template infrastructure using "make update"
  2. Copy the files lib/template/.github/workflows/*.yml to .github/workflows/
  3. Using your favorite porcelain, "git add .github/workflows/*.yml", "git commit", then "git push origin" (salt to taste: you might prefer to commit the files to a separate branch and merge them instead)
  4. Disable the webhook for CircleCI (found on the Settings page for your repo)

@martinthomson Are there targets in lib/*.mk to refresh the workflows if you happen to fix bugs or add new workflow files?

mcr commented 3 years ago

> 2. Copy the files lib/template/.github/workflows/*.yml to .github/workflows/

This didn't update anything (my files were already up to date). I expect the problem is with this container.

martinthomson commented 3 years ago

I still don't understand what is going on with the image, but it is clearly broken. I've managed to create a CircleCI configuration that uses the same image as GitHub Actions. This appears to work.

To update:

make update
# or: git -C lib pull
cp lib/template/.circleci/config.yml .circleci/config.yml
git add .circleci/config.yml
git commit -m "Update CircleCI config" .circleci/config.yml
git push

To switch to GitHub Actions:

make update
# or: git -C lib pull
mkdir -p .github/workflows
cp lib/template/.github/workflows/* .github/workflows
git add .github/workflows
# -r is required because .circleci is a directory
git rm -rf .circleci
git commit -m "Switch to GitHub Actions"
git push

You should then go into the CircleCI dashboard and stop following the project. You might need to explicitly enable GitHub Actions also (though the default is for it to be enabled).

To @chucklever's question, I haven't got a target that does this automatically. Adding the above is possible.
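Until such a target exists, the copy step could be wrapped in a small shell function. This is a sketch under stated assumptions, not something the template provides: the name `refresh_workflows` is hypothetical, and it only covers the file copy from lib/template/.github/workflows into .github/workflows, leaving `make update`, the commit, and the push to the user.

```shell
# Hypothetical helper (not a target in lib/*.mk): copy the workflow files
# shipped in the template checkout into the repository's own workflow directory.
# Usage: refresh_workflows [repo_dir]   (defaults to the current directory)
refresh_workflows() {
    repo="${1:-.}"
    src="$repo/lib/template/.github/workflows"
    dst="$repo/.github/workflows"

    # Fail early if the template checkout is missing.
    if [ ! -d "$src" ]; then
        echo "refresh_workflows: $src not found; run 'make update' first" >&2
        return 1
    fi

    mkdir -p "$dst" || return 1

    # Copy each workflow file, overwriting any older copy, and report it.
    for f in "$src"/*; do
        [ -f "$f" ] || continue
        cp "$f" "$dst/" || return 1
        echo "updated $dst/$(basename "$f")"
    done
}
```

Running `refresh_workflows` from the top of a draft repository, followed by `git add .github/workflows` and a commit, would pick up any workflow files that have changed in the template since the last copy.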

martinthomson commented 3 years ago

OK, I can report that success there was a false alarm; it was pulling an older image. I'm investigating other options, but will ask for patience.

squarooticus commented 3 years ago

Switching to GitHub Actions using @martinthomson's instructions was completely trivial, so I would second that recommendation. I've done this for the OpCons repo (n.b. @SpencerDawkins) as well as my own.

SpencerDawkins commented 3 years ago

@squarooticus - noted! And Thank You. We're having an editors conference call later today.

martinthomson commented 3 years ago

OK, so lots of work was required, but I think that I squared this away for now. I have a working build on CircleCI that uses a recent image. (There are improvements to the build process on my end there, so it wasn't a complete waste of time.) If anyone encounters a similar issue, please let me know.

martinthomson commented 3 years ago

For anyone curious about the reason for this: it was a change in Alpine that interacted poorly with CircleCI. I think that it's fixed, but I will be forced to pin the Alpine version if this keeps happening. Please let me know if you have more problems.

martinthomson commented 3 years ago

Circle have updated their docker runners so we shouldn't see a recurrence of this issue (at least in this particular incarnation).