spack / spackbot

Spack maintainer bot 🤖
https://spack.github.io/spackbot/

Add "retry" command to spackbot. #60

Open josephsnyder opened 2 years ago

josephsnyder commented 2 years ago

Utilize the "retry" API endpoint for pipelines to re-run the jobs that had failed during the most recent failed run of a pipeline.

It does require a new comment command for spackbot to listen for, plus a call to look up the ID of the most recently failed pipeline before triggering the jobs to be retried.
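
For reference, here is roughly the shape of the GitLab calls involved, as a minimal standalone sketch rather than spackbot's actual handler; the host, project ID, and token handling below are placeholders:

```python
# Minimal sketch (not spackbot's actual code): find the most recent failed
# pipeline for a ref and hit GitLab's pipeline "retry" endpoint, which
# re-runs only the failed jobs in that pipeline.
import aiohttp

GITLAB_API = "https://gitlab.example.com/api/v4"   # placeholder host
HEADERS = {"PRIVATE-TOKEN": "<GITLAB_TOKEN>"}      # PAT from the .env file


async def retry_latest_failed_pipeline(project_id: int, ref: str) -> dict:
    async with aiohttp.ClientSession(headers=HEADERS) as session:
        # List pipelines for the ref, newest first, restricted to failures.
        url = f"{GITLAB_API}/projects/{project_id}/pipelines"
        params = {"ref": ref, "status": "failed", "order_by": "id", "sort": "desc"}
        async with session.get(url, params=params) as resp:
            resp.raise_for_status()
            pipelines = await resp.json()
        if not pipelines:
            return {}

        pipeline_id = pipelines[0]["id"]
        retry_url = f"{GITLAB_API}/projects/{project_id}/pipelines/{pipeline_id}/retry"
        async with session.post(retry_url) as resp:
            resp.raise_for_status()
            return await resp.json()
```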

josephsnyder commented 2 years ago

@scottwittenburg, I've been testing this using a combination of the docker-compose system and a non-spack repository (https://github.com/josephsnyder/spack_docs/pull/1); I told my local instance of spackbot to listen on that repository.

I haven't gotten one to officially "work" via spackbot yet, since the branches there don't match any that would be on Spack's GitLab, but the retries do work when I manually hit the API endpoint on the development GitLab instance I run.

josephsnyder commented 2 years ago

I've added a few new parts: it now grabs all pipelines and won't try to restart anything if the most recent one is a success. It also tries to find the latest pipeline whose status is failed, skipped, or canceled.
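
The selection logic described above might look roughly like this as a standalone sketch (field names assume the GitLab pipelines API response; this is not the exact implementation in this PR):

```python
# Sketch of the pipeline-selection logic: given pipelines sorted newest-first,
# decide which one (if any) should be retried.
RETRYABLE = {"failed", "skipped", "canceled"}


def pick_pipeline_to_retry(pipelines: list[dict]) -> dict | None:
    if not pipelines:
        return None
    # If the most recent pipeline already succeeded, there is nothing to do.
    if pipelines[0].get("status") == "success":
        return None
    # Otherwise, take the newest pipeline that ended in a retryable state.
    for pipeline in pipelines:
        if pipeline.get("status") in RETRYABLE:
            return pipeline
    return None
```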

I can't quite test the processing, as I am still receiving a 404 when my copy of spackbot tries to query the GitLab projects API. I am able to see the project from a browser, and I have put my PAT into the GITLAB_TOKEN entry in the .env file. Can you think of something that I might have missed?
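
As an aside for narrowing down the 404: GitLab returns 404 both for projects that don't exist and for projects the token isn't authorized to see, and the project path has to be URL-encoded when it is used in place of the numeric ID. A standalone check of the same lookup outside spackbot (host and project path below are placeholders) can help tell those apart:

```python
# Standalone sanity check of the projects lookup, outside spackbot.
import requests
from urllib.parse import quote

GITLAB_API = "https://gitlab.example.com/api/v4"   # placeholder host
token = "<GITLAB_TOKEN>"                           # same PAT as in the .env file
project_path = quote("group/project", safe="")     # becomes "group%2Fproject"

resp = requests.get(
    f"{GITLAB_API}/projects/{project_path}",
    headers={"PRIVATE-TOKEN": token},
)
print(resp.status_code, resp.json().get("id") if resp.ok else resp.text)
```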

tldahlgren commented 1 year ago

Not sure/don't recall why this hasn't been merged, but if it is still active it needs the conflicts resolved.

scottwittenburg commented 1 year ago

This is a year and a half old, and I don't really recall why we wanted it. Right now it seems risky to expose this functionality, since we know that when you use the UI to retry jobs in GitLab, the job runs with the environment variables it was given when it was created. This effectively means you can't retry jobs created more than 12 hours ago, or the AWS credentials will have expired. Maybe we hadn't implemented credential rotation yet when this was written? I'm not sure.

At any rate, I think we should close this unless someone has a good argument otherwise.