project-copacetic / copa-action

:octocat: GitHub Action for Copacetic: Directly patch container image vulnerabilities
https://project-copacetic.github.io/copacetic/website/
MIT License

feat: Add support for retrying copa patch on failure/timeout #49

Open SaptarshiSarkar12 opened 1 week ago

SaptarshiSarkar12 commented 1 week ago

Problem

Because of network issues, copa patch often fails with a timeout. Re-running the GitHub Actions job by hand after every failure is tedious.

Solution proposed

We can add an optional GitHub Actions input, max_attempts, that sets the maximum number of times copa patch is run when it fails. In addition, at the end of the step the workflow can print how many attempts were needed before it succeeded.
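To make the proposal concrete, here is a minimal sketch of the retry behaviour in Python. It is only an illustration under assumptions, not how the action is implemented: the function name run_with_retries and the delay_seconds parameter are hypothetical, and the command passed in stands in for whatever copa patch invocation the action builds.

```python
import subprocess
import sys
import time
from typing import Sequence, Tuple


def run_with_retries(
    cmd: Sequence[str],
    max_attempts: int = 3,
    delay_seconds: float = 0.0,
) -> Tuple[bool, int]:
    """Run cmd until it exits with status 0 or max_attempts is used up.

    Returns (succeeded, attempts_used). The name and parameters are
    hypothetical; max_attempts mirrors the input proposed in this issue.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(cmd)
        if result.returncode == 0:
            # Report how many attempts were needed, as the proposal suggests.
            print(f"Succeeded after {attempt} attempt(s)")
            return True, attempt

        print(
            f"Attempt {attempt} of {max_attempts} failed "
            f"with exit code {result.returncode}",
            file=sys.stderr,
        )
        # Optionally wait before the next attempt, but not after the last one.
        if attempt < max_attempts and delay_seconds > 0:
            time.sleep(delay_seconds)

    return False, max_attempts
```

A caller would pass the patch command as a list of arguments and fail the step when the first returned value is False. With max_attempts set to 1 the behaviour is the same as running the command once, so the input could stay optional.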

Additional Information

I would like to work on this issue if the maintainers approve it.

ashnamehrotra commented 1 week ago

@SaptarshiSarkar12 thanks for the issue! I think a better solution to the timeout error would be changing the timeout arg in copa, which we already support through the copa action. If the timeout is not changed from the default (5 min) and copa patch is failing due to the network, running it multiple times will still result in the same error.

SaptarshiSarkar12 commented 1 week ago

@ashnamehrotra I have changed the timeout to 10 minutes, but it still reports a timeout error. I am currently using retry-action, which seems to handle the issue properly by re-running the step when it fails. But that is a third-party solution, and I was looking for an official one from Copa. You can check these workflow runs, where that retry action automatically re-ran the failed step :point_down:

and many more. I hope these workflow runs are enough to show why this feature is needed. Please let me know your views on it.

ashnamehrotra commented 1 week ago

@SaptarshiSarkar12 that's interesting. Out of curiosity, if you use the default and remove docker/setup-buildx-action@v3.6.1, does it still result in the same timeout error?

SaptarshiSarkar12 commented 1 week ago

@ashnamehrotra With that change the patches work totally fine, but the overall workflow fails because the docker buildx installed by default does not support cache export. Also, the timeout problem occurs in the dev-docker-build workflow but not in the docker-publish workflow, because the patch of oraclelinux starts at a different time in each, approximately 2 minutes apart.

ashnamehrotra commented 3 days ago

@SaptarshiSarkar12 got it, I will assign this issue to you. Thanks!

SaptarshiSarkar12 commented 2 days ago

Thank you @ashnamehrotra for assigning the issue to me :smile:.