Use Case
When provisioning nodes on GCP, the service frequently returns a 500, typically when the project requests many VMs at the same time. This means that if we provision 12 targets and one fails to provision, the entire workflow run fails, and we must examine whether it is a true failure or a provisioning failure. We would like the task to retry on failure to mitigate this.
Describe the Solution You Would Like
https://github.com/puppetlabs/provision/blob/main/tasks/provision_service.rb#L67-L82
This should retry a fixed number of times (2 failures in a row are uncommon) when it receives a 500 from the remote destination.
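
As a rough illustration of the behaviour requested here, something like the following could wrap the request. This is only a sketch under assumptions: `with_retries_on_500`, `max_retries`, `delay`, and `FakeResponse` are hypothetical names and are not taken from `provision_service.rb`. The block is assumed to return an object that responds to `code`, as `Net::HTTPResponse` does.

```ruby
# Hypothetical helper, not part of the provision module.
# Calls the block and retries while it returns a response with status 500.
def with_retries_on_500(max_retries: 2, delay: 0)
  attempts = 0
  loop do
    attempts += 1
    response = yield(attempts)
    # Net::HTTPResponse#code is a String such as "500", so normalise with to_i.
    return response unless response.code.to_i == 500
    # Retry budget spent: hand the last 500 back to the caller to report.
    return response if attempts > max_retries

    sleep(delay) if delay.positive?
  end
end

# Stand-in for a real HTTP response, used only for this demonstration.
FakeResponse = Struct.new(:code)

# The first call fails with a 500, the second succeeds.
codes = %w[500 200]
result = with_retries_on_500(max_retries: 2) { FakeResponse.new(codes.shift) }
```

With `max_retries: 2` the request is attempted at most three times, which matches the observation that two failures in a row are uncommon. Any status other than 500 is returned immediately, so true failures still surface on the first attempt.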
Describe Alternatives You've Considered
We have attempted looping the workflows in GitHub Actions on failure, but GitHub Actions really does not like this and our solutions are a little "hacky".