rockcarver / frodo-cli

A CLI to manage ForgeRock platform deployments supporting Identity Cloud tenants, ForgeOps deployments, and classic deployments.
MIT License

ESV Apply, while waiting for restart, can terminate with an HTTP 500 error #218

Closed ashleyfrieze closed 1 year ago

ashleyfrieze commented 1 year ago

Frodo CLI version

0.23.0

Describe the issue

We have been trying to apply some variable changes to our staging tenant. The tenant seems to take about 500 seconds to restart during the apply process. However, we're seeing a failure rate of about 95% in the Frodo CLI during restarts.

Almost always, the job aborts about 430 seconds into the process, after receiving a 500 error from the remote server while polling for restart status. This could indicate an issue in ForgeRock Identity Cloud, which does not serve valid statuses 100% of the time, but Frodo should also tolerate it: once the pipeline has failed at this point, our only option is to rerun it and hope the error doesn't happen, and an error-free run is rare.


Perhaps Frodo could allow a limited number of retries for HTTP 500 errors inside the wait loop. Frodo appears able to poll the server again immediately afterwards and find that it is still restarting: if we run apply again straight away, we are told that the restart from the previous run is still in progress.

Note: the error is specifically happening in the middle of the polling period.
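
The retry idea suggested above could look roughly like the following. This is only a minimal sketch, not Frodo's actual code: `waitForRestart`, `getStatus`, `httpError`, `httpStatusOf`, and the option names are all hypothetical, and it assumes the HTTP client throws an error carrying a numeric `status` property (adjust the check for whatever the real client exposes).

```typescript
// Hypothetical sketch: a restart wait loop that tolerates transient HTTP 5xx errors.
// None of these names come from Frodo; they are illustrative only.
type RestartStatus = "restarting" | "ready";

interface PollOptions {
  intervalMs: number; // delay between polls
  timeoutMs: number; // overall time budget for the restart
  maxConsecutive5xx: number; // tolerated consecutive server errors
}

// Assumption: failed requests surface as errors with a numeric `status` field.
function httpError(status: number): Error & { status: number } {
  const err = new Error(`HTTP ${status}`) as Error & { status: number };
  err.status = status;
  return err;
}

function httpStatusOf(err: unknown): number | undefined {
  if (typeof err === "object" && err !== null && "status" in err) {
    const status = (err as { status: unknown }).status;
    return typeof status === "number" ? status : undefined;
  }
  return undefined;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function waitForRestart(
  getStatus: () => Promise<RestartStatus>,
  opts: PollOptions
): Promise<void> {
  const deadline = Date.now() + opts.timeoutMs;
  let consecutiveErrors = 0;
  while (Date.now() < deadline) {
    try {
      const status = await getStatus();
      consecutiveErrors = 0; // any valid response resets the error budget
      if (status === "ready") return;
    } catch (err) {
      const status = httpStatusOf(err);
      const isServerError = status !== undefined && status >= 500;
      if (!isServerError) throw err; // e.g. 401/403: fail fast, retrying won't help
      consecutiveErrors += 1;
      if (consecutiveErrors > opts.maxConsecutive5xx) throw err; // budget exhausted
    }
    await sleep(opts.intervalMs);
  }
  throw new Error(`Restart did not complete within ${opts.timeoutMs} ms`);
}
```

Counting consecutive rather than total errors means a single bad response in the middle of a long polling period does not abort the run, while a server that keeps failing still ends the wait after a bounded number of attempts.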

vscheuber commented 1 year ago

I have noticed that myself, and it appears to have gotten worse. When I originally implemented the feature, I would run into this 1 out of 10 times. Now the ratio is almost reversed...

I have considered addressing it in the same way you suggest, @ashleyfrieze, and will look into it.

If you are interested: put up a PR :)