Closed byashimov closed 6 months ago
Please fix it using a different approach. The fix is quite simple, we need to wait until the resource reaches
RUNNING
state before attempting to delete it. So we need to add a wait in theCreate
method, and wait until it transitions fromINITIALIZING
intoRUNNING
. The same fix needs to be applied over here, where we need to includeINITIALIZING
as one of the possible options for thePending
state.
It might not reach running state. On my tests the resource stuck in RESTARTING. I strongly believe that implementing the state machine here is a bad idea.
On my tests the resource stuck in RESTARTING.
Would you mind sharing how you were testing it? Because I think this might happen if there's some kind of a misconfiguration that took place in the resource.
On my tests the resource stuck in RESTARTING.
Would you mind sharing how you were testing it? Because I think this might happen if there's some kind of a misconfiguration that took place in the resource.
Here you go https://github.com/aiven/terraform-provider-aiven/actions/runs/7865584033/job/21458721920#step:6:19
Here you go https://github.com/aiven/terraform-provider-aiven/actions/runs/7865584033/job/21458721920#step:6:19
It says that the service was in the INITIALIZING
state, I don't see any mentions of RESTARTING
there. As I mentioned, a fix is to add INITIALIZING
state over here, and this issue won't occur anymore: https://github.com/aiven/terraform-provider-aiven/blob/main/internal/schemautil/wait.go#L34.
Either way, let’s get the hotfix applied to the other Flink resources besides the application deployment, and let’s get it merged.
Here you go https://github.com/aiven/terraform-provider-aiven/actions/runs/7865584033/job/21458721920#step:6:19
It says that the service was in the
INITIALIZING
state, I don't see any mentions ofRESTARTING
there. As I mentioned, a fix is to addINITIALIZING
state over here, and this issue won't occur anymore: https://github.com/aiven/terraform-provider-aiven/blob/main/internal/schemautil/wait.go#L34.Either way, let’s get the hotfix applied to the other Flink resources besides the application deployment, and let’s get it merged.
Please make sure you are checking application deployment status and not the service's.
Here you go https://github.com/aiven/terraform-provider-aiven/actions/runs/7865584033/job/21458721920#step:6:19
It says that the service was in the
INITIALIZING
state, I don't see any mentions ofRESTARTING
there. As I mentioned, a fix is to addINITIALIZING
state over here, and this issue won't occur anymore: https://github.com/aiven/terraform-provider-aiven/blob/main/internal/schemautil/wait.go#L34. Either way, let’s get the hotfix applied to the other Flink resources besides the application deployment, and let’s get it merged.Please make sure you are checking application deployment status and not the service's.
When I click on your link I see this:
Error: error cancelling Flink Application Deployment: 4[22](https://github.com/aiven/terraform-provider-aiven/actions/runs/7865584033/job/21458721920#step:6:23): {"errors":[{"message":"Impossible transition from status INITIALIZING to status CANCELLING_REQUESTED for deployment 3f59043c-a2d5-4c21-b15a-26419d6b2f4e","status":422}],"message":"Impossible transition from status INITIALIZING to status CANCELLING_REQUESTED for deployment 3f59043c-a2d5-4c21-b15a-26419d6b2f4e"} -
Am I missing something?
Here you go https://github.com/aiven/terraform-provider-aiven/actions/runs/7865584033/job/21458721920#step:6:19
It says that the service was in the
INITIALIZING
state, I don't see any mentions ofRESTARTING
there. As I mentioned, a fix is to addINITIALIZING
state over here, and this issue won't occur anymore: https://github.com/aiven/terraform-provider-aiven/blob/main/internal/schemautil/wait.go#L34. Either way, let’s get the hotfix applied to the other Flink resources besides the application deployment, and let’s get it merged.Please make sure you are checking application deployment status and not the service's.
When I click on your link I see this:
Error: error cancelling Flink Application Deployment: 4[22](https://github.com/aiven/terraform-provider-aiven/actions/runs/7865584033/job/21458721920#step:6:23): {"errors":[{"message":"Impossible transition from status INITIALIZING to status CANCELLING_REQUESTED for deployment 3f59043c-a2d5-4c21-b15a-26419d6b2f4e","status":422}],"message":"Impossible transition from status INITIALIZING to status CANCELLING_REQUESTED for deployment 3f59043c-a2d5-4c21-b15a-26419d6b2f4e"} -
Am I missing something?
Go check yourself. Running right now.
That's a great find, @byashimov! Looks like the SQL in the test is invalid, and it causes Flink to crash. I will adjust this behavior, and will add a separate test for this scenario.
About this change—what it does
Tests run fast and fail to delete Flink Application Deployment in transitional statuses. The state machine for this type of resource is quite complicated. Instead of trying to replicate the transition flow this PR proposes different approach: it calls deletion over and over until succeeds or exceeds context timeout.