Open d-huck opened 8 months ago
Hey @d-huck, 👋 thanks for raising this! From the logs provided, it appears this is occurring when pushing an API resource.
I'm going to transfer this over to our API repository for better assistance. I also wanted to mention that you may need to remove the deployment-state.json file in the S3 deployment bucket, if present. On the amplify push command, could you try adding --debug for verbose logging?
@ykethan Thank you for the response. I had already attempted removing deployment-state.json after making the original post, to no avail. Running with debug gives the following message:
Stack:arn:aws:cloudformation:us-east-1:xxxxxxxxxxx:stack/amplify-vxxxo-main-130331/245af600-74f3-11ee-9f3e-0a3e5b9c2ce5 is in UPDATE_ROLLBACK_FAILED state and can not be updated.
PushResourcesFault: Stack:arn:aws:cloudformation:us-east-1:xxxxxxxxxx:stack/amplify-vxxxo-main-130331/245af600-74f3-11ee-9f3e-0a3e5b9c2ce5 is in UPDATE_ROLLBACK_FAILED state and can not be updated.
at AmplifyToolkit.pushResources (/snapshot/amplify-cli/build/node_modules/@aws-amplify/cli-internal/lib/extensions/amplify-helpers/push-resources.js:116:23)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async Object.executeAmplifyCommand (/snapshot/amplify-cli/build/node_modules/@aws-amplify/amplify-category-api/lib/index.js:231:9)
at async executePluginModuleCommand (/snapshot/amplify-cli/build/node_modules/@aws-amplify/cli-internal/lib/execution-manager.js:139:5)
before it resumes and hangs indefinitely. Attempting to continue the rollback in CloudFormation results in the same failure. Diving down the stack in CloudFormation, I see this error message:
The following resource(s) failed to update: [SubscriptiononDeleteUserResolver, UserdiscordIdResolver, DeleteUserResolver, UserownerResolver, CreateUserResolver, QuerygetUserByDiscordIdResolver, SubscriptiononUpdateUserResolver, SubscriptiononCreateUserResolver, UseremailResolver, GetUserResolver, ListUserResolver, UserphoneResolver, UpdateUserResolver].
which contains many of the resolvers for the object type I initially attempted to remove and am now attempting to restore. However, I can't push anything until this rollback completes, which is seemingly a catch-22.
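For anyone working through a long list like the one above, a small script can pull the logical IDs out of the status reason so the affected resolvers can be tracked one by one. This is a minimal sketch, assuming the message keeps the bracketed, comma-separated format quoted above; parse_failed_resources is a hypothetical helper, not part of the Amplify CLI or any AWS SDK.

```python
import re


def parse_failed_resources(message: str) -> list[str]:
    """Return the logical IDs listed in a 'failed to update: [...]' message.

    Hypothetical helper: it only understands the bracketed list format
    quoted in this thread and returns an empty list for anything else.
    """
    match = re.search(r"failed to update: \[(.*?)\]", message)
    if match is None:
        return []
    return [name.strip() for name in match.group(1).split(",") if name.strip()]


# Shortened copy of the status reason quoted above.
message = (
    "The following resource(s) failed to update: "
    "[SubscriptiononDeleteUserResolver, UserdiscordIdResolver, DeleteUserResolver]."
)

failed = parse_failed_resources(message)
for logical_id in failed:
    print(logical_id)
```

The resulting list is only a checklist; it does not change anything in CloudFormation.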
Creating fake resolvers may allow you to get out of the UPDATE_ROLLBACK_FAILED state. https://github.com/aws-amplify/amplify-category-api/issues/2157#issuecomment-1868341419
Thanks for the link. I followed #2157 and was able to get past that into UPDATE_ROLLBACK_COMPLETE on the stack. There was some more strange behavior following this. First, it gave me the "Cannot perform more than one GSI creation or deletion in a single update" error. I removed the index tag from the table I'm trying to remove and deleted the index from the DynamoDB console. After moving past that, I'm just stuck with an infinite
🛑 ["Index: 0 State: {\"deploy\":\"waitingForDeployment\"} Message: Resource is not in the state stackUpdateComplete"]
I've tried deleting the deployment-state from the deployment S3 bucket, but I'm not able to move forward with the last known good schema.
Are there any additional error messages on the CloudFormation console?
There are a few possible solutions in https://github.com/aws-amplify/amplify-category-api/issues/92.
@dpilch Thank you for the link. I had been up and down that thread and was hoping for solutions other than the ones that were proposed there. In the end, we ended up destroying the api and associated tables and rebuilt them fresh, which wasn't as catastrophic as it could have been considering we're at a very early stage. It seems like this problem is common among people who are making large, quick changes to their backend, so hopefully we won't be facing this in the future. I'll spare y'all the rant about this being unacceptable, because I assume y'all have read #92 in detail.
I'll leave our solution for anyone who may find themselves on this page in the future. First, if you've tried the normal things, don't hold out for a solution, just follow #92. Here are our resolution steps:
1. Run amplify function update, select the function, and ensure the API is unselected.
2. Back up the schema.graphql file, because the next step will remove it.
3. amplify remove api
4. Delete deployment-state.json if it exists.
5. amplify push
6. amplify add api. Select blank or template; you'll overwrite it in the next step.
7. Restore schema.graphql.
8. amplify push
Outside of the backup and restoration process, this whole process takes roughly 30 minutes.
I am sharing this because a similar event has occurred.
[ Problem ]
1. In schema.graphql, I added 5 GSIs for one table at the same time (the previous state was 0 GSIs).
2. Ran amplify push.
3. "Cannot perform more than one GSI creation or deletion in a single update" occurred and the deploy failed.
[ After problem ]
Tried amplify push, but on the initial-state deployment I got "Cannot perform more than one GSI creation or deletion in a single update".
[ Cause ]
The error was caused by the following difference.
As a result, there was a difference of more than 2 between the number of GSIs in the CloudFormation Stack and the number of GSIs in the CloudFormation Template that Amplify first attempts to deploy.
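To make this kind of mismatch visible before pushing, the GSI names in two templates can be compared per table. The following is a rough sketch, assuming both templates are already loaded as Python dicts in the standard CloudFormation layout (Resources, then Type AWS::DynamoDB::Table, then Properties.GlobalSecondaryIndexes with plain-string IndexName values). The function names, the UserTable logical ID, and the index names are all hypothetical.

```python
def gsi_names_by_table(template: dict) -> dict[str, set[str]]:
    """Map each DynamoDB table's logical ID to the set of its GSI names."""
    tables = {}
    for logical_id, resource in template.get("Resources", {}).items():
        if resource.get("Type") != "AWS::DynamoDB::Table":
            continue
        indexes = resource.get("Properties", {}).get("GlobalSecondaryIndexes", [])
        if not isinstance(indexes, list):
            indexes = []  # skip anything that is not a plain list of indexes
        tables[logical_id] = {index["IndexName"] for index in indexes}
    return tables


def gsi_changes(deployed: dict, candidate: dict) -> dict[str, list[tuple[str, str]]]:
    """List GSI deletions and creations needed to go from deployed to candidate.

    Only tables present in both templates are compared; tables that are
    added or removed entirely are ignored by this sketch.
    """
    before = gsi_names_by_table(deployed)
    after = gsi_names_by_table(candidate)
    changes = {}
    for table in sorted(before.keys() & after.keys()):
        deletes = [("delete", name) for name in sorted(before[table] - after[table])]
        creates = [("create", name) for name in sorted(after[table] - before[table])]
        if deletes or creates:
            changes[table] = deletes + creates
    return changes


def tables_over_limit(changes: dict[str, list[tuple[str, str]]]) -> list[str]:
    """Tables that would need more than one GSI creation or deletion at once."""
    return [table for table, ops in changes.items() if len(ops) > 1]


def table_template(index_names: list[str]) -> dict:
    """Build a tiny template holding only the fields this sketch reads."""
    indexes = [{"IndexName": name} for name in index_names]
    return {
        "Resources": {
            "UserTable": {
                "Type": "AWS::DynamoDB::Table",
                "Properties": {"GlobalSecondaryIndexes": indexes},
            }
        }
    }


# Hypothetical situation: the stack holds 3 GSIs, the template to deploy holds 0.
deployed = table_template(["byEmail", "byPhone", "byOwner"])
candidate = table_template([])

changes = gsi_changes(deployed, candidate)
over_limit = tables_over_limit(changes)
print(changes)
print(over_limit)
```

Any table reported by tables_over_limit would need its GSI changes split across several deployments, one creation or deletion at a time.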
[ Recovery ]
1. Download #current-cloud-backend.zip under the S3 bucket (amplify-appid-envname-xxxxx-deployment) (although Amplify officially forbids modification).
2. unzip #current-cloud-backend.zip
3. Match GlobalSecondaryIndexes and AttributeDefinitions in api/apiid/build/stacks/<tablename>.json to the CloudFormation Stack contents (3 GSIs).
4. In api/apiid/schema.graphql, set the target table's @index(...) to match the 3 GSIs in the CloudFormation Stack.
5. cd #current-cloud-backend
6. zip -r ... /#current-cloud-backend.zip *
7. amplify/#current-cloud-backend/)
Amplify CLI version: 12.8.2
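The unzip, edit, and re-zip steps above can also be done in one pass on a local copy of the archive. This is only a sketch under stated assumptions: rewrite_json_member and match_stack are hypothetical helpers, the member path follows the api/apiid/build/stacks/<tablename>.json pattern with made-up names, and the GSI entry is an incomplete placeholder (a real one also needs KeySchema and Projection, plus matching AttributeDefinitions). It works on local files only and does not touch S3.

```python
import json
import tempfile
import zipfile
from pathlib import Path
from typing import Callable


def rewrite_json_member(
    src_zip: Path, dst_zip: Path, member: str, transform: Callable[[dict], dict]
) -> None:
    """Copy src_zip to dst_zip, replacing one JSON member with transform(old)."""
    with zipfile.ZipFile(src_zip) as src:
        if member not in src.namelist():
            raise KeyError(f"{member} not found in {src_zip}")
        with zipfile.ZipFile(dst_zip, "w", zipfile.ZIP_DEFLATED) as dst:
            for info in src.infolist():
                data = src.read(info.filename)
                if info.filename == member:
                    updated = transform(json.loads(data))
                    data = json.dumps(updated, indent=2).encode("utf-8")
                # Every other member is copied through unchanged.
                dst.writestr(info, data)


def demo() -> dict:
    """Build a throwaway archive, patch one stack file, and read it back."""
    member = "api/demo/build/stacks/User.json"  # hypothetical path
    original = {
        "Resources": {
            "UserTable": {
                "Type": "AWS::DynamoDB::Table",
                "Properties": {"GlobalSecondaryIndexes": []},
            }
        }
    }

    def match_stack(template: dict) -> dict:
        # Placeholder edit: copy in whatever the deployed stack really contains.
        props = template["Resources"]["UserTable"]["Properties"]
        props["GlobalSecondaryIndexes"] = [{"IndexName": "byEmail"}]
        return template

    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "current.zip"
        dst = Path(tmp) / "patched.zip"
        with zipfile.ZipFile(src, "w") as archive:
            archive.writestr(member, json.dumps(original))
            archive.writestr("api/demo/schema.graphql", "type User { id: ID! }")
        rewrite_json_member(src, dst, member, match_stack)
        with zipfile.ZipFile(dst) as archive:
            names = archive.namelist()
            patched = json.loads(archive.read(member))
    return {"names": names, "patched": patched}


result = demo()
print(result["names"])
```

Writing to a separate output file keeps the downloaded archive intact as a backup in case the edit needs to be redone.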
How did you install the Amplify CLI?
npm
If applicable, what version of Node.js are you using?
21.2.0
Amplify CLI Version
12.10.1
What operating system are you using?
MacOS
Did you make any manual changes to the cloud resources managed by Amplify? Please describe the changes made.
The only manual changes are a few custom images for running lambdas online. Based on other issues similar to mine, I attempted to make dummy resolvers to clear the UPDATE_ROLLBACK_FAILED state to no avail. These dummy resolvers have been removed.
Describe the bug
When pushing updates from our dev environment to production, API building failed due to an object being removed from the API. The API push has an unfortunately large number of changes due to our frontend development lagging far behind. The commands used for pushing to main were:
Which failed after roughly 30 minutes of the CLI doing its thing. The result is that our CloudFormation stack is in UPDATE_ROLLBACK_FAILED and cannot be cleared out of this state.
After attempting to roll back, I reverted the schema to the last known working state and pushed, which results in an indefinite hang of the CLI. The last known working state does not remove the object in question. To further debug this, I pulled the main environment down using amplify pull, added a comment line to trigger a rebuild, and experienced the same behavior where the CLI hangs and does not move forward. Our production environment has been offline for 12 hours now, which is generally considered to be a bad thing.
Expected behavior
Pushing changes from one environment to another should work or at least leave things in a revertible state.
Reproduction steps
Not sure if this can be reproduced in an empty directory, I have never experienced this level of amplify failing before.
Project Identifier
a7f88f1c8eb39da02933e54e978f3c1e