aws-amplify / amplify-category-api

The AWS Amplify CLI is a toolchain for simplifying serverless web and mobile development. This plugin provides functionality for the API category, allowing for the creation and management of GraphQL and REST based backends for your amplify project.
https://docs.amplify.aws/
Apache License 2.0

Failure to build stack #1854

Open DougalW opened 1 year ago

DougalW commented 1 year ago

How did you install the Amplify CLI?

npm

If applicable, what version of Node.js are you using?

19.5.0

Amplify CLI Version

12.4.0

What operating system are you using?

MacOS 12.6.3

Did you make any manual changes to the cloud resources managed by Amplify? Please describe the changes made.

None

Describe the bug

We have a schema.graphql with 54 tables and 31 enums. There are 40 many:many relations across the tables.

This is by no means a massive schema, and a good size for a moderately complex SaaS app of this type. I would expect that Amplify could handle this size schema.

We are experiencing four types of errors:

- the EMFILE error
- ConnectionStack errors
- "Limit on the number of resources in a single stack operation exceeded" errors
- Parameter count greater than the allowed maximum of 200

Failure modes occur with both amplify push and push with the force directive.

In detail:

Issue 1. - Have been getting this error after the schema table count went over around 40 (raised in support Case ID: 13727852561):

I tried to push a GraphQL schema update to Amplify with the following command: amplify push --force --allow-destructive-graphql-schema-updates. It keeps failing with the following error:

Deployment status
Deployment failed 04/09/2023, 19:45:10: EMFILE: too many open files, open '/tmp/amplify-07edc1f5-8083-426b-9391-a7ce62209b51/amplify/backend/api/ClimateDisclosure/build/states/01/resolvers/MetricsDataSet.Assets.res.vtl' .

It then rolls back my push. This issue has been raised before here: https://github.com/aws-amplify/amplify-studio/issues/414 and appeared to be resolved by increasing the ulimit on the Job container.
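For readers who control the process that launches the CLI, the ulimit change mentioned above can be sketched in Python with the standard `resource` module (Unix only). This is a minimal sketch, not the fix applied in the linked issue: the helper names and the default target of 4096 are hypothetical, and it only helps where the CLI runs as a child of the process that raised the limit, since child processes inherit it.

```python
import resource


def clamp_soft_limit(target, hard):
    """Return the soft limit to request: never above the hard limit."""
    if hard == resource.RLIM_INFINITY:
        return target
    return min(target, hard)


def raise_open_file_limit(target=4096):
    """Raise the soft open-file limit towards `target` (a hypothetical value).

    The limit is never lowered. If the OS rejects the request, the current
    limit is left unchanged. Returns the soft limit in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    new_soft = clamp_soft_limit(target, hard)
    if soft != resource.RLIM_INFINITY and new_soft > soft:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
        except (ValueError, OSError):
            # Some platforms cap the soft limit below the reported hard limit.
            pass
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]
```

After calling `raise_open_file_limit()`, launching `amplify push` from the same script (for example with `subprocess.run`) would run the CLI under the raised limit. Whether that is enough for a schema of this size is not something this thread confirms.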

Issue 2. - This error is intermittent.

E.g. when creating a totally new stack:

Embedded stack arn:aws:cloudformation:ap-southeast-2:253256353881:stack/amplify-climatedisclosure-deleteme-130810-apiClimateDisclosure-YGEXC2DFOWA3/b42ad8e0-5040-11ee-864c-026a6ec0826a was not successfully created: The following resource(s) failed to create: [ConnectionStack].

...and...

Name: ConnectionStack (AWS::CloudFormation::Stack), Event Type: create, Reason: Embedded stack arn:aws:cloudformation:ap-southeast-2:253256353881:stack/amplify-climatedisclosure-deleteme-130810-apiClimateDisclosure-YGEXC2D-ConnectionStack-U6F63L520F4L/6fad0390-5041-11ee-8807-0ab22e2dcf8a was not successfully created: Limit on the number of resources in a single stack operation exceeded, IsCustomResource: false

Issue 3. - Have been getting this error after the schema table count went over 50:

amplify push --force --allow-destructive-graphql-schema-updates kept failing:

Reason: Embedded stack arn:aws:cloudformation:ap-southeast-2:253256353881:stack/amplify-climatedisclosure-dev-112157-apiClimateDisclosure-KQIOIHG9OUB-ConnectionStack-YZNJE7CIQID9/2a714690-e293-11ed-8126-0a25b4139496 was not successfully updated. Currently in UPDATE_ROLLBACK_IN_PROGRESS with reason: Limit on the number of resources in a single stack operation exceeded


Issue 4. - Have been getting this error after the schema table count went over around 50:

"ResourceType": "AWS::CloudFormation::Stack",
            "Timestamp": "2023-09-06T00:16:36.119000+00:00",
            "ResourceStatus": "UPDATE_FAILED",
            "ResourceStatusReason": "Template format error: Parameter count 214 is greater than max allowed 200",
            "ResourceProperties": "{\"TemplateURL\":\"https://s3.ap-southeast-2.amazonaws.com/amplify-climatedisclosure-dev-112157-deployment/amplify-appsync-files/53f8e3dffea7a051e934983e6a45f6b8f8619c5d/stacks/ConnectionStack.json\",\"Parameters\":

We also see other errors like this:

🛑 Cannot iteratively rollback as the following step does not contain a previousMetaKey: {"status":"DEPLOYING"}

Learn more at: https://docs.amplify.aws/cli/project/troubleshooting/

IterativeRollbackError: Cannot iteratively rollback as the following step does not contain a previousMetaKey: {"status":"DEPLOYING"}
    at runIterativeRollback (/snapshot/amplify-cli/build/node_modules/@aws-amplify/amplify-provider-awscloudformation/lib/iterative-deployment/iterative-rollback.js:44:13)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Object.run (/snapshot/amplify-cli/build/node_modules/@aws-amplify/amplify-provider-awscloudformation/lib/push-resources.js:127:9)
    at async /snapshot/amplify-cli/build/node_modules/@aws-amplify/cli-internal/lib/extensions/amplify-helpers/push-resources.js:137:16
    at async Promise.all (index 0)
    at async providersPush (/snapshot/amplify-cli/build/node_modules/@aws-amplify/cli-internal/lib/extensions/amplify-helpers/push-resources.js:133:5)
    at async AmplifyToolkit.pushResources (/snapshot/amplify-cli/build/node_modules/@aws-amplify/cli-internal/lib/extensions/amplify-helpers/push-resources.js:107:13)
    at async Object.executeAmplifyCommand (/snapshot/amplify-cli/build/node_modules/@aws-amplify/cli-internal/lib/index.js:194:9)
    at async executePluginModuleCommand (/snapshot/amplify-cli/build/node_modules/@aws-amplify/cli-internal/lib/execution-manager.js:139:5)
    at async executeCommand (/snapshot/amplify-cli/build/node_modules/@aws-amplify/cli-internal/lib/execution-manager.js:37:9)
    at async Object.run (/snapshot/amplify-cli/build/node_modules/@aws-amplify/cli-internal/lib/index.js:121:5)

Session Identifier: 8ef8a9c1-ec0f-41f0-b8b3-f767ff85312e

and this:

Deploying iterative update 2 of 2 into schemadev environment. This will take a few minutes. ⠼
Deploying api ClimateDisclosure [ =--------------------------------------- ] 2/107
    GraphQLAPIDefaultApiKey215A6D… AWS::AppSync::ApiKey           UPDATE_IN_PROGRESS             Wed Sep 06 2023 11:23:06…
    GraphQLAPITransformerSchema3C… AWS::AppSync::GraphQLSchema    UPDATE_COMPLETE                Wed Sep 06 2023 11:23:41…
🛑 table name should be passed

Resolution: Please report this issue at https://github.com/aws-amplify/amplify-cli/issues and include the project identifier from: 'amplify diagnose --send-report'
Learn more at: https://docs.amplify.aws/cli/project/troubleshooting/

UnknownNodeJSFault: table name should be passed
    at nodeErrorToAmplifyException (/snapshot/amplify-cli/build/node_modules/@aws-amplify/cli-internal/lib/amplify-exception-handler.js:144:12)
    at handleException (/snapshot/amplify-cli/build/node_modules/@aws-amplify/cli-internal/lib/amplify-exception-handler.js:24:28)
    at process.<anonymous> (/snapshot/amplify-cli/build/node_modules/@aws-amplify/cli-internal/lib/index.js:52:93)
    at process.emit (node:events:513:28)
    at process.processEmit [as emit] (/snapshot/amplify-cli/build/node_modules/signal-exit/index.js:199:34)
    at process._fatalException (node:internal/process/execution:149:25)

table name should be passed
AssertionError [ERR_ASSERTION]: table name should be passed
    at DeploymentManager.getTableStatus (/snapshot/amplify-cli/build/node_modules/@aws-amplify/amplify-provider-awscloudformation/lib/iterative-deployment/deployment-manager.js:325:28)
    at invokeFunc (/snapshot/amplify-cli/build/node_modules/lodash.throttle/index.js:160:19)

Each of these errors has increased in frequency as the table count increased.

These seem to be caused by Amplify not keeping the CloudFormation stacks it creates within CloudFormation resource limits. Some of these have been raised in these issues:

- https://github.com/aws-amplify/amplify-cli/issues/9762#issuecomment-1116445920 (@Straubulous closed that issue on 4 May 2022 but we are still getting both errors with the latest version of the CLI).
- https://github.com/aws-amplify/amplify-studio/issues/414

But there is no real fix. I've seen this suggested, but tbh it feels like a hack and doesn't work around the resource limit issue: https://docs.amplify.aws/cli/graphql/override/#place-appsync-resolvers-in-custom-named-stacks

Please advise what to do - this makes Amplify impractical for any moderately sophisticated schema. The resource issue also makes Amplify impractical because we will very quickly hit the per-account resource limit.

Expected behavior

Amplify should successfully parse the schema and generate the needed backend resources. It currently parses the schema just fine, but fails when creating the resources.

Reproduction steps

Reproducing requires trying to run a build of our project - reach out to me for access to this.

Project Identifier

5c7bc147366fc1111fe82b901e8f35f8

Additional information

No response

DougalW commented 1 year ago

Correction: 55 tables

DougalW commented 1 year ago

I reduced the number of tables to 52, pushed again, and it failed:

🛑 The following resources failed to deploy:
Resource Name: ConnectionStack (AWS::CloudFormation::Stack)
Event Type: update
Reason: Embedded stack arn:aws:cloudformation:ap-southeast-2:253256353881:stack/amplify-climatedisclosure-schemadev-184417-apiClimateDisclosure-1JJNPZ-ConnectionStack-ZGUHWWI1V3D6/0e470f50-3e5e-11ee-a46f-02d236a3bb2e was not successfully updated. Currently in UPDATE_ROLLBACK_IN_PROGRESS with reason: Limit on the number of resources in a single stack operation exceeded

URL: https://console.aws.amazon.com/cloudformation/home?region=ap-southeast-2#/stacks/arn%3Aaws%3Acloudformation%3Aap-southeast-2%3A253256353881%3Astack%2Famplify-climatedisclosure-schemadev-184417-apiClimateDisclosure-1JJNPZWB03U59%2Fe0d43120-3e5c-11ee-a569-02e6440f255c/events

🛑 Resource is not in the state stackUpdateComplete

Name: ConnectionStack (AWS::CloudFormation::Stack), Event Type: update, Reason: Embedded stack arn:aws:cloudformation:ap-southeast-2:253256353881:stack/amplify-climatedisclosure-schemadev-184417-apiClimateDisclosure-1JJNPZ-ConnectionStack-ZGUHWWI1V3D6/0e470f50-3e5e-11ee-a46f-02d236a3bb2e was not successfully updated. Currently in UPDATE_ROLLBACK_IN_PROGRESS with reason: Limit on the number of resources in a single stack operation exceeded, IsCustomResource: false

Learn more at: https://docs.amplify.aws/cli/project/troubleshooting/

Session Identifier: 38b21643-f8e0-44f5-ab7d-a3c7acddf7de

I can email the report if needed

DougalW commented 1 year ago

We also tried the recommended approach in:

https://docs.amplify.aws/cli/graphql/override/#place-appsync-resolvers-in-custom-named-stacks

But this failed with Reason: Circular dependency between resources:

same as in https://github.com/aws-amplify/amplify-category-api/issues/32

ykethan commented 1 year ago

Hey @DougalW, 👋 thanks for raising this! I'm going to transfer this over to our API repository for better assistance 🙂.

dpilch commented 1 year ago

Here is a possible solution to unblock the custom named stack workaround.

Consider this example:

"StackMapping": {
  "GetAResolver": "StackFoo",
  "GetBResolver": "StackBar",
  "GetCResolver": "StackFoo" // creates circular dependency
}

With this configuration I have created a circular dependency.

- StackFoo depends on StackBar via the A to B relationship.
- StackBar depends on StackFoo via the B to C relationship.

Changing GetCResolver to StackBar would remove the circular dependency.
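The reasoning in this example can be checked mechanically before pushing. The following is a minimal Python sketch, not part of Amplify: it assumes you can list which resolver depends on which (the `resolver_edges` argument is hypothetical input you would have to derive from your own schema), collapses those edges to stack level using the StackMapping, and looks for a cycle with a depth-first search.

```python
from collections import defaultdict


def stack_dependencies(stack_mapping, resolver_edges):
    """Collapse resolver-level dependencies into stack-level dependencies.

    stack_mapping: resolver name -> stack name (as in "StackMapping").
    resolver_edges: (resolver, resolver_it_depends_on) pairs; hypothetical input.
    """
    graph = defaultdict(set)
    for src, dst in resolver_edges:
        src_stack, dst_stack = stack_mapping[src], stack_mapping[dst]
        if src_stack != dst_stack:  # same-stack dependencies cannot form a cycle
            graph[src_stack].add(dst_stack)
    return graph


def has_cycle(graph):
    """Return True if the stack dependency graph contains a cycle."""
    unvisited, in_progress, done = 0, 1, 2
    state = defaultdict(int)  # every stack starts as unvisited

    def visit(stack):
        state[stack] = in_progress
        for dep in graph.get(stack, ()):
            if state[dep] == in_progress:
                return True  # reached a stack that is still on the current path
            if state[dep] == unvisited and visit(dep):
                return True
        state[stack] = done
        return False

    return any(state[s] == unvisited and visit(s) for s in list(graph))


# The A -> B -> C relationships from the example above.
EDGES = [("GetAResolver", "GetBResolver"), ("GetBResolver", "GetCResolver")]
BROKEN = {"GetAResolver": "StackFoo", "GetBResolver": "StackBar", "GetCResolver": "StackFoo"}
FIXED = {"GetAResolver": "StackFoo", "GetBResolver": "StackBar", "GetCResolver": "StackBar"}
```

With these inputs, `has_cycle(stack_dependencies(BROKEN, EDGES))` reports the circular dependency, while the `FIXED` mapping does not produce one.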

chrisbonifacio commented 12 months ago

Hi 👋 Closing this as we have not heard back from you. If you are still experiencing this issue and in need of assistance, please feel free to comment and provide us with any information previously requested by our team members so we can re-open this issue and be better able to assist you.

Thank you!

DougalW commented 12 months ago

I wasn't aware anyone had requested additional information! Please point me to where this has been requested.

@dpilch provided a suggestion, which we looked at, but this will only solve the connection stack issue and won't alleviate the resource issue (Amplify creates too many CloudFormation resources). This approach is also very fragile and manual.

We have instead been pursuing an approach to reduce the dimensionality of our data model, which is very far from ideal. I will report back once we have better data.

What I was hoping to see is a response to reduce the number of resources Amplify creates, built in to Amplify itself. It's not reasonable to expect your customers to keep making these fundamental tweaks just to get something to deploy successfully.

I don't think you should close this issue - it's clearly still there and experienced by others, and something you need to fix in Amplify otherwise your customers will stop using Amplify for anything more than toy apps.

DougalW commented 12 months ago

I also just saw this: https://github.com/aws-amplify/amplify-category-api/issues/1859, which is related to the issue we're facing, so I don't understand why you have closed my issue.

chrisbonifacio commented 11 months ago

Hi @DougalW 👋 I closed the issue simply because we hadn't received a response within a week but we will always re-open an issue once we do. I understand the provided solution is only a workaround and does not result in a good developer experience. I will reach out to the team with the feedback provided so that we can come to a better solution.

BBopanna commented 8 months ago

+1 to the whole paradigm of amplify+appsync falling apart around 50+ tables - AWS team, how do you expect enterprise grade apps to scale with these kinds of limitations? And on top of that, there is a RUSH to close valid scenarios. Workarounds are just cumbersome and painful, and organizations question this tech stack offered by you, Amplify!! Pathetic to say the least, Amplify team!

INCREASE THE DAMN LIMITS AWS AMPLIFY TEAM !

Amirbahal commented 8 months ago

Have you tried removing the amplify folder in your project and running an amplify pull again? :)

chirpavel commented 1 month ago

I have the same error while deploying:

2024-08-09T06:15:18.144Z [INFO]: 🛑 The following resources failed to deploy:
2024-08-09T06:15:18.153Z [INFO]: Resource Name: BranchPublicInfo (AWS::CloudFormation::Stack)
Event Type: update
Reason: Embedded stack arn:aws:cloudformation:eu-central-1:675795832684:stack/amplify-XXXXX-main-XXXXX-apiXXXXX-R5I7DCBEDRZM-BranchPublicInfo-RO7LC76AQ9ON/97908130-5300-11ef-b9c9-06184733758b was not successfully updated. Currently in UPDATE_ROLLBACK_IN_PROGRESS with reason: Limit on the number of resources in a single stack operation exceeded

URL: https://console.aws.amazon.com/cloudformation/home?region=eu-central-1#/stacks/arn%3Aaws%3Acloudformation%3Aeu-central-1%3A675795832684%3Astack%2Famplify-XXXXX-main-XXXXX-apiXXXXX-R5I7DCBEDRZM%2Ffb232ec0-5182-11ef-976b-0257b97cb813/events

🛑 Resource is not in the state stackUpdateComplete

Name: BranchPublicInfo (AWS::CloudFormation::Stack), Event Type: update, Reason: Embedded stack arn:aws:cloudformation:eu-central-1:675795832684:stack/amplify-XXXXX-main-XXXXX-apiXXXXX-R5I7DCBEDRZM-BranchPublicInfo-RO7LC76AQ9ON/97908130-5300-11ef-b9c9-06184733758b was not successfully updated. Currently in UPDATE_ROLLBACK_IN_PROGRESS with reason: Limit on the number of resources in a single stack operation exceeded, IsCustomResource: false

And I can't understand which limit exactly. There are no additional details in CloudFormation, just the same text and the status UPDATE_FAILED. I don't understand what I violated, so that I can think about it, improve it and fix it. I have very few tables, about 10, and 3 functions. That's it, we can't move anywhere again, a dead end.

chirpavel commented 1 month ago

How can I understand which limit I have violated? What is the limit value? How can I view or manually count the number of elements that fall under this limit?

There should be some hints in the logs. This will reduce the number of requests and tasks, people will be able to do something themselves.
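One way to count by hand is to tally the `Resources` and `Parameters` sections of each generated CloudFormation template. The Python sketch below assumes the templates are JSON files in a single directory (for example a local copy of the `stacks/ConnectionStack.json` files referenced earlier in this thread; the exact local path is an assumption). The function names are hypothetical, and the threshold is supplied by the caller: the only limit value quoted in this thread is the 200-parameter maximum from the "Parameter count 214 is greater than max allowed 200" error. The thread does not establish which count the "single stack operation" message refers to, so treat the resource totals as a starting point for investigation.

```python
import json
from pathlib import Path


def count_template_sections(template):
    """Return (resource_count, parameter_count) for one parsed template."""
    resources = template.get("Resources") or {}
    parameters = template.get("Parameters") or {}
    return len(resources), len(parameters)


def summarise_stack_dir(stack_dir):
    """Count resources and parameters for every *.json template in a directory."""
    summary = {}
    for path in sorted(Path(stack_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            summary[path.name] = count_template_sections(json.load(handle))
    return summary


def over_parameter_limit(summary, max_parameters):
    """List templates whose parameter count exceeds the caller-supplied limit."""
    return sorted(
        name for name, (_, parameters) in summary.items()
        if parameters > max_parameters
    )
```

For example, `over_parameter_limit(summarise_stack_dir("path/to/stacks"), 200)` would list the templates that exceed the parameter maximum quoted above, and summing the first element of each tuple in the summary gives the total number of resources across the templates.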