aws / codecatalyst-blueprints

CDKDeployAction fails for gen-ai-chatbot - "no space left on device" #577

Open jamesfreeman959 opened 2 weeks ago

jamesfreeman959 commented 2 weeks ago

Describe the bug

I attempted to install the gen-ai-chatbot workflow today in an almost blank AWS account, in us-west-1. Everything goes well until the CDKDeployAction stage, where the workflow fails. The workflow logs contain these entries:

Error processing tar file(exit status 1): write /usr/local/lib/python3.11/site-packages/torch/lib/libtorch_cpu.so: no space left on device
BedrockChatBotStack-dli7sh11:  fail: docker build --tag cdkasset-6a370a2d2473385a0c53557b9dd3dc3e038d66b41d2330b7bbddb9dfda48251f --file embedding.Dockerfile --platform linux/amd64 . exited with error code 1: DEPRECATED: The legacy builder is deprecated and will be removed in a future release.
Install the buildx component to build images with BuildKit:
https://docs.docker.com/go/buildx/

Error processing tar file(exit status 1): write /usr/local/lib/python3.11/site-packages/torch/lib/libtorch_cpu.so: no space left on device

❌ Deployment failed: Error: Failed to build asset 6a370a2d2473385a0c53557b9dd3dc3e038d66b41d2330b7bbddb9dfda48251f:current_account-us-west-2
at Deployments.buildSingleAsset (/usr/local/npm/lib/node_modules/aws-cdk/lib/index.js:430:11312)
at runMicrotasks (<anonymous>)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at async Object.buildAsset (/usr/local/npm/lib/node_modules/aws-cdk/lib/index.js:430:196295)
at async /usr/local/npm/lib/node_modules/aws-cdk/lib/index.js:430:180311
Failed to build asset 6a370a2d2473385a0c53557b9dd3dc3e038d66b41d2330b7bbddb9dfda48251f:current_account-us-west-2

::set-output name=ACTION_RUN_SUMMARY::[{text:CDK_DEPLOY_COMMAND_ERROR,level:Error,message:"The AWS CDK deploy action failed to perform one or more commands. Check the action logs for more information."}]
Error: The AWS CDK deploy action failed to perform one or more commands. Check the action logs for more information.

Is there something that needs to be set or changed for this to work correctly?

Steps to reproduce

Steps to reproduce the behavior:

  1. Enable Bedrock models in us-west-1
  2. Set up a new CodeCatalyst environment in us-west-1, set up a new space.
  3. Select "Start with a blueprint"
  4. Select the "LLM Playground" (currently showing version 0.3.133)
  5. Deploy the gen-ai-chatbot workflow
  6. Deploy using default settings. Create the default IAM role suited for development environments only.
  7. Deployment stages all pass until the CDKDeployAction fails with the "no space left on device" error.

Expected behavior

All stages should complete successfully.

Version information

N/A

aggagen commented 1 week ago

Hey, thanks for the bug report. It looks like this blueprint has an issue where the cdk deploy action in its deployment workflow runs out of storage space when it runs in a CodeCatalyst space on the free billing tier. A workaround is to upgrade your space to the standard or enterprise tier.

We're still investigating the free tier issue.
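
In the meantime, it may help to confirm how much disk space the build environment actually has before the image build starts. Below is a minimal sketch, not part of the blueprint, assuming you can run a Python step ahead of the deploy action. The function names and the 10 GiB threshold are hypothetical and chosen only for illustration, and the path to check is also an assumption (Docker's default data directory on Linux is usually /var/lib/docker, but the workflow image may differ).

```python
import shutil


def free_gib(path: str = "/") -> float:
    """Return the free space, in GiB, on the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / (1024 ** 3)


def has_headroom(path: str = "/", required_gib: float = 10.0) -> bool:
    """Return True if at least `required_gib` GiB is free at `path`.

    The 10 GiB default is an illustrative guess, not a measured
    requirement of the gen-ai-chatbot image build.
    """
    return free_gib(path) >= required_gib


if __name__ == "__main__":
    # "/" is an assumption; point this at wherever Docker stores layers
    # in your build environment if that is a separate filesystem.
    print(f"free space on /: {free_gib('/'):.1f} GiB")
    if not has_headroom("/"):
        print("warning: the image build may fail with 'no space left on device'")
```

Running this right before the deploy step would show whether the free-tier environment is already close to full before the torch layers are unpacked.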