aws-samples / aws-eda-slurm-cluster

AWS Slurm Cluster for EDA Workloads
MIT No Attribution

[BUG] Stack update fails because json payload too large #179

Closed by cartalla 6 months ago

cartalla commented 7 months ago

Describe the bug

When the ParallelCluster config gets large, it can be too large to be passed as JSON to a Lambda. This seems to affect 3.6 and 3.7, but not 3.8. The JSON has to be passed because it contains CloudFormation anchors that must be substituted during stack deployment; the S3 objects are then written inside the Lambda based on the resolved values in the JSON. I'm not sure how to resolve this easily other than to save the unresolved config to S3 and pass the resolved anchor values as parameters that get substituted in the Lambda.

cartalla commented 6 months ago

I solved this with the build configuration files by uploading Jinja2 templates to S3 and passing the tokens as parameters to a Lambda. The Lambda then renders the templates and saves the results back to S3. This has the additional advantage that I can easily debug the Lambda without having to create a test event containing the JSON payload.
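
For anyone hitting the same limit, here is a minimal sketch of that pattern. It is not the code from this repository: the function name `render_template_to_s3` and the event keys `Bucket`, `TemplateKey`, `OutputKey`, and `Tokens` are hypothetical and chosen only for illustration. It assumes the `jinja2` package is bundled with the function and that the resolved values arrive as a small dictionary of parameters rather than as the full config.

```python
import jinja2


def render_template_to_s3(s3_client, bucket, template_key, output_key, tokens):
    """Fetch a Jinja2 template from S3, render it, and write the result back.

    s3_client is anything with boto3-style get_object/put_object methods.
    tokens is a dict of resolved values to substitute into the template.
    """
    response = s3_client.get_object(Bucket=bucket, Key=template_key)
    template_text = response["Body"].read().decode("utf-8")

    # StrictUndefined makes a missing token raise an error instead of
    # silently rendering as an empty string.
    env = jinja2.Environment(undefined=jinja2.StrictUndefined)
    rendered = env.from_string(template_text).render(**tokens)

    s3_client.put_object(
        Bucket=bucket, Key=output_key, Body=rendered.encode("utf-8")
    )
    return rendered


def lambda_handler(event, context):
    # Hypothetical event shape: only the S3 locations and the resolved
    # tokens are passed in, so the payload stays small however large
    # the template grows.
    import boto3  # imported here so the render function can be tested offline

    render_template_to_s3(
        boto3.client("s3"),
        event["Bucket"],
        event["TemplateKey"],
        event["OutputKey"],
        event["Tokens"],
    )
    return {"status": "rendered", "key": event["OutputKey"]}
```

Because the S3 client is passed in as an argument, the render step can be exercised locally with a small stand-in object, which matches the debugging advantage mentioned above.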