Closed by iandees 7 months ago
I don't see anything in CloudFormation specifying an EBS volume for the Batch compute environment, so I think we're using the default root volume in the default AMI. We could create a special-case AMI for these Batch jobs that has more storage configured, or we could figure out how to attach a second EBS volume just for these jobs. Last I checked, Batch doesn't have great support for this.
I made a new Launch Template (here) that has default storage of 300 GiB instead of 250 GiB.
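For reference, here is a minimal sketch of what the storage part of such a launch template could look like, written as a plain Python dict in the shape EC2 expects for `LaunchTemplateData`. The helper name `launch_template_data`, the device name `/dev/xvda`, and the `gp3` volume type are assumptions for illustration; the actual template linked above may differ.

```python
import json


def launch_template_data(volume_gib, device_name="/dev/xvda", volume_type="gp3"):
    """Build the LaunchTemplateData section with a single enlarged EBS volume.

    device_name and volume_type are assumptions; check which root device
    the AMI used by the compute environment actually exposes.
    """
    if volume_gib <= 0:
        raise ValueError("volume_gib must be positive")
    return {
        "BlockDeviceMappings": [
            {
                "DeviceName": device_name,
                "Ebs": {
                    "VolumeSize": volume_gib,  # size in GiB
                    "VolumeType": volume_type,
                    "DeleteOnTermination": True,
                },
            }
        ]
    }


if __name__ == "__main__":
    # Print the structure for a 300 GiB volume, as described in the comment above.
    print(json.dumps(launch_template_data(300), indent=2))
```

A dict like this could be passed as the `LaunchTemplateData` argument to boto3's `ec2.create_launch_template`, or translated into the equivalent CloudFormation `AWS::EC2::LaunchTemplate` resource.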
I used the Lambda console to manually kick off batch-prod-CollectLambdaScheduleFunction to start the collect process and will see if the extra storage helps.
Whatever is starting the EC2 instance used by the mega batch compute environment is not using the new launch template, so it's ignoring my storage size change.
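One way to narrow this down might be to check which launch template (if any) each compute environment actually references. The sketch below assumes a response shaped like the output of boto3's `batch.describe_compute_environments`; the helper name `launch_templates_in_use` and the sample data (including the template name `example-template`) are hypothetical, not taken from this deployment.

```python
def launch_templates_in_use(response):
    """Map each compute environment name to its launch template reference.

    Returns None for environments that do not specify a launch template,
    which would mean they fall back to the AMI's default volumes.
    """
    result = {}
    for env in response.get("computeEnvironments", []):
        resources = env.get("computeResources", {})
        result[env["computeEnvironmentName"]] = resources.get("launchTemplate")
    return result


# Hypothetical sample in the shape of a describe_compute_environments response.
sample_response = {
    "computeEnvironments": [
        {
            "computeEnvironmentName": "mega",
            "computeResources": {
                "launchTemplate": {
                    "launchTemplateName": "example-template",
                    "version": "1",
                }
            },
        },
        {"computeEnvironmentName": "other", "computeResources": {}},
    ]
}

if __name__ == "__main__":
    for name, template in launch_templates_in_use(sample_response).items():
        print(name, template)
```

With real credentials, the input would come from `boto3.client("batch").describe_compute_environments()`. If the environment turns out to reference a different template or an older version than the one just edited, that could explain why the storage change is not picked up.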
Nick made a change in 1252532842ae78ddf6a665e00ef5e76a597a71f2 that increased the volume size.
The collection script finished and updated collections are flowing again.
Collections haven't updated since mid-March. Looking at the logs, it seems that the collection creation job is failing because the instance running this job runs out of disk space: