TotallyGatsby / DroneYard

AWS Batch based automation for OpenDroneMap.
GNU General Public License v3.0

AWS Batch jobs stuck in runnable state, compute environment becomes invalid #9

Open egs40 opened 1 month ago

egs40 commented 1 month ago

Thank you for your work on this project. I've encountered an issue while deploying the code following the instructions:

  1. Updated package.json to use "sst": "^2.41.4"
  2. Deployed using r7gd.4xlarge instances in the eu-west-1 region
  3. Uploaded images as instructed and started the workflow

Problem:

  - The AWS Batch dashboard shows a valid and enabled compute environment
  - Jobs enter a runnable state but don't progress
  - Jobs remain in runnable status indefinitely
  - The compute environment eventually becomes invalid

I received a notification stating that all EC2 instances in the Batch compute environment were scaled down due to a misconfiguration preventing them from joining the ECS Cluster. The notification suggests reviewing and updating/recreating the compute environment configuration, mentioning possible issues such as:

Any insights on what might be causing this issue would be appreciated.

TotallyGatsby commented 1 month ago

It's very likely that the resources I set up for this project have gotten too old. I need to get new footage of my property in the next month or so, so I'll take a look at what the problem is when I do.

egs40 commented 1 month ago

Thanks, I appreciate it.

AlexCarusoFan4 commented 1 day ago

Hi there,

I recently ran into the same issue with my DroneYard solution (https://github.com/AlexCarusoFan4/WinyamaDroneYard).

It looks like the instances are launched but never register with the ECS cluster. I tried using the latest ECS-optimised Amazon Linux AMI and completely re-deploying my solution, but neither of these worked.

Today I refactored the solution to use the Batch constructs in the latest aws-cdk-lib, rather than relying on the alpha package, and imagery processing jobs are now running successfully again.

Would definitely recommend giving that a try - hopefully does the trick for you.
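
For anyone attempting the same refactor, here is a rough sketch of a Batch compute environment and job queue defined with the stable `aws-cdk-lib/aws-batch` module. It is not the code from either repository. The stack name, construct IDs, VPC settings and vCPU limits are placeholder assumptions, and the instance type is simply the one mentioned in the original report. It also assumes a recent `aws-cdk-lib` v2 release in which Batch has graduated out of the alpha package.

```typescript
import * as cdk from 'aws-cdk-lib';
import * as batch from 'aws-cdk-lib/aws-batch';
import * as ec2 from 'aws-cdk-lib/aws-ec2';

// Hypothetical app and stack names, for illustration only.
const app = new cdk.App();
const stack = new cdk.Stack(app, 'ExampleDroneBatchStack');

// Placeholder VPC; a real deployment may reuse an existing one.
const vpc = new ec2.Vpc(stack, 'Vpc', { maxAzs: 2 });

// Managed EC2 compute environment from the stable Batch module.
const computeEnv = new batch.ManagedEc2EcsComputeEnvironment(stack, 'ComputeEnv', {
  vpc,
  // Instance type taken from the original report (Graviton, arm64).
  instanceTypes: [new ec2.InstanceType('r7gd.4xlarge')],
  // Do not add Batch's "optimal" instance classes alongside the explicit type.
  useOptimalInstanceClasses: false,
  minvCpus: 0,  // allow scaling to zero when idle (assumed value)
  maxvCpus: 16, // one r7gd.4xlarge worth of vCPUs (assumed value)
});

// Job queue that submits work to the compute environment above.
new batch.JobQueue(stack, 'JobQueue', {
  computeEnvironments: [{ computeEnvironment: computeEnv, order: 1 }],
});

app.synth();
```

Running `cdk synth` on something like this and comparing the generated `AWS::Batch::ComputeEnvironment` resource with the one from the old deployment may help show which setting differs. The container image used by the job definition also has to match the instance architecture.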