aws-samples / aws-batch-runtime-monitoring

Serverless application to monitor an AWS Batch architecture through dashboards.
MIT No Attribution

Job transitions in state other than RUNNABLE are not collected #17

Open adamantike opened 2 years ago

adamantike commented 2 years ago

We have deployed this project to our AWS account, but after triggering AWS Batch jobs, we only get metrics for the RUNNABLE state.

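Below is an example of the CloudWatch error log produced when the state machine runs for a state transition. The execution enters the 'Is Attempted?' choice state and then fails with a States.Runtime error: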

{"id":"10","type":"ChoiceStateEntered","details":{"input":"{\"LastEventType\":\"SUCCEEDED\",\"JobQueue\":\"arn:aws:batch:us-east-1:123456789012:job-queue/prod-data-retrieval\",\"JobName\":\"RetrieveData\",\"Region\":\"us-east-1\",\"LastEventTime\":\"2022-06-10T17:42:54Z\",\"JobId\":\"096ad8ae-83c4-434e-b541-2114ecd19fa3\",\"JobDefinition\":\"arn:aws:batch:us-east-1:123456789012:job-definition/prod-retrieve-data:78\"}","inputDetails":{"truncated":false},"name":"Is Attempted?"},"previous_event_id":"9","event_timestamp":"1654882975216","execution_arn":"arn:aws:states:us-east-1:123456789012:express:JobStatesStateMachineServerless-DBT9CAIdCeAT:c9becabb-d991-4af7-a1ec-85586f8e1ea0_b47f626b-423e-4c9f-9380-14fd92de9c71:57741e9b-a01f-45ea-95ef-ad7e1f78eb72"}

{"id":"11","type":"ExecutionFailed","details":{"cause":"An error occurred while executing the state 'Is Attempted?' (entered at the event id #10). Unable to apply Path transformation to null or empty input.","error":"States.Runtime"},"previous_event_id":"0","event_timestamp":"1654882975216","execution_arn":"arn:aws:states:us-east-1:123456789012:express:JobStatesStateMachineServerless-DLU9BBIkCzEL:c9becabb-d991-4af7-a1ec-85586f8e1ea0_b47f626b-423e-4c9f-9380-14fd92de9c71:57741e9b-a01f-45ea-95ef-ad7e1f78eb72"}

It's worth mentioning that we only use Fargate compute environments, so we haven't tested whether state transitions work correctly for EC2-based ones.
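For anyone debugging the same failure, here is a minimal sketch of how the logged state input could be inspected. The "Unable to apply Path transformation to null or empty input" message suggests that the 'Is Attempted?' choice state references a path that is absent from its input. The sketch decodes the `details.input` string from the ChoiceStateEntered event above and checks which paths resolve. Note that `$.ContainerInstanceArn` is a hypothetical path used purely for illustration; it is not taken from this project's state machine definition, and `resolve` is a helper written for this example that only handles simple dotted paths.

```python
import json

# The state input copied from the ChoiceStateEntered event in the log above.
# In the real log it is a JSON string nested inside details.input.
event = {
    "type": "ChoiceStateEntered",
    "details": {
        "name": "Is Attempted?",
        "input": json.dumps({
            "LastEventType": "SUCCEEDED",
            "JobQueue": "arn:aws:batch:us-east-1:123456789012:job-queue/prod-data-retrieval",
            "JobName": "RetrieveData",
            "Region": "us-east-1",
            "LastEventTime": "2022-06-10T17:42:54Z",
            "JobId": "096ad8ae-83c4-434e-b541-2114ecd19fa3",
            "JobDefinition": "arn:aws:batch:us-east-1:123456789012:job-definition/prod-retrieve-data:78",
        }),
    },
}


def resolve(doc, path):
    """Resolve a simple dotted path such as '$.A.B'. Returns (found, value)."""
    keys = path[2:] if path.startswith("$.") else path
    node = doc
    for key in keys.split("."):
        if not isinstance(node, dict) or key not in node:
            return False, None
        node = node[key]
    return True, node


state_input = json.loads(event["details"]["input"])
print("Keys present:", sorted(state_input))

# '$.ContainerInstanceArn' is a HYPOTHETICAL path, chosen only to show
# what a reference to a missing field looks like.
for path in ("$.LastEventType", "$.ContainerInstanceArn"):
    found, value = resolve(state_input, path)
    print(path, "->", value if found else "MISSING")
```

Comparing the keys printed here against the variables referenced by the choice state in the deployed state machine definition should show which field is missing for Fargate jobs.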

adamantike commented 2 years ago

@perifaws, based on your reaction, please let me know if I can help by providing more context/debugging info for this issue to be solved.

I haven't had the chance to dig deeper, but I would really like this project to support Fargate workloads so that it's useful to a bigger audience!

perifaws commented 1 year ago

Hey @adamantike, Fargate is not supported at the moment. We'll evaluate adding it. Happy to accept contributions on this one.