department-of-veterans-affairs / notification-api

Notification API
MIT License

Workaround - Graceful Shutdown of ECS Tasks #1266

Open ldraney opened 1 year ago

ldraney commented 1 year ago

User Story - Business Need

Minimize the risk of data loss and ensure the continuity of the application's services during task termination. With a deregistration delay of 25 seconds, the currently running Fargate tasks should no longer be receiving new work by the time they receive the SIGKILL, so we would effectively stop terminating tasks in the middle of execution. This does not gracefully shut down the running tasks. The end goal is a more reliable and stable user experience, which will ultimately improve customer satisfaction and trust in the application.
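As a rough illustration of the setting described above, here is a minimal sketch of applying a 25-second deregistration delay to a load balancer target group. It assumes the tasks sit behind an Elastic Load Balancing target group and that the change is made through boto3; the actual infrastructure may well be managed elsewhere (Terraform, for example). The helper names and the target group ARN are hypothetical. `deregistration_delay.timeout_seconds` is the real target group attribute key, and `modify_target_group_attributes` is the real `elbv2` client call.

```python
def deregistration_delay_attributes(seconds: int) -> list:
    """Build the target group attribute payload for a deregistration delay.

    Hypothetical helper. Elastic Load Balancing accepts 0-3600 seconds
    for this attribute, and attribute values are passed as strings.
    """
    if not 0 <= seconds <= 3600:
        raise ValueError("deregistration delay must be between 0 and 3600 seconds")
    return [{"Key": "deregistration_delay.timeout_seconds", "Value": str(seconds)}]


def apply_deregistration_delay(elbv2_client, target_group_arn: str, seconds: int = 25):
    """Apply the delay using an existing elbv2 client.

    The client is passed in (for example boto3.client("elbv2")) so this
    sketch has no import-time dependency on boto3 or AWS credentials.
    """
    return elbv2_client.modify_target_group_attributes(
        TargetGroupArn=target_group_arn,
        Attributes=deregistration_delay_attributes(seconds),
    )
```

Usage would look something like `apply_deregistration_delay(boto3.client("elbv2"), "<target group ARN>", 25)`. The delay only controls how long the load balancer keeps a draining target around for in-flight requests; it does not make the task itself shut down gracefully.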

User Story(ies)

As a DevOps Engineer
I want tasks to not receive work when they shouldn't
So that we avoid losing data when tasks are terminated

Additional Info and Resources

Engineering Checklist

Acceptance Criteria

Additional Notes

All the documentation we found covers ECS in general, but we run Fargate within ECS. The first step should be confirming that we have this level of control on Fargate, and understanding how Fargate shuts tasks down. The graceful shutdown doc was specific to ECS, so we need to make sure it applies to Fargate as well.
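For context on the shutdown sequence: when ECS stops a task, the container first receives a stop signal (SIGTERM by default) and is only sent SIGKILL once the container's `stopTimeout` has elapsed. The following is a minimal sketch, not the project's implementation, of how a Python worker could notice SIGTERM and stop picking up new work. The names `install_sigterm_handler`, `run_worker`, `get_next_job`, and `process_job` are hypothetical.

```python
import signal
import threading

# Set once SIGTERM arrives; checked between jobs.
shutdown_requested = threading.Event()


def _handle_sigterm(signum, frame):
    """Record that a shutdown was requested; do no heavy work in the handler."""
    shutdown_requested.set()


def install_sigterm_handler():
    """Register the handler. Must be called from the main thread."""
    signal.signal(signal.SIGTERM, _handle_sigterm)


def run_worker(get_next_job, process_job):
    """Process jobs until none remain or a shutdown has been requested.

    The job in progress is allowed to finish; no new job is started
    after SIGTERM. Returns the number of jobs processed.
    """
    processed = 0
    while not shutdown_requested.is_set():
        job = get_next_job()
        if job is None:
            break
        process_job(job)
        processed += 1
    return processed
```

Whether this pattern applies here depends on confirming that Fargate delivers the stop signal to our containers the same way, which is exactly the open question in this note.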

QA Considerations

For QA to populate. Leave blank if QA is not applicable on this ticket.

Out of Scope

cris-oddball commented 1 year ago

@ldraney @k-macmillan How will we know these criteria have been met?

Deregistration delay is set up and we do not see new requests on those deregistered tasks
The new tasks should start receiving requests before the old tasks are terminated

Are there specific log files we should not be seeing anymore?
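One possible way to check the first criterion is to compare request timestamps against the time each task began deregistering. This is a minimal sketch under stated assumptions: the log shape (pairs of task ID and request timestamp) and the `deregistered_at` mapping are hypothetical, and would have to be derived from whatever request logs and ECS or load balancer events are actually available.

```python
from datetime import datetime


def requests_after_deregistration(request_log, deregistered_at):
    """Return requests that reached a task after it began deregistering.

    request_log: iterable of (task_id, request_started_at) tuples (assumed shape).
    deregistered_at: dict mapping task_id to the datetime deregistration began.
    An empty result is consistent with the acceptance criterion.
    """
    late = []
    for task_id, started_at in request_log:
        cutoff = deregistered_at.get(task_id)
        # Tasks that never deregistered have no cutoff and are ignored.
        if cutoff is not None and started_at > cutoff:
            late.append((task_id, started_at))
    return late
```

Requests that started before the cutoff and finished during the delay window are expected and are not flagged.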