aws / containers-roadmap

This is the public roadmap for AWS container services (ECS, ECR, Fargate, and EKS).
https://aws.amazon.com/about-aws/whats-new/containers/

[ECS] [request]: de-register from Cloud Map / R53 when instance is draining #473

Open jespersoderlund opened 5 years ago

jespersoderlund commented 5 years ago

Tell us about your request: I want ECS to better handle the DRAINING state for tasks that use service discovery. In our case, draining is initiated by draining the EC2 instances.

Which service(s) is this request for? ECS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard? We run software that uses service discovery and needs to be drained gracefully, the same way draining works behind a load balancer, i.e. the target is removed from load-balanced traffic while it drains.

Currently, ECS does not remove a task from Cloud Map when the task is put into DRAINING. Removing it at that point is the behavior we want, so that we can drain traffic from the instance.
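To make the requested behavior concrete, here is a minimal sketch of what "deregister on DRAINING" could look like if done by hand. It is an illustration only, not something ECS does today and not a recommended workaround: ECS manages these registrations itself, so manual changes may conflict with it. The function name `deregister_draining_tasks` is hypothetical. The `ecs` and `sd` arguments are assumed to behave like boto3 clients for `ecs` and `servicediscovery`. The sketch also assumes that the Cloud Map instance ID equals the ECS task ID, which should be verified for your setup, and it ignores pagination.

```python
def deregister_draining_tasks(ecs, sd, cluster, container_instance_arn, sd_service_id):
    """Deregister Cloud Map instances for tasks on a DRAINING container instance.

    `ecs` and `sd` are expected to behave like boto3 clients for "ecs" and
    "servicediscovery". They are passed in so the logic can be tried with fakes.
    Returns the list of Cloud Map instance IDs that were deregistered.
    """
    described = ecs.describe_container_instances(
        cluster=cluster, containerInstances=[container_instance_arn]
    )
    instances = described["containerInstances"]
    # Do nothing unless the container instance is actually draining.
    if not instances or instances[0]["status"] != "DRAINING":
        return []

    # Tasks still placed on the draining instance (first page only).
    task_arns = ecs.list_tasks(
        cluster=cluster, containerInstance=container_instance_arn
    )["taskArns"]
    # ASSUMPTION: the Cloud Map instance ID is the task ID, i.e. the last
    # path segment of the task ARN.
    task_ids = {arn.rsplit("/", 1)[-1] for arn in task_arns}

    # Instances currently registered in the Cloud Map service (first page only).
    registered = sd.list_instances(ServiceId=sd_service_id)["Instances"]

    removed = []
    for inst in registered:
        if inst["Id"] in task_ids:
            sd.deregister_instance(ServiceId=sd_service_id, InstanceId=inst["Id"])
            removed.append(inst["Id"])
    return removed
```

With real clients this would be called as `deregister_draining_tasks(boto3.client("ecs"), boto3.client("servicediscovery"), ...)`. Even then, clients that have cached DNS answers keep sending traffic until the record TTL expires, so deregistration alone does not give the connection draining a load balancer provides.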

What outcome are you trying to achieve, ultimately, and why is it hard/impossible to do right now? What is the impact of not having this problem solved? The more details you can provide, the better we'll be able to understand and solve the problem. If we cannot gracefully drain the traffic from our task, the clients of that service will see an unnecessarily high rate of connections that just die, possibly mid-request. This forces unnecessary resyncs and checks on whether the request was received at the target end.

Are you currently working around this issue? We have not found a good workaround for this problem.

eedwards-sk commented 5 years ago

Does it get removed from CloudMap/Route53 once the task is no longer running?

jespersoderlund commented 4 years ago

> Does it get removed from CloudMap/Route53 once the task is no longer running?

Yes, it does

shubharao commented 4 years ago

ECS service discovery does expect "intelligent" client libraries or a client side proxy to handle this well, unlike the model where this is a capability of the load balancer. Have you considered using something like AWS App Mesh or a proxy like Envoy on the client side?
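As a rough illustration of what an "intelligent" client might do, the sketch below re-resolves the endpoint list on every attempt and avoids endpoints that have already failed. It is only a sketch under stated assumptions: `call_with_failover`, `resolve`, and `send` are hypothetical names, where `resolve` stands in for whatever lookup the client uses (DNS or a Cloud Map discovery call) and `send` for the actual request. Retrying like this is only safe when the request is idempotent.

```python
import random


def call_with_failover(resolve, send, request, max_attempts=3):
    """Send `request` to one resolved endpoint, retrying on connection errors.

    `resolve()` returns the current list of endpoints; `send(endpoint, request)`
    performs the call and raises ConnectionError if the connection dies.
    Both are hypothetical callables supplied by the caller.
    """
    failed = set()
    last_error = None
    for _ in range(max_attempts):
        # Re-resolve on every attempt so deregistered endpoints drop out,
        # and skip endpoints that already failed during this call.
        candidates = [ep for ep in resolve() if ep not in failed]
        if not candidates:
            break
        endpoint = random.choice(candidates)
        try:
            return send(endpoint, request)
        except ConnectionError as exc:
            failed.add(endpoint)
            last_error = exc
    raise ConnectionError("no healthy endpoint reachable") from last_error
```

This does not remove the window in which a draining task is still listed. It only limits the damage by moving on to another endpoint, which is the kind of logic a client-side proxy would otherwise provide.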

tarunwadhwa13 commented 4 years ago

It seems that even App Mesh is unable to handle this currently. We have reduced the TTL for the entry but are still not able to reduce the errors. We are using gRPC in production and have seen stream errors even when the container goes down gracefully.

https://github.com/aws/aws-app-mesh-roadmap/issues/213

Paritosh-Anand commented 4 years ago

Is this a solvable problem? I would be glad if someone could share details on how this needs to be handled. I am also curious whether this will be solved on the App Mesh side or on the ECS service discovery side.

As this issue directly affects end users, it seems like a blocker to moving applications behind App Mesh.