Closed kavicse87 closed 4 years ago
Adding to this, I'm a bit confused because it says it's not a real spot interruption; it's a mock spot interruption. Even if it's a mock interruption, node termination handler will handle that interruption and drain the nodes, right? But how do I connect it with the spot node?
Hi @kavicse87 !
To connect aws-node-termination-handler (NTH) and AEMM together, you'll need to configure NTH to use a different Instance Metadata URL than the default 169.254.169.254 address.
Check out the helm readme configuration table for NTH: https://github.com/aws/aws-node-termination-handler/blob/master/config/helm/aws-node-termination-handler/README.md
Recently, @brycahta converted AWS Node Termination Handler's end-to-end tests to use AEMM. You can check out the e2e code that configures NTH to use AEMM rather than the real EC2 Instance Metadata Service here: https://github.com/aws/aws-node-termination-handler/blob/master/test/e2e/spot-interruption-test#L36
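To illustrate what "a different Instance Metadata URL" means in practice, here is a minimal Python sketch of a client that reads the spot interruption notice from a configurable base URL. It is not NTH's code. The AEMM service address below is an assumption about a default in-cluster install and may differ in your cluster, and the sketch uses a plain unauthenticated GET (IMDSv1 style) for brevity.

```python
import json
import urllib.request

# Default EC2 Instance Metadata Service address.
DEFAULT_IMDS_URL = "http://169.254.169.254"
# Assumed in-cluster address of the AEMM service; adjust to your deployment.
AEMM_URL = "http://amazon-ec2-metadata-mock-service.default.svc.cluster.local:1338"
# Metadata path that carries the spot interruption notice.
SPOT_ITN_PATH = "/latest/meta-data/spot/instance-action"


def spot_itn_url(metadata_url=DEFAULT_IMDS_URL):
    """Build the spot interruption notice URL for a given metadata base URL."""
    return metadata_url.rstrip("/") + SPOT_ITN_PATH


def parse_instance_action(body):
    """Extract (action, time) from an instance-action JSON document."""
    doc = json.loads(body)
    return doc["action"], doc["time"]


def fetch_instance_action(metadata_url, timeout=2.0):
    """GET the notice from the real or mock metadata service and parse it."""
    with urllib.request.urlopen(spot_itn_url(metadata_url), timeout=timeout) as resp:
        return parse_instance_action(resp.read().decode("utf-8"))
```

Calling `fetch_instance_action(AEMM_URL)` from a pod in the cluster would read the mock notice, while `fetch_instance_action(DEFAULT_IMDS_URL)` would talk to the real service. Only the base URL changes, which is the same switch NTH needs.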
AEMM can be deployed on one or multiple nodes in the cluster. It is accessed as a service so it doesn't matter if you have 1 or 100 copies installed on the cluster.
Since you would like to see what happens when the fake interruption is received, you'll need to set up a kubectl log tail on the NTH pod, or if you have Prometheus configured, you can scrape NTH for metrics on interruptions.
To make it a little easier to catch the mock spot interruption in action, you can configure AEMM to delay the spot ITN for a period of time using the mockDelaySec argument. Check out this section of the docs for an example: https://github.com/aws/amazon-ec2-metadata-mock/tree/master/helm/amazon-ec2-metadata-mock#user-content-installing-the-chart-with-overridden-values-for-aemm-configuration
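As a rough illustration of how a delayed notice looks from the consumer's side, here is a small Python polling sketch. It assumes the mock endpoint yields nothing (for example, a "not found" response) until the configured delay has passed. The `fetch` callable and the function name are hypothetical and are not part of AEMM or NTH.

```python
import time


def wait_for_interruption(fetch, poll_interval=2.0, max_polls=30, sleep=time.sleep):
    """Poll fetch() until it returns a notice.

    fetch is a hypothetical callable that returns None while no notice is
    available yet and a notice object once the mock delay has elapsed.
    Returns (attempt_number, notice), or None if max_polls is exhausted.
    """
    for attempt in range(1, max_polls + 1):
        notice = fetch()
        if notice is not None:
            return attempt, notice
        sleep(poll_interval)
    return None


# Simulate a mock that stays silent for two polls, then emits the notice.
responses = iter([None, None, {"action": "terminate"}])
print(wait_for_interruption(lambda: next(responses), sleep=lambda _: None))
```

The delay gives you time to start tailing the NTH logs before the notice appears.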
Hope this helps!
Thanks @bwagner5 let me go through this and let you know..
@bwagner5 Please clarify: if I configure AEMM and connect it with node termination handler, will the mock spot interruption that AEMM generates be handled by AWS node termination handler, and will that start the draining process on the spot node so I can see the pods migrating to the other nodes as expected? This test has been pending because the c5d.large spot instance is not getting interrupted, and we couldn't get any spot instance from the spot pool that is >20%. Please advise how I can test this and see the actual functionality of the AWS node termination handler.
@bwagner5 please advise.
@kavicse87 Configuring node termination handler to listen to AEMM rather than the actual EC2 Instance Metadata (169.254.169.254) will cause a mock interruption event, which node-termination-handler will handle. The node you are testing on does not even need to be a spot instance.
Hi @kavicse87 -- closing this as it's been stale for over 2 weeks.
Please re-open if you need support, Thanks!
@bwagner5 First of all, thank you for introducing me to AEMM. Please advise.
What is available with me? I had to fix my kops cluster issues; it is now stable and has been upgraded to a version that supports this strategy. You may be reading this again, and I apologize for that.
A sample stateless web application is deployed on the nodes and is running.
What is needed?
It would be very helpful if you could give me the list of commands to connect the spot instance with AEMM.
https://github.com/aws/amazon-ec2-metadata-mock/blob/master/docs/usage.md