aws / amazon-ec2-metadata-mock

A tool to simulate Amazon EC2 instance metadata
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html
Apache License 2.0

Need advice on using AEMM #78

Closed kavicse87 closed 4 years ago

kavicse87 commented 4 years ago

@bwagner5 First of all, thank you for introducing me to AEMM. Please advise.

What do I have available? I had to fix issues with my kops cluster; it is now stable and has been upgraded to a version that supports this strategy. You may be reading some of this again, and I apologize for that.

  1. Spot instances are provisioned as Kubernetes nodes. I used the capacity-optimized spot allocation strategy.
  2. The instance groups (IGs) are attached to the cluster autoscaler.
  3. The aws-node-termination-handler DaemonSet has been installed on the newly created spot nodes. The logs show that it has connected to IMDSv2.
  4. A sample stateless web application is deployed on the nodes and is running.

What is needed?

    It would be very helpful if you could give me the list of commands to connect a spot instance with AEMM.

    1. Should I install AEMM in the same namespace, or is any namespace fine? Should I install it on the non-spot instance that I'm not using for this testing?
    2. What is the command to connect a specific spot instance with AEMM, so that my application is migrated to another available spot instance?
    3. What is the command to interrupt the spot request or instance?
    4. How do I check for the spot interruption notice or any other termination notice?
    5. The end goal is to see logs in the AWS node termination handler for the spot interruption, the draining of the spot node, and the migration of the stateless application to other available nodes before the instance gets terminated, if my understanding is correct.

https://github.com/aws/amazon-ec2-metadata-mock/blob/master/docs/usage.md

kavicse87 commented 4 years ago

Adding to this, I'm a bit confused because it says it's not a real spot interruption; it's a mock spot interruption. Even if it's a mock interruption, the node termination handler will handle it and drain the nodes, right? But how to connect it with the spot node is still a question.

bwagner5 commented 4 years ago

Hi @kavicse87 !

To connect aws-node-termination-handler (NTH) and AEMM together, you'll need to configure NTH to use a different Instance Metadata URL than the default 169.254.169.254 address.
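To illustrate what "a different Instance Metadata URL" means in practice, here is a minimal Python sketch, not NTH's actual code. It assumes the mock serves the same paths as the real service, so only the base URL changes. The token and spot paths and headers are the standard IMDSv2 ones; the `MOCK_IMDS` service name and port are assumptions, so check what your AEMM install actually exposes.

```python
import json
import urllib.request

REAL_IMDS = "http://169.254.169.254"
# Assumption: in-cluster DNS name and port of the AEMM service; verify in your cluster.
MOCK_IMDS = "http://amazon-ec2-metadata-mock-service.default.svc.cluster.local:1338"

TOKEN_PATH = "/latest/api/token"
SPOT_ACTION_PATH = "/latest/meta-data/spot/instance-action"


def metadata_url(base_url: str, path: str) -> str:
    """Join a metadata base URL and a path without doubling the slash."""
    return base_url.rstrip("/") + path


def fetch_token(base_url: str, ttl_seconds: int = 21600, timeout: float = 2.0) -> str:
    """Request an IMDSv2 session token from the given metadata endpoint."""
    req = urllib.request.Request(
        metadata_url(base_url, TOKEN_PATH),
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode()


def fetch_spot_action(base_url: str, token: str, timeout: float = 2.0) -> dict:
    """Read the spot instance-action document; raises HTTPError (404) if none exists."""
    req = urllib.request.Request(
        metadata_url(base_url, SPOT_ACTION_PATH),
        headers={"X-aws-ec2-metadata-token": token},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode())
```

Calling `fetch_spot_action(MOCK_IMDS, fetch_token(MOCK_IMDS))` from a pod in the cluster would read the mock notice, while passing `REAL_IMDS` would query the real service on an EC2 instance.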

Check out the helm readme configuration table for NTH: https://github.com/aws/aws-node-termination-handler/blob/master/config/helm/aws-node-termination-handler/README.md

Recently, @brycahta converted AWS Node Termination Handler's end-to-end tests to use AEMM. You can check out the e2e code that configures NTH to use AEMM rather than the real EC2 Instance Metadata Service here: https://github.com/aws/aws-node-termination-handler/blob/master/test/e2e/spot-interruption-test#L36

AEMM can be deployed on one or multiple nodes in the cluster. It is accessed as a service so it doesn't matter if you have 1 or 100 copies installed on the cluster.

Since you would like to see what happens when the fake interruption is received, you'll need to set up a kubectl log tail on the NTH pod, or, if you have Prometheus configured, you can scrape NTH for metrics on interruptions.
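As a rough sketch of the log-tail option, the Python snippet below wraps `kubectl logs --follow` so the NTH output can be streamed from a script. The namespace and label selector in the usage note are assumptions; use whatever your NTH install actually has.

```python
import subprocess
from typing import Iterator, List


def kubectl_logs_argv(namespace: str, selector: str) -> List[str]:
    """Build the kubectl command that follows logs of pods matching a label selector."""
    return [
        "kubectl", "logs", "--follow",
        "--namespace", namespace,
        "--selector", selector,
    ]


def follow_logs(namespace: str, selector: str) -> Iterator[str]:
    """Yield log lines one at a time until the caller stops iterating or kubectl exits."""
    proc = subprocess.Popen(
        kubectl_logs_argv(namespace, selector),
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        for line in proc.stdout:
            yield line.rstrip("\n")
    finally:
        # Stop the kubectl child process when iteration ends for any reason.
        proc.terminate()
        proc.wait()
```

For example, `for line in follow_logs("kube-system", "app.kubernetes.io/name=aws-node-termination-handler"): print(line)` would stream the handler's logs, assuming NTH runs in `kube-system` with that label.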

To make it a little easier to catch the mock spot interruption in action, you can configure AEMM to delay the spot ITN for a period of time using the mockDelaySec argument. Check out this section of the docs for an example: https://github.com/aws/amazon-ec2-metadata-mock/tree/master/helm/amazon-ec2-metadata-mock#user-content-installing-the-chart-with-overridden-values-for-aemm-configuration
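To watch for the delayed notice from a script, something like the polling loop below could work. It is a sketch that assumes the metadata endpoint answers 404 until a notice exists (as the real service does for the spot instance-action path) and that `fetch` is any callable you supply that performs the request and returns the parsed notice.

```python
import time
import urllib.error
from typing import Callable, Optional


def wait_for_notice(
    fetch: Callable[[], dict],
    timeout_sec: float,
    interval_sec: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[dict]:
    """Poll fetch() until it returns a notice, or return None once the timeout passes.

    A 404 is treated as "no notice yet"; any other HTTP error is re-raised.
    """
    deadline = clock() + timeout_sec
    while True:
        try:
            return fetch()
        except urllib.error.HTTPError as err:
            if err.code != 404:
                raise
        if clock() >= deadline:
            return None
        sleep(interval_sec)
```

With a delay of, say, 120 seconds configured through mockDelaySec, a `timeout_sec` somewhat larger than that leaves room for the notice to show up.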

Hope this helps!

kavicse87 commented 4 years ago

Thanks @bwagner5 let me go through this and let you know..

kavicse87 commented 4 years ago

@bwagner5 Please clarify. If I configure AEMM and connect it with the node termination handler, will the mock spot interruption that AEMM generates be handled by the AWS node termination handler, and will that start draining the spot node so that I can see the pods migrating to the other nodes as expected? This test is pending because the c5d.large spot instance is not getting interrupted, as we couldn't get any spot instance from the spot pool that is >20%. Please advise how I can test this and see the actual functionality of the AWS node termination handler.

kavicse87 commented 4 years ago

@bwagner5 please advise.

bwagner5 commented 4 years ago

@kavicse87 Configuring node termination handler to listen to AEMM rather than the actual EC2 Instance Metadata Service (169.254.169.254) will cause a mock interruption event, which node-termination-handler will handle. The node you are testing on doesn't even need to be a spot instance.

brycahta commented 4 years ago

Hi @kavicse87 -- closing this as it's been stale for over 2 weeks.

Please re-open if you need support. Thanks!