diux-dev / cluster

train on AWS

Change Spot Interruption strategy to Stop #27

Open yaroslavvb opened 6 years ago

yaroslavvb commented 6 years ago

Using the settings from https://github.com/diux-dev/cluster/commit/1c113ee8531471b2ffd5bd731f95cc2b925fa517, the request fails with:

botocore.exceptions.ClientError: An error occurred (InvalidParameterCombination) when calling the RequestSpotInstances operation: The request with type 'null' is not supported when instanceInterruptionBehavior is set to 'stop'.

However, the AWS docs suggest this is possible with some extra settings: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/spot-interruptions.html

cc @bearpelican

yaroslavvb commented 6 years ago

The above happens because the action taken on a Spot Instance interruption depends on both the request type (one-time or persistent) and the interruption behavior (hibernate, stop, or terminate).

A persistent Spot Instance request remains active until it expires or you cancel it, even after the request is fulfilled. If the Spot price exceeds your maximum price or capacity is not available, your Spot Instance is interrupted. Once your maximum price exceeds the Spot price again, or capacity becomes available again, one of the following happens: the Spot Instance is started (if it was stopped), the Spot Instance is resumed (if it was hibernated), or the Spot Instance request is reopened and Amazon EC2 launches a new Spot Instance (if it was terminated).

In summary, setting InstanceInterruptionBehavior to 'stop' takes effect only when the Spot Instance request is persistent.
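
A minimal sketch of what the fix might look like, assuming the request goes through boto3's `request_spot_instances`. The helper names `build_spot_request` and `request_spot`, and the AMI ID and instance type in the usage note, are hypothetical and not taken from this repo. The point is to set `Type='persistent'` whenever the interruption behavior is `'stop'`, and to fail early otherwise instead of waiting for the API to reject the call:

```python
def build_spot_request(image_id, instance_type,
                       interruption_behavior="stop", request_type=None):
    """Build keyword arguments for EC2 RequestSpotInstances.

    Hypothetical helper: picks a persistent request when the
    interruption behavior is 'stop', since a 'stop' behavior is
    rejected for non-persistent requests.
    """
    if request_type is None:
        # Default to persistent for 'stop', one-time otherwise.
        request_type = "persistent" if interruption_behavior == "stop" else "one-time"

    if interruption_behavior == "stop" and request_type != "persistent":
        # Mirror the InvalidParameterCombination error locally.
        raise ValueError(
            "InstanceInterruptionBehavior='stop' requires Type='persistent'"
        )

    return {
        "InstanceCount": 1,
        "Type": request_type,
        "InstanceInterruptionBehavior": interruption_behavior,
        "LaunchSpecification": {
            "ImageId": image_id,
            "InstanceType": instance_type,
        },
    }


def request_spot(ec2_client, **kwargs):
    """Send the request using a boto3 EC2 client passed in by the caller."""
    return ec2_client.request_spot_instances(**build_spot_request(**kwargs))
```

Usage would be roughly `request_spot(boto3.client("ec2"), image_id="ami-00000000", instance_type="p3.2xlarge")`, with placeholder values. Note that a persistent request stays open until it is cancelled, so cleanup code would need to cancel the Spot request as well as terminate the instance.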