Closed maxpoulain closed 7 months ago
Thanks for the clear report. I appreciate it.
I did not think to test spot instances as I worked on #362. Apologies.
I would welcome a PR to fix this issue. However, I would not want to re-introduce tagging as a separate step, as you proposed in step (4) above. This recreates the original problem of cluster nodes becoming orphaned if something goes wrong between when they are launched and when they are tagged.
We want tagging to happen as part of the same AWS API call that creates the instances, whether they are spot or not.
Is that not possible with spot instances? i.e. Are tags for spot instance requests not automatically also tags for the spot instances themselves?
Hi,
Ok it's more clear now why you don't use create_tags
anymore.
But when we tried to launch spot instances with TagSpecifications
using request_spot_instances
, the request is tagged but instances doesn't inherit from tags of the request. So it seems, it's not possible to tag spot instances during the same call that creates instances.
Looks like the boto3 method we use to launch spot instances is deprecated, and the solution is to migrate off that method.
We strongly discourage using the RequestSpotInstances API because it is a legacy API with no planned investment.
Oh actually I hadn't seen that ! If we follow AWS recommandation to launch spot instances Which is the best Spot request method to use?, it seems that we need to use create_fleet. I don't know exactly what is the impact of this change.. But regarding our Tag problem, it seems that create_fleet
support instance tagging, so we could have a single AWS API call that launches the instances and tag them as described in the documentation:
TagSpecifications (list) –
The key-value pair for tagging the EC2 Fleet request on creation. For more information, see [Tag your resources](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/Using_Tags.html#tag-resources).
If the fleet type is instant, specify a resource type of fleet to tag the fleet or instance to tag the instances at launch.
If the fleet type is maintain or request, specify a resource type of fleet to tag the fleet. You cannot specify a resource type of instance. To tag instances at launch, specify the tags in a [launch template](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-launch-templates.html#create-launch-template).
I thought create_instances
is the API we need to use, actually. We are already using it to launch on-demand instances. It just needs different parameters to configure it for spot instances. That way we are using a single launch API across Flintrock. (Note that RunInstances
from the AWS docs maps to create_instances
in boto3.)
That was my plan, but I only took a quick look and plan to come back to this later. I think in the future we could migrate from create_instances
to create_fleet
if we want to support launching clusters with different master/slave instance types, but that's not required for now.
If you want to take a crack at this, go ahead. Otherwise, I will try to get to this next week.
Hi !
It seems that there is an issue regarding
TagSpecifications
inMake finding cluster nodes more reliable in the face of launch failures
. As we are interested in this new feature, we tried to make a release in our forked version but we got this issue:If we are not wrong aout this issue, it seems that this issue comes from
TagSpecifications
incommon_launch_specs
that is a parameter ofrequest_spot_instances
and not a key insideLaunchSpecification
dict as you can see here. Also, we can't putResourceType': 'instance'
inclient.request_spot_instances
as explained in the documentation:So, I think, we need to put
ResourceType': 'spot-instances-request'
if we want to tag the spot request instance (not spot instance), if not we can removeTagSpecifications
from spot request instance.Here is a possible solution in
_create_instances
:TagSpecifications
fromcommon_launch_specs
TagSpecifications=[{'ResourceType': 'spot-instances-request', 'Tags': tag_specifications}],
inclient.request_spot_instances
TagSpecifications=[{'ResourceType': 'instance', 'Tags': tag_specifications}],
inec2.create_instances
create_tags
as last step insideif spot_price:
:We have done these modifications in our fork, so if you're okay with that, we can do the same here or if you have maybe a more elegant way to do that it could be interesting.
I hope it's clear and this helps !
Maxime