Closed brodygov closed 6 years ago
Following up here with discussion from the PR:
This issue is actually more like a bug/wart in boto2, which doesn't set the AllowReassociation parameter and doesn't provide any way to explicitly set that parameter to false. Without it, AWS defaults to true, allowing an EIP to be reassociated even if it is currently associated with another instance.
https://github.com/ab/boto/commit/239013bfef349763ac9ef73cc67bba0e0eb4adda is a start at fixing this for boto 2.
I think the billing issue I said could theoretically exist in the PR thread does already exist.
Imagine two instances fire up due to a scaling event and both have a cron to check they have an Elastic IP every minute. Once booted they will both race for the same Elastic IP as shown in @brodygov's log.
With an EC2-Classic account (pre-2014) only one instance will manage to associate and the other will error. This will incur a $0.10 charge. In the second minute the second instance will associate and incur another $0.10 charge. Total charges are $0.20.
With an EC2-VPC-only account (2014+) both instances could successfully associate in the first minute and incur $0.20 of charges. In the second minute the instance that was stolen from will associate again and incur another $0.10 charge. Total charges are $0.30.
Not a big difference but it escalates quickly. If the bad luck was repeated with 10 instances it would take 10 minutes to properly associate every instance and the difference by then is $1.00 vs $5.50.
I would definitely upgrade to boto 3 and explicitly set AllowReassociation to False for everyone.
If requested features that would require reassociation (e.g. #13 or #19) are ever implemented then they should be done so carefully.
As this issue affected my use case, I went ahead and made a PR - If #13 or #19 are ever implemented, the implementers could add an argument for AllowReassociation.
The above PR is merged and released with 0.8.0. Please reopen this issue if needed.
Thanks @wilerson for the PR.
The EC2 API for associating EIPs appears to be eventually consistent. So if two servers attempt to grab the same EIP at the same time, one of them will succeed and the other will fail but think it succeeded.
Example logs from this happening, with the IP and instance IDs redacted.
In reality,
i-8d7ed1
won the race and has the EIP associated, whilei-a77fbf
thinks it got the EIP but has no EIP associated.One way to fix this would be to loop doing the describe-addresses call until the association with the current instance ID appears. Would you be interested in a PR that does this?