Closed by hackdna 8 years ago
When I have seen this before, it was an instance limit on the ec2 side. Can you confirm your c3.8xlarge account limit is high enough to launch an extra?
It turns out this is due to a very low EC2 instance limit on my account. I tried adding one more c3.8xlarge instance separately but got this error message in the CM log:
2016-03-15 15:43:33,344 ERROR connection:1202 400 Bad Request
2016-03-15 15:43:33,344 ERROR connection:1203 <?xml version="1.0" encoding="UTF-8"?>
<Response><Errors><Error><Code>InstanceLimitExceeded</Code><Message>You have requested more instances (2) than your current instance limit of 1 allows for the specified instance type. Please visit http://aws.amazon.com/contact-us/ec2-request to request an adjustment to this limit.</Message></Error></Errors><RequestID>247ef646-8144-4e37-8fe5-a4a086bf3eb1</RequestID></Response>
2016-03-15 15:43:33,344 ERROR ec2:514 EC2 response error when starting worker nodes: EC2ResponseError: 400 Bad Request
<?xml version="1.0" encoding="UTF-8"?>
<Response><Errors><Error><Code>InstanceLimitExceeded</Code><Message>You have requested more instances (2) than your current instance limit of 1 allows for the specified instance type. Please visit http://aws.amazon.com/contact-us/ec2-request to request an adjustment to this limit.</Message></Error></Errors><RequestID>247ef646-8144-4e37-8fe5-a4a086bf3eb1</RequestID></Response>
It would be great if this were indicated in the web UI during the initial attempt.
Yep, that error should be represented better in the UI.
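For what it's worth, here is a minimal sketch of how the useful part of that response could be pulled out before it is shown anywhere. It only assumes the XML layout visible in the log above (`Response/Errors/Error` with `Code` and `Message`); the function name `summarize_ec2_error` is made up for illustration and is not part of CloudMan or boto.

```python
import xml.etree.ElementTree as ET


def summarize_ec2_error(body):
    """Return 'Code: Message' from an EC2 XML error body.

    Hypothetical helper: falls back to the raw text when the body is not
    XML or does not have the Response/Errors/Error layout seen in the log.
    """
    # Parse from bytes so the encoding declaration in the body is honored.
    data = body.encode("utf-8") if isinstance(body, str) else body
    try:
        root = ET.fromstring(data)
    except ET.ParseError:
        return body.strip()
    error = root.find("./Errors/Error")
    if error is None:
        return body.strip()
    code = error.findtext("Code", default="UnknownError")
    message = error.findtext("Message", default="")
    return "%s: %s" % (code, message)
```

Fed the body from the log above, this would give a single line starting with `InstanceLimitExceeded:` followed by the AWS message, which is short enough for a log entry or a UI message.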
Also, nodes that have failed to launch are impossible to remove from the web UI using "Remove worker nodes":
They'll disappear automatically after a few minutes (occasionally, a page refresh is necessary).
OK, thanks. They were in that "blue" state for almost an hour but I didn't try to reload the page and I've just terminated the cluster.
There's no API for querying resource limits for the time being (https://forums.aws.amazon.com/thread.jspa?messageID=709583), so the best we could do is propagate the error message to the popup message, but that seems like a bit of overkill. An error message in the info log might be enough?
Showing the exact error message in the info log would be great.
Actually, I'm not sure that wasn't the case already, as all the info, error, or critical log messages should be getting included in the info log. I've cleaned it up now so only the message body gets shown, making it at least easier to read.
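To illustrate the idea (this is not the actual CloudMan code, just a sketch with Python's standard `logging` module), a handler that keeps info-and-above records and formats them as the bare message body could look roughly like this. The class name `InfoLogHandler` and the logger name are assumptions made up for the example.

```python
import logging


class InfoLogHandler(logging.Handler):
    """Hypothetical stand-in for the UI's info log: keeps message bodies only."""

    def __init__(self):
        # Accept INFO, ERROR and CRITICAL (and WARNING); drop DEBUG.
        super().__init__(level=logging.INFO)
        # Show just the message body, without timestamp, level or module.
        self.setFormatter(logging.Formatter("%(message)s"))
        self.lines = []

    def emit(self, record):
        self.lines.append(self.format(record))


log = logging.getLogger("cm_info_log_sketch")
log.setLevel(logging.DEBUG)
log.propagate = False  # keep the example's output out of the root logger
info_log = InfoLogHandler()
log.addHandler(info_log)

log.debug("polling worker state")  # filtered out by the handler level
log.info("Adding 2 worker nodes")
log.error("InstanceLimitExceeded: requested more instances than the limit allows")
```

After this runs, `info_log.lines` holds only the two message bodies from the `info` and `error` calls, which is the kind of output that would be easier to read in the UI.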
I'll close this issue for now and if we run into it again, we can reopen.
Using main cloudman bucket. Requested to add two c3.8xlarge workers using the "Add worker nodes" dialog box but only one was added:
CM log: