spotty-cloud / spotty

Training deep learning models on AWS and GCP instances
https://spotty.cloud
MIT License

Stack "spotty-instance-tacotron-i1" was not created. #61

Closed nosound2 closed 4 years ago

nosound2 commented 4 years ago

I tried to run the Tacotron example following the Medium article, but got an error:

Creating IAM role for the instance...
Preparing CloudFormation template...
  - volume "tacotron-i1-workspace" will be created
  - volume "tacotron-i1-docker" will be created
  - availability zone: auto
  - maximum Spot Instance price: on-demand
  - AMI: "Deep Learning Base AMI (Ubuntu) Version 19.2" (ami-07ab02efef2ffe373)
  - Docker data will be stored on the "docker" volume

Volumes:
+-----------+---------------+------------+-----------------+
| Name      | Container Dir | Type       | Deletion Policy |
+===========+===============+============+=================+
| workspace | /workspace    | EBS volume | Retain Volume   |
+-----------+---------------+------------+-----------------+
| docker    | -             | EBS volume | Retain Volume   |
+-----------+---------------+------------+-----------------+

Waiting for the stack to be created...
  - launching the instance...
Error:
------
Stack "spotty-instance-tacotron-i1" was not created.
Please, see CloudFormation logs for the details.

Where do I find CloudFormation logs?

I am using Windows 10 with Git bash (MinGW64).

Another question: I cloned the Tacotron repo but had to rename Dockerfile to Dockerfile.spotty. Why is it not named that way in the repo? I am worried that I am doing something wrong.

nosound2 commented 4 years ago

I found a closed issue that explains where to find the CloudFormation logs. This is what they show:

Max spot instance count exceeded (Service: AmazonEC2; Status Code: 400;
Error Code: MaxSpotInstanceCountExceeded;
Request ID: 0425df51-b33d-4e56-9441-a4a9ad5bc338)

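For readers who prefer not to dig through the console, the failure reason can also be read from the stack events with boto3. The following is only a sketch: the helper name `failed_stack_events` is made up for illustration, and it assumes the stack still exists (after a rolled-back and deleted stack, the stack ID would have to be passed instead of the name).

```python
def failed_stack_events(cfn_client, stack_name):
    """Return (resource, status, reason) for every failed event of a stack.

    cfn_client is expected to be a boto3 CloudFormation client, e.g.
    boto3.client("cloudformation", region_name="<your region>").
    """
    response = cfn_client.describe_stack_events(StackName=stack_name)
    return [
        (
            event["LogicalResourceId"],
            event["ResourceStatus"],
            # The reason is optional in the API response.
            event.get("ResourceStatusReason", ""),
        )
        for event in response["StackEvents"]
        if event["ResourceStatus"].endswith("_FAILED")
    ]
```

Called with the stack name from the error message ("spotty-instance-tacotron-i1"), it should surface messages like the MaxSpotInstanceCountExceeded one above. Only the first page of events is read here.
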
nosound2 commented 4 years ago

I had forgotten to create requirements-spotty.txt (again, I am puzzled why it is not already in the docker directory). Now I get a different error:

There is no Spot capacity available that matches your request. 
(Service: AmazonEC2; Status Code: 500; Error Code: InsufficientInstanceCapacity)

and if I add onDemandInstance: true, I get:

You have requested more vCPU capacity than your current vCPU limit of 0 allows for the instance bucket that the specified instance type belongs to. 
Please visit http://aws.amazon.com/contact-us/ec2-request to request an adjustment to this limit. 
(Service: AmazonEC2; Status Code: 400; Error Code: VcpuLimitExceeded)

I am not sure I am on the right path; requesting a limit adjustment seems wrong here. Can you please help me with what to do?

apls777 commented 4 years ago

> I cloned the tacotron repo but needed to rename Dockerfile to Dockerfile.spotty

> I forgot to create requirements-spotty.txt (again I am puzzled why it is not in the docker directory already?), now it is a different error

Rayhane-mamah/Tacotron-2 is not my repo, and it doesn't contain the changes needed for training with Spotty. The point of that article was to show how to train an arbitrary model with Spotty.

> There is no Spot capacity available that matches your request.

You can try another region where Spot Instances are available. To check Spot Instance prices in a particular region, use the spotty aws spot-prices command:

$ spotty aws spot-prices -i p3.2xlarge -r eu-west-1
Getting spot instance prices for "p3.2xlarge"...

Price  Zone
0.9915 eu-west-1b
0.9915 eu-west-1c
3.3050 eu-west-1a
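A similar per-zone listing can be produced directly with boto3, which may help when comparing several regions in a script. This is a rough sketch and not how Spotty itself is known to do it; `latest_spot_prices` is a hypothetical helper, and "Linux/UNIX" is assumed as the product description. Note that a low price does not guarantee that capacity is actually available.

```python
from datetime import datetime, timezone


def latest_spot_prices(ec2_client, instance_type):
    """Return [(price, zone), ...] sorted by price, newest record per zone.

    ec2_client is expected to be a boto3 EC2 client, e.g.
    boto3.client("ec2", region_name="eu-west-1").
    """
    response = ec2_client.describe_spot_price_history(
        InstanceTypes=[instance_type],
        ProductDescriptions=["Linux/UNIX"],  # assumption: Linux instances
        # Starting at "now" returns the prices currently in effect.
        StartTime=datetime.now(timezone.utc),
    )
    newest = {}
    for record in response["SpotPriceHistory"]:
        zone = record["AvailabilityZone"]
        if zone not in newest or record["Timestamp"] > newest[zone]["Timestamp"]:
            newest[zone] = record
    # SpotPrice is returned as a string, so convert it before sorting.
    return sorted((float(r["SpotPrice"]), z) for z, r in newest.items())
```

Running it once per candidate region and comparing the cheapest entries gives a quick way to decide which region to try next.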

> You have requested more vCPU capacity than your current vCPU limit of 0 allows

> Not sure now I am on the right path, requesting limit adjustment seems incorrect here. Can you please help with what to do here?

New AWS accounts apparently have very low limits for EC2 services. You can read how to increase them here: Amazon EC2 Service Limits.
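
Before requesting an increase, the current limit can be checked through the Service Quotas API. The sketch below is hedged on several points: the helper names are hypothetical, the two quota codes are assumptions that should be verified in the Service Quotas console for your account, and the check ignores vCPUs already used by running instances.

```python
# Assumed quota codes for P-family instances; verify them in the
# Service Quotas console before relying on them.
ON_DEMAND_P_QUOTA = "L-417A185B"  # assumption: Running On-Demand P instances
SPOT_P_QUOTA = "L-7212CCBC"       # assumption: All P Spot Instance Requests


def vcpu_quota(quotas_client, quota_code):
    """Return the vCPU limit for the given EC2 quota code.

    quotas_client is expected to be a boto3 Service Quotas client, e.g.
    boto3.client("service-quotas", region_name="<your region>").
    """
    response = quotas_client.get_service_quota(
        ServiceCode="ec2", QuotaCode=quota_code
    )
    return response["Quota"]["Value"]


def can_launch(quotas_client, quota_code, required_vcpus):
    """True if the limit alone would allow an instance of that size."""
    return vcpu_quota(quotas_client, quota_code) >= required_vcpus
```

With a limit of 0, as in the error above, `can_launch` returns False for any instance type, so a limit increase is the only way forward for that instance family.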

apls777 commented 4 years ago

Closing the issue, but feel free to reopen it if you have more questions.