spotty-cloud / spotty

Training deep learning models on AWS and GCP instances
https://spotty.cloud
MIT License

Stack was not created: DockerReadyWaitCondition #55

Closed benjaminvdb closed 5 years ago

benjaminvdb commented 5 years ago

Hi!

Spotty sounds like an excellent idea and I'm eager to try it out. I'm following this article step-by-step.

At the first step of model training, I'm running into an error after running spotty start (for the extended log, see bottom):

Stack "spotty-instance-tacotron-i1" was not created.
Please, see CloudFormation logs for the details.

CloudFormation is showing:

CREATE_FAILED | The following resource(s) failed to create: [DockerReadyWaitCondition].
Received FAILURE signal with UniqueId i-0a62be66c83b8b65d
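
For anyone debugging the same failure, the failed resources can be filtered out of the stack events rather than read off the console. This is only a rough sketch: it assumes events shaped like the dicts that boto3's `describe_stack_events` returns (keys `LogicalResourceId`, `ResourceStatus`, `ResourceStatusReason`), the helper name `failed_events` is hypothetical, and the sample data merely mirrors the two messages above (which resource each message belongs to is an assumption).

```python
from typing import Dict, Iterable, List


def failed_events(events: Iterable[Dict]) -> List[Dict]:
    """Return events whose status ends in _FAILED, oldest first.

    Assumes the input is newest-first, which is how
    describe_stack_events orders its results.
    """
    failed = [e for e in events if e.get("ResourceStatus", "").endswith("_FAILED")]
    return list(reversed(failed))


# Illustrative sample only (not real API output); newest event first.
sample_events = [
    {
        "LogicalResourceId": "spotty-instance-tacotron-i1",
        "ResourceStatus": "CREATE_FAILED",
        "ResourceStatusReason": "The following resource(s) failed to create: "
                                "[DockerReadyWaitCondition].",
    },
    {
        "LogicalResourceId": "DockerReadyWaitCondition",
        "ResourceStatus": "CREATE_FAILED",
        "ResourceStatusReason": "Received FAILURE signal with UniqueId "
                                "i-0a62be66c83b8b65d",
    },
    {
        "LogicalResourceId": "DockerReadyWaitCondition",
        "ResourceStatus": "CREATE_IN_PROGRESS",
    },
]

for event in failed_events(sample_events):
    print(event["LogicalResourceId"], "|", event.get("ResourceStatusReason", ""))
```

With live data, the list would presumably come from `boto3.client("cloudformation").describe_stack_events(StackName="spotty-instance-tacotron-i1")["StackEvents"]`. The events only say that the wait condition failed; the actual cause still has to be found in the logs on the instance.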

I can run spotty run ssh, but this shows a tmux session with the message Pane is dead. I can close the window using ctrl-b x and execute commands from there. However, running spotty run <script> or spotty run -r <scripts> gives me the same dead tmux session and the script is not being executed. If possible, I'd like to run Spotty commands from my local machine as advertised. What's going on here and how can I solve it?

Thanks!

$ spotty start
Instance is already running. Are you sure you want to restart it?
Type "y" to confirm: y
Terminating the instance...
Syncing the project with S3 bucket...
Preparing CloudFormation template...
  - volume "tacotron-i1-workspace" (vol-xxxx) will be attached
  - volume "tacotron-i1-docker" (vol-yyyy) will be attached
  - availability zone: eu-west-1c
  - maximum Spot Instance price: on-demand
  - AMI: "Deep Learning Base AMI (Ubuntu) Version 19.1" (ami-06c961cd3240cfd7c)
  - Docker data will be stored on the "docker" volume

Volumes:
+-----------+---------------+------------+-----------------+
| Name      | Container Dir | Type       | Deletion Policy |
+===========+===============+============+=================+
| workspace | /workspace    | EBS volume | Retain Volume   |
+-----------+---------------+------------+-----------------+
| docker    | -             | EBS volume | Retain Volume   |
+-----------+---------------+------------+-----------------+

Waiting for the stack to be deleted...
Waiting for the stack to be created...
  - launching the instance...
  - waiting for the Docker container to be ready...
Error:
------
Stack "spotty-instance-tacotron-i1" was not created.
Please, see CloudFormation logs for the details.

apls777 commented 5 years ago

Hi @benjaminvdb,

For some reason, your container failed to start. Please check the logs on the running instance: see the "The instance is failed to start. Where can I find the logs?" question in the FAQ.

benjaminvdb commented 5 years ago

@apls777 Thank you for the incredibly fast reply!

I feel very stupid. I copied the requirements.txt file to docker without renaming it to requirements-spotty.txt... Everything is working fine now! Thanks for your help.

Would it be possible to show logs from the Docker container on the local machine easily? The Heroku CLI has real-time tailing with heroku logs --tail, which is incredibly useful. This idea could work for Spotty as well using a similar approach (display a log file within a container using the tail command).
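
To make the tailing idea concrete, here is a minimal sketch of `tail -f`-style following in Python. It is not Spotty's implementation and does not imply any existing Spotty command: the function name `follow`, its parameters, and the log path in the usage comment are all hypothetical, and getting the lines from the instance to the local machine (for example over SSH) is left out.

```python
import os
import time
from typing import Iterator, Optional


def follow(path: str,
           poll_interval: float = 0.5,
           max_idle_polls: Optional[int] = None,
           from_start: bool = False) -> Iterator[str]:
    """Yield complete lines as they are appended to `path`.

    Stops after `max_idle_polls` consecutive polls without new data;
    with the default of None it keeps following indefinitely.
    """
    with open(path, "r") as f:
        if not from_start:
            f.seek(0, os.SEEK_END)  # skip existing content, like `tail -f -n 0`
        buffer = ""
        idle = 0
        while max_idle_polls is None or idle < max_idle_polls:
            chunk = f.readline()
            if chunk:
                idle = 0
                buffer += chunk
                if buffer.endswith("\n"):  # emit only fully written lines
                    yield buffer.rstrip("\n")
                    buffer = ""
            else:
                idle += 1
                time.sleep(poll_interval)


# Hypothetical usage (the path is an assumption, not a real Spotty location):
# for line in follow("/path/to/container.log"):
#     print(line)
```

Polling keeps the sketch dependency-free; it does not handle log rotation or truncation, which a real implementation would need to consider.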

apls777 commented 5 years ago

There was already a similar feature request (#52), but yours may be a better way to pull the logs. Thanks!

Glad it's working for you now. I'm closing the issue then.