spotty-cloud / spotty

Training deep learning models on AWS and GCP instances
https://spotty.cloud
MIT License

[feature request] Self destruction command on server #58

Closed: Tarang closed this issue 4 years ago

Tarang commented 4 years ago

If there were a way to run spotty stop on the remote machine, and a way for it to clean up after itself, that would be quite a useful feature.

Additionally, if there were a way to delete the S3 buckets, snapshots, and any roles created, that would help avoid lingering cloud charges too.

apls777 commented 4 years ago

You could just shut down the instance with the shutdown command. Or do you want to create snapshots as well?

Yes, the functionality to clean up all the resources created by Spotty is already in the todo list :). Thanks.

Tarang commented 4 years ago

shutdown could work; the issue is that spotty run executes inside the Docker container, and there isn't a way to pipe a command up to the host machine.

I was trying to find a way to avoid the snapshots, as they use up space. Is there a command-line switch to avoid creating them, or to delete all of them?

apls777 commented 4 years ago

If you want to shut down the instance manually from the terminal, you can do it from the host OS. Either create a new tmux window with the Ctrl+b, then c key combination (it will not be attached to the container), or connect to the host OS using the spotty ssh -H command.

If you want to do it at a particular time or based on some other logic, you can put the shutdown -h +<minutes> command, or any custom script, into the commands parameter (instances[].commands, not container.commands).

To delete all the created EBS volumes once the spotty stop command is called, use the delete deletion policy in the spotty.yaml for all the volumes (read more here: https://spotty.cloud/docs/aws-provider/volumes-and-deletion-policies/). And again, unfortunately, you cannot do it from the instance right now, only from the local machine.
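As a sketch, a volume entry with the delete deletion policy might look like the fragment below. The instance and volume names are placeholders, and the exact field layout should be checked against the docs linked above for your spotty version:

```yaml
instances:
  - name: instance-1
    provider: aws
    parameters:
      # ...other instance parameters...
      volumes:
        - name: workspace
          parameters:
            size: 50
            # skip the snapshot and delete the volume when "spotty stop" is called
            deletion_policy: delete
```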

apls777 commented 4 years ago

It would be interesting to hear about your use case. Why is using the spotty stop command from the local machine not convenient for you?

Tarang commented 4 years ago

I have a script that creates an instance with spotty create and then executes an enclosed command with spotty run. My script manages its own state, so there is no need to save snapshots; I was hoping that when it finished, it could silently clean up after itself.

This way I can have a single command that performs a certain task: my project manages itself, uploads its outputs on its own, and then the instance is terminated.
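The one-shot workflow described above could be sketched as a small driver script. The command names follow this thread (spotty create / run / stop), so check them against your spotty version; the `call` parameter is only there to make the pipeline testable:

```python
import subprocess
from typing import Callable, List, Optional


def run_task(task_cmd: str,
             call: Optional[Callable[[List[str]], int]] = None) -> List[List[str]]:
    """One-shot pipeline: create the instance, run the task, tear it down.

    `call` is injectable for testing; it defaults to subprocess.call.
    Returns the list of steps that were actually attempted.
    """
    call = call or subprocess.call
    steps = [
        ['spotty', 'create'],          # bring the instance up
        ['spotty', 'run', task_cmd],   # run the enclosed command in the container
        ['spotty', 'stop'],            # terminate the instance and clean up
    ]
    executed = []
    for step in steps:
        executed.append(step)
        if call(step) != 0:
            break  # stop the pipeline on the first failure
    return executed
```

Because the driver only shells out, it keeps working even if the task inside the container knows nothing about spotty.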

spotty ssh -H is great, but there isn't a way to send a script up to the host machine through it, e.g. by piping into ssh or running something like spotty ssh -t COMMAND, because of tmux.

On a side note, I tried a couple of related updates (https://github.com/Inculus/spotty/commit/8e6f70ca91a09f6ec014c8b8491de1362aad4b46), but I can't get piping into ssh to work (by removing the tmux layer):

import subprocess
from argparse import ArgumentParser, Namespace

# AbstractConfigCommand, AbstractInstanceManager, AbstractOutputWriter and
# get_ssh_command are spotty-internal classes and helpers imported elsewhere
# in the original file.


class SshCommand(AbstractConfigCommand):

    name = 'ssh'
    description = 'Connect to the running Docker container or to the instance itself'

    def configure(self, parser: ArgumentParser):
        super().configure(parser)
        parser.add_argument('-H', '--host-os', action='store_true',
                            help='Connect to the host OS instead of the Docker container')
        parser.add_argument('-s', '--session-name', type=str, default=None, help='tmux session name')
        parser.add_argument('-b', '--bash', action='store_true', help='Use bash instead of tmux')
        parser.add_argument('-q', '--quit-on-exit', action='store_true',
                            help='Close the tmux window once the command exits')
        parser.add_argument('-l', '--list-sessions', action='store_true',
                            help='List all tmux sessions managed by the instance')

    def _run(self, instance_manager: AbstractInstanceManager, args: Namespace, output: AbstractOutputWriter):
        # a command to connect to the container
        container_cmd = subprocess.list2cmdline(['sudo', '/tmp/spotty/instance/scripts/container_bash.sh'])

        if args.list_sessions:
            remote_cmd = ['tmux', 'ls', ';', 'echo', '']
        elif args.bash:
            # bash mode: run the container shell directly, without a tmux layer
            remote_cmd = [container_cmd]
        else:
            # tmux mode: reuse the given session name or fall back to a default one
            session_name = args.session_name
            if not session_name:
                session_name = 'spotty-ssh-host-os' if args.host_os else 'spotty-ssh-container'

            remote_cmd = ['tmux', 'new', '-s', session_name, '-A']
            if not args.host_os:
                # connect to the container; keep the tmux window open on failure
                # unless --quit-on-exit was specified
                remain = 'off' if args.quit_on_exit else 'on'
                remote_cmd.append('%s || tmux set remain-on-exit %s' % (container_cmd, remain))

        remote_cmd = subprocess.list2cmdline(remote_cmd)

        # connect to the instance
        ssh_command = get_ssh_command(instance_manager.get_ip_address(), instance_manager.ssh_port,
                                      instance_manager.ssh_user, instance_manager.ssh_key_path, remote_cmd)

        print(' '.join(ssh_command))
        subprocess.call(ssh_command)
apls777 commented 4 years ago

I think in your case the easiest thing to do is to use the AWS CLI or AWS API to find the instance and terminate it once your script is finished. Would that work for you?
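A minimal sketch of that idea from inside the instance: read the instance ID from the standard EC2 metadata endpoint, then invoke the AWS CLI. The helper names are hypothetical, and the AWS CLI must be installed and have permission to terminate instances:

```python
import urllib.request
from typing import List

# standard EC2 instance metadata endpoint; only reachable from an EC2 instance
METADATA_URL = 'http://169.254.169.254/latest/meta-data/instance-id'


def current_instance_id(timeout: float = 2.0) -> str:
    # ask the metadata service which instance we are running on
    with urllib.request.urlopen(METADATA_URL, timeout=timeout) as resp:
        return resp.read().decode()


def build_terminate_cmd(instance_id: str, region: str) -> List[str]:
    # compose the AWS CLI call that terminates the given instance
    return ['aws', 'ec2', 'terminate-instances',
            '--instance-ids', instance_id,
            '--region', region]


# at the very end of the training script, something like:
#   subprocess.call(build_terminate_cmd(current_instance_id(), 'us-east-1'))
```

Note that terminating the instance this way bypasses spotty entirely, so volume deletion policies and other spotty-side cleanup will not be applied.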

apls777 commented 4 years ago

Actually, you can just install spotty inside the container and use spotty stop to terminate the instance. But it won't apply the deletion policies for the volumes, as those are applied once the instance is terminated.

Tarang commented 4 years ago

I thought a bit about this issue: spotty run should allow a -H flag, or something similar, to let a script run on the host instead of in the Docker container.

This way not only could the host be shut down, but related issues could also be solved.

apls777 commented 4 years ago

Thanks @Tarang, I actually had the same thought recently. I'll add this feature with the next release.

But it's still not clear to me how it would solve your problem. Your script works inside the container, right? So if you use -H, you can shut down the instance at the end of the script, but that script will be running outside of the container. Are you going to check the name of the container and use the docker run command?

Tarang commented 4 years ago

I haven't found a simple way to do that yet. I did find an approach that funnels commands up to the host system, but it's very tedious to use.

The way I use it: I have all these tasks, and I want to run each one with a single shell script (which creates an instance and runs a command), have the task finish on the remote machine, and gracefully terminate the instance when it's done, even if the laptop isn't connected anymore. A task could take well over a few days, so it needs to be untethered from the machine that created it.

apls777 commented 4 years ago

Why don't you like the idea of using the AWS API to terminate it? You can just install Spotty inside your container and use the spotty stop -c <path_to_config> command. It will find your instance and terminate it.

apls777 commented 4 years ago

@Tarang I'm closing this issue. Feel free to reopen if you have any further questions.