rancher / os

Tiny Linux distro that runs the entire OS as Docker containers
https://rancher.com/docs/os/v1.x/en/
Apache License 2.0

Cloud-init script issues #1106

Closed drpebcak closed 8 years ago

drpebcak commented 8 years ago

RancherOS Version: (ros os version) v0.5.0

Where are you running RancherOS? (docker-machine, AWS, GCE, baremetal, etc.) AWS

I am trying to use userdata to run a script when I start new instances. However, the script doesn't seem to execute, and my instance is unreachable via ssh (going on about 12 hours now).

Even a simple script like this one doesn't work:

#!/bin/bash
sudo docker run -d --restart=always -p 8080:8080 rancher/server

ispyinternet commented 8 years ago

Not sure if this is helpful:

/opt/rancher/bin/start.sh <- runs before docker has started

/etc/rc.local <- runs after docker has started, although if your script requires docker it needs to check that docker has finished loading (e.g. check for docker info).

(and you need to write these scripts from your cloud-config.yml)
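The "check for docker info" idea above could be sketched as a small polling loop. This is only an illustration: the function name wait_for_daemon, the PROBE variable, and the 120-second default are assumptions made here, not anything shipped with RancherOS.

```shell
#!/bin/sh
# Hypothetical helper: poll until the Docker daemon answers, or give up.
# PROBE and the default timeout are illustrative assumptions.
wait_for_daemon() {
    timeout="${1:-120}"             # maximum number of seconds to wait
    probe="${PROBE:-docker info}"   # command that succeeds once the daemon is up
    waited=0
    until $probe >/dev/null 2>&1; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "daemon not ready after ${timeout}s" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Possible use in /etc/rc.local (left commented so the sketch has no side effects):
# wait_for_daemon 120 && docker run -d --restart=always -p 8080:8080 rancher/server
```

Compared with a fixed sleep, a loop like this returns as soon as the daemon responds and reports a failure instead of silently running docker commands too early.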

deniseschannon commented 8 years ago

I am unable to reproduce.

I started an AWS instance using this simple script:

#!/bin/bash
touch /opt/the-script-has-run

I was able to SSH into the instance using the key pair provided in the UI, and I could see that the script ran.

Can you try rebooting your box and see if you can SSH in after that?

Or can you get an "instance screenshot" of it?

drpebcak commented 8 years ago

I think @ispyinternet has the right idea. My script is trying to run docker commands and failing.

drpebcak commented 8 years ago

@deniseschannon the instance screenshot just shows 'Booting Kernel'

I actually just terminated the instance to try using cloud-config to put something in rc.local instead, but I can launch another one if you think it is worthwhile. My guess is that it is because I'm trying to use docker commands before the daemon is loaded.

joshwget commented 8 years ago

The "Booting Kernel" message happens really early in the boot process, so that's probably not related to cloud-config. If you don't mind, I do think trying again would be worthwhile. It seems odd to me that an error in the script execution would affect the system in any way.

whiteadam commented 8 years ago

My 2 cents: I had a similar problem and just used a sleep to fix it, but... it's not ideal.

And as @drpebcak mentioned on IRC, it may not be good to use this with reboots. (same issue with 0.4.5 and 0.5.0).

#cloud-config
write_files:
  - path: /etc/rc.local
    permissions: "0755"
    owner: root
    content: |
      #!/bin/bash
      sleep 300
      sudo bash /etc/rancher-ha.sh
  - path: /etc/rancher-ha.sh
    permissions: "0755"
    owner: root
    content: |
      #!/bin/sh
      set -e
      umask 077
      IMAGE=$1
      if [ "$IMAGE" = "" ]; then
        IMAGE=rancher/server
      fi
      mkdir -p /var/lib/rancher/etc/server
      mkdir -p /var/lib/rancher/etc/ssl
      mkdir -p /var/lib/rancher/bin
      echo Creating /var/lib/rancher/etc/server.conf
      cat > /var/lib/rancher/etc/server.conf << EOF
      export CATTLE_HA_CLUSTER_SIZE=3
      export CATTLE_HA_HOST_REGISTRATION_URL=https://rancher.example.com
      export CATTLE_HA_CONTAINER_PREFIX=rancher-ha-
      export CATTLE_DB_CATTLE_MYSQL_HOST=asff2234fsdfc.cjfwiboc8zbz.us-east-1.rds.amazonaws.com
      export CATTLE_DB_CATTLE_MYSQL_PORT=3306
      export CATTLE_DB_CATTLE_MYSQL_NAME=rancher
      export CATTLE_DB_CATTLE_USERNAME=rancher
      export CATTLE_DB_CATTLE_PASSWORD=asdfsadf42534tg:fe73480876e3a18993f0f5c099f99d4c
      export CATTLE_HA_PORT_REDIS=6379
      export CATTLE_HA_PORT_SWARM=2376
      export CATTLE_HA_PORT_HTTP=80
      export CATTLE_HA_PORT_HTTPS=443
      export CATTLE_HA_PORT_PP_HTTP=81
      export CATTLE_HA_PORT_PP_HTTPS=444
      export CATTLE_HA_PORT_ZK_CLIENT=2181
      export CATTLE_HA_PORT_ZK_QUORUM=2888
      export CATTLE_HA_PORT_ZK_LEADER=3888
      # Uncomment below to force HA enabled and not require one to set it in the UI
      export CATTLE_HA_ENABLED=true
      EOF
      echo Creating /var/lib/rancher/etc/server/encryption.key
      if [ -e /var/lib/rancher/etc/server/encryption.key ]; then
        mv /var/lib/rancher/etc/server/encryption.key /var/lib/rancher/etc/server/encryption.key.`date '+%s'`
      fi
      cat > /var/lib/rancher/etc/server/encryption.key << EOF
      eR9azJYlzNipjhfdiausiofjjs=
      EOF
      echo Creating /var/lib/rancher/bin/rancher-ha-start.sh
      cat > /var/lib/rancher/bin/rancher-ha-start.sh << "EOF"
      #!/bin/sh
      set -e
      IMAGE=rancher/server:v1.1.0
      if [ "$IMAGE" = "" ]; then
        echo Usage: $0 DOCKER_IMAGE
        exit 1
      fi
      docker rm -fv rancher-ha >/dev/null 2>&1 || true
      ID=`docker run --restart=always -d -v /var/run/docker.sock:/var/run/docker.sock --name rancher-ha --net host --privileged -v /var/lib/rancher/etc:/var/lib/rancher/etc $IMAGE ha`
      echo Started container rancher-ha $ID
      echo Run the below to see the logs
      echo
      echo docker logs -f rancher-ha
      EOF
      chmod +x /var/lib/rancher/bin/rancher-ha-start.sh
      echo Running: /var/lib/rancher/bin/rancher-ha-start.sh $IMAGE
      echo To re-run please execute: /var/lib/rancher/bin/rancher-ha-start.sh $IMAGE
      exec /var/lib/rancher/bin/rancher-ha-start.sh $IMAGE

I also have issues with different scripts, like automatically creating RancherOS hosts with SpotInst.
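On the reboot concern mentioned above: assuming the worry is that /etc/rc.local runs again on later boots and repeats the setup, one possible guard is a marker file. The function name run_once and the marker path in the usage comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical guard: run a command only if a marker file does not exist yet,
# and create the marker only when the command succeeds.
run_once() {
    marker="$1"
    shift
    if [ -e "$marker" ]; then
        return 0    # already done on an earlier boot
    fi
    "$@" && touch "$marker"
}

# Possible use in /etc/rc.local (marker path is an assumption):
# run_once /var/lib/rancher/.ha-setup-done sh /etc/rancher-ha.sh
```

A failed run leaves no marker behind, so the command is retried on the next boot.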

drpebcak commented 8 years ago

@deniseschannon, @joshwget I'm spinning up a new host, RancherOS v0.5.0, and passing it this as userdata:

#!/bin/bash
sudo docker run -d --restart=always -p 8080:8080 rancher/server

I've verified that an instance with the same settings (but without userdata) is accessible.

I will wait a few minutes to see if I can SSH into the new instance.

drpebcak commented 8 years ago

Hmm.. and now I can't seem to replicate it. It definitely doesn't start rancher-server, but it isn't freezing up the instance anymore.

I'm going to close this because it seems like it's invalid.

sriman commented 6 years ago

Copy & paste in the user data of aws instance

#cloud-config

write_files:

stephen-dahl commented 4 years ago

Official way to deal with this: https://rancher.com/docs/os/v1.2/en/configuration/running-commands/#running-docker-commands

#cloud-config
write_files:
  - path: /etc/rc.local
    permissions: "0755"
    owner: root
    content: |
      #!/bin/bash
      wait-for-docker
      docker run -d nginx

stephen-dahl commented 4 years ago

It looks like there is also launching services through cloud-config: https://rancher.com/docs/os/v1.x/en/installation/system-services/custom-system-services/#launching-services-through-cloud-config

#cloud-config
rancher:
  services:
    nginxapp:
      image: nginx
      restart: always