TritonDataCenter / sdc-docker

Docker Engine for Triton
Mozilla Public License 2.0

Docker not coming up. registrar cannot mount data volume #63

Closed matthiasg closed 8 years ago

matthiasg commented 8 years ago

I am trying to get Triton CoaL working, but it's quite the hassle :)

Right now I am stuck at getting a connection to Docker.

At first I was just following https://www.joyent.com/blog/test-drive-joyents-elastic-container-infrastructure-for-docker but after the step ./docker-client-env root@10.88.88.200 I get this:

CoaL >: `./docker-client-env root@10.88.88.200`
Password:
# Setting DOCKER_HOST=tcp://10.88.88.7:2375
# Unsetting DOCKER_TLS_VERIFY
CoaL >: docker info
Cannot connect to the Docker daemon. Is the docker daemon running on this host?

export shows:

...
declare -x DOCKER_HOST="tcp://10.88.88.7:2375"
declare -x DOCKER_TLS_VERIFY=""

Then I followed the README in https://github.com/joyent/sdc-docker, which I had used successfully on my other machines to connect to the Joyent public cloud.

So I ran:

CoaL >: ./sdc-docker-setup.sh coal mgoetzke ~/.ssh/id_rsa
Password:
Setting up Docker client for SDC using:
    CloudAPI:        https://10.88.88.5
    Account:         mgoetzke
    Key:             /Users/matthias/.ssh/id_rsa

If you have a pass phrase on your key, the openssl command will
prompt you for your pass phrase now and again later.

Verifying CloudAPI access.
CloudAPI access verified.

Generating client certificate from SSH private key.
Wrote certificate files to /Users/matthias/.sdc/docker/mgoetzke

Get Docker host endpoint from cloudapi.
Docker service endpoint is: tcp://10.88.88.7:2376
CoaL >: ping 10.88.88.7
PING 10.88.88.7 (10.88.88.7): 56 data bytes
64 bytes from 10.88.88.7: icmp_seq=0 ttl=255 time=11.758 ms
^C
--- 10.88.88.7 ping statistics ---
1 packets transmitted, 1 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 11.758/11.758/11.758/0.000 ms
CoaL >:
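
A note on the ping above: it only shows that 10.88.88.7 answers ICMP, not that anything is listening on TCP port 2376. A minimal sketch of a TCP check follows, assuming bash (it relies on bash's `/dev/tcp` redirection). The helper name `check_tcp` is hypothetical and not part of sdc-docker.

```shell
#!/usr/bin/env bash
# Sketch: check whether a TCP port accepts connections, using bash's
# built-in /dev/tcp redirection (bash-only; plain sh does not support it).
# check_tcp is a hypothetical helper name, not part of sdc-docker.
check_tcp() {
    local host="$1" port="$2"
    # The subshell opens fd 3 to host:port and closes it again on exit.
    # A refused or failed connection makes the redirection fail.
    if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
        echo "open: ${host}:${port}"
    else
        echo "closed or unreachable: ${host}:${port}"
        return 1
    fi
}
```

For this setup the call would be `check_tcp 10.88.88.7 2376`. There is no explicit timeout in the sketch, so a host that silently drops packets can make the call wait for the operating system's connect timeout.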

But the env file is not created (Edit: it's not created because an error is swallowed into /dev/null again in the script).
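
For anyone hitting the same silent failure: running the script with tracing (`bash -x ./sdc-docker-setup.sh ...`, assuming it is a bash script) should at least show which command fails, even though its error text still goes to /dev/null. Inside a script, one alternative to `2>/dev/null` is to keep stderr in a temporary file and print it only on failure. The sketch below is a generic pattern, not what sdc-docker-setup.sh does, and `run_visible` is a hypothetical name.

```shell
# Sketch: run a command, hide its stderr on success, show it on failure.
# run_visible is a hypothetical wrapper, not part of sdc-docker-setup.sh.
run_visible() {
    local errfile status
    errfile=$(mktemp)
    # Capture stderr in a temp file instead of discarding it.
    if "$@" 2>"$errfile"; then
        status=0
    else
        status=$?
    fi
    if [ "$status" -ne 0 ]; then
        # On failure, report the exit status and replay the captured stderr.
        echo "command failed (exit ${status}): $*" >&2
        cat "$errfile" >&2
    fi
    rm -f "$errfile"
    return "$status"
}
```

A successful command stays quiet on stderr, while a failing one reports its exit status and its original error text.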

One more thing, the commands

sdcadm experimental portolan
sdcadm experimental fabrics --coal

both return "unknown command".

Can anybody tell me what steps to actually follow today? Or what I can diagnose to get this up?

matthiasg commented 8 years ago

Just had the idea of running sdc-healthcheck again. Previously it always showed everything with status online; now it says cloudapi error, vmapi error, docker svc-err. These mostly vanished upon rebooting, though.

That undoubtedly plays a part, but how to proceed?

matthiasg commented 8 years ago

OK, Docker has issues because svc:/manta/application/registrar:default is not running:

[root@headnode (coal-1) ~]# svcs -x -z $(sdc-vmname docker)
svc:/smartdc/mdata:execute (Joyent SDC metadata handler)
  Zone: 628f47a8-1e2a-473a-8262-15d50847e1ef
 Alias: docker0
 State: maintenance since January 22, 2016 06:54:54 AM UTC
Reason: Start method exited with $SMF_EXIT_ERR_FATAL.
   See: http://illumos.org/msg/SMF-8000-KS
   See: /zones/628f47a8-1e2a-473a-8262-15d50847e1ef/root/var/svc/log/smartdc-mdata:execute.log
Impact: 1 dependent service is not running.  (Use -v for list.)
[root@headnode (coal-1) ~]# svcs -x -z $(sdc-vmname docker) -v
svc:/smartdc/mdata:execute (Joyent SDC metadata handler)
  Zone: 628f47a8-1e2a-473a-8262-15d50847e1ef
 Alias: docker0
 State: maintenance since January 22, 2016 06:54:54 AM UTC
Reason: Start method exited with $SMF_EXIT_ERR_FATAL.
   See: http://illumos.org/msg/SMF-8000-KS
   See: /zones/628f47a8-1e2a-473a-8262-15d50847e1ef/root/var/svc/log/smartdc-mdata:execute.log
Impact: 1 dependent service is not running:
        svc:/manta/application/registrar:default

matthiasg commented 8 years ago

In the docker zone it says:

[2016-01-22T08:40:31Z] /opt/smartdc/boot/lib/util.sh:225: _sdc_enable_cron(): svccfg import /lib/svc/manifest/system/cron.xml
[2016-01-22T08:40:31Z] /opt/smartdc/boot/lib/util.sh:226: _sdc_enable_cron(): svcadm enable cron
[2016-01-22T08:40:31Z] /opt/smartdc/boot/setup.sh:41: zonename
[2016-01-22T08:40:31Z] /opt/smartdc/boot/setup.sh:41: zfs set mountpoint=/data zones/628f47a8-1e2a-473a-8262-15d50847e1ef/data
cannot open 'zones/628f47a8-1e2a-473a-8262-15d50847e1ef/data': dataset does not exist
+ '[' 1 -gt 0 ']'
+ user_script_exit=95
+ exit 95
[ Jan 22 08:40:31 Method "start" exited with status 95. ]

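zones/628f47a8-1e2a-473a-8262-15d50847e1ef is the docker zone itself, and the global zone does show a zones/628f47a8-1e2a-473a-8262-15d50847e1ef/data folder (empty). A folder at that path is not the same thing as a ZFS dataset with that name, though, which is why zfs set still fails.

So now what?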

matthiasg commented 8 years ago

OK. I had to run sdcadm experimental update-docker --servers cns,headnode again (it had already been run more than once). This time it added the missing dataset for the mountpoint script. After rebooting the docker zone, all services come up, the sdc-docker-setup.sh script runs through correctly, the env.sh script exists, and docker info finally works! Yeah :)
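
The sequence that worked here can be sketched as a small script for the headnode global zone. This is a sketch under assumptions, not an official procedure. The `sdcadm` and `svcs` invocations are the ones from this thread. Using `zfs list` to confirm the dataset and `vmadm reboot` to reboot the zone are assumptions about how to do those steps. The helper names `run` and `recover_docker_zone` are hypothetical. `DRY_RUN` defaults to 1, so the commands are only printed.

```shell
# Sketch of the recovery steps from this thread (headnode global zone).
# DRY_RUN=1 (the default) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

# run: hypothetical helper that prints or executes a command.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# recover_docker_zone: hypothetical helper; $1 is the docker zone UUID,
# e.g. the output of: sdc-vmname docker
recover_docker_zone() {
    local zone="$1"
    # 1. Re-run the update that created the missing dataset in this thread.
    run sdcadm experimental update-docker --servers cns,headnode
    # 2. Confirm the data dataset now exists (assumption: zfs list suffices).
    run zfs list "zones/${zone}/data"
    # 3. Reboot the zone so its setup script runs again (assumption: vmadm).
    run vmadm reboot "$zone"
    # 4. Show any services that are still not running.
    run svcs -x -z "$zone"
}
```

Calling `recover_docker_zone "$(sdc-vmname docker)"` with the default `DRY_RUN=1` prints the four commands for review before anything is changed.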

I am going to repeat this setup procedure a few times on different machines, because it seemed to me that there were a number of little issues that were only solved by retrying, re-running, or rebooting the machine. Hopefully the overall stability and reproducibility is not actually that bad.