osism / cloud-in-a-box

Cloud in a box
https://osism.github.io/docs/guides/deploy-guide/examples/cloud-in-a-box
Apache License 2.0

Installation fails because osismclient container is not ready #159

Status: Closed (scoopex closed this issue 10 months ago)

scoopex commented 10 months ago

The current state of the CiaB fails with the following messages in the installation log:

$ tail -30 /var/log/install-cloud-in-a-box.log
bootstrap | Already on 'main'
bootstrap | Cloning into '/root/.ansible/tmp/ansible-local-746563khsaot3/tmp7gno6n7l/ansible-playbooks-manageromrv3lu4'...
bootstrap | Already on 'main'
bootstrap | + export INSTALL_ANSIBLE_ROLES=false
bootstrap | + INSTALL_ANSIBLE_ROLES=false
bootstrap | + ./run.sh network
bootstrap | + netplan apply
bootstrap | 
bootstrap | ** (generate:75443): WARNING **: 11:26:30.194: Permissions for /etc/netplan/01-osism.yaml are too open. Netplan configuration should NOT be accessible by others.
bootstrap | WARNING:root:Cannot call Open vSwitch: ovsdb-server.service is not running.
bootstrap | 
bootstrap | ** (process:75441): WARNING **: 11:26:30.501: Permissions for /etc/netplan/01-osism.yaml are too open. Netplan configuration should NOT be accessible by others.
bootstrap | 
bootstrap | ** (process:75441): WARNING **: 11:26:30.688: Permissions for /etc/netplan/01-osism.yaml are too open. Netplan configuration should NOT be accessible by others.
bootstrap | 
bootstrap | ** (process:75441): WARNING **: 11:26:30.688: Permissions for /etc/netplan/01-osism.yaml are too open. Netplan configuration should NOT be accessible by others.
bootstrap | + ./run.sh bootstrap
bootstrap | + chmod o+rw /var/run/docker.sock
bootstrap | + ./run.sh configuration
bootstrap | + find /opt/configuration -type f -exec sed -i s/eno1/eno1np0/g '{}' +
bootstrap | + [[ sandbox == \e\d\g\e ]]
bootstrap | + ./run.sh traefik
bootstrap | + [[ sandbox == \s\a\n\d\b\o\x ]]
bootstrap | + ./run.sh netbox
bootstrap | + ./run.sh manager
bootstrap | + popd
bootstrap | + osism apply facts
bootstrap | Error response from daemon: No such container: osismclient
bootstrap | 
bootstrap | TASK [osism.services.manager : Wait for an healthy manager service] ************
bootstrap | ok: [manager.systems.in-a-box.cloud]
bootstrap | 
bootstrap | TASK [osism.services.manager : Copy osismclient bash completion script] ********
bootstrap | fatal: [manager.systems.in-a-box.cloud]: FAILED! => {"changed": false, "cmd": "INTERACTIVE=false osism complete >> /etc/bash_completion.d/osism", "delta": "0:00:00.030738", "end": "2023-11-06 11:28:58.888932", "failed_when_result": true, "msg": "non-zero return code", "rc": 1, "start": "2023-11-06 11:28:58.858194", "stderr": "Error response from daemon: No such container: osismclient", "stderr_lines": ["Error response from daemon: No such container: osismclient"], "stdout": "", "stdout_lines": []}
bootstrap | 
bootstrap | PLAY RECAP *********************************************************************
bootstrap | manager.systems.in-a-box.cloud : ok=28   changed=14   unreachable=0    failed=1    skipped=3    rescued=0    ignored=0   
bootstrap | 
bootstrap | /
deploy | + set -e
deploy | + export INTERACTIVE=false
deploy | + INTERACTIVE=false
deploy | + [[ -e /etc/cloud-in-a-box.env ]]
deploy | + CLOUD_IN_A_BOX_TYPE=sandbox
deploy | + echo CLOUD_IN_A_BOX_TYPE=sandbox
deploy | + sudo tee /etc/cloud-in-a-box.env
deploy | + [[ sandbox == \e\d\g\e ]]
deploy | + osism apply facts
deploy | Error response from daemon: No such container: osismclient
deploy | CLOUD_IN_A_BOX_TYPE=sandbox

A manual retry seems to help; it might therefore be a good idea to wait for the "osismclient" container in "bootstrap.sh" before running "osism apply facts".
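The suggested wait could be sketched as a small retry loop in bootstrap.sh. This is only an illustration, not the actual fix from the repository; the `wait_for` helper name is made up, and the 60-attempt budget mirrors the "x/60" counter visible in the later log output:

```shell
#!/usr/bin/env bash
# Hypothetical retry helper: run a check command repeatedly until it
# succeeds or the attempt budget is exhausted.
wait_for() {
    local attempts="$1"; shift
    local i
    for ((i = 1; i <= attempts; i++)); do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        echo "${i}/${attempts} - waiting for: $*" >&2
        sleep 1
    done
    return 1
}

# Assumed usage in bootstrap.sh, before "osism apply facts":
#   wait_for 60 docker top osismclient
# (docker top exits non-zero while the container is absent or not running)
```

The check command is interchangeable; anything that exits non-zero until osismclient is up (e.g. inspecting `{{.State.Running}}`) would work in its place.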

scoopex commented 10 months ago

Started to optimize, debug, and probably fix that at: https://github.com/osism/cloud-in-a-box/tree/Installation_fails_because_osismclient_container_is_notready_159

fdobrovolny commented 10 months ago

I'm afraid the fix did not work:

bootstrap | 
bootstrap | TASK [osism.services.manager : Start/enable manager service] *******************
bootstrap | changed: [manager.systems.in-a-box.cloud]
bootstrap | 
bootstrap | TASK [osism.services.manager : Include initialize tasks] ***********************
bootstrap | included: /root/.ansible/collections/ansible_collections/osism/services/roles/manager/tasks/initialize.yml for manager.systems.in-a-box.cloud
bootstrap | 
bootstrap | TASK [osism.services.manager : Flush handlers] *********************************
bootstrap | 
bootstrap | RUNNING HANDLER [osism.services.manager : Restart manager service] *************
bootstrap | changed: [manager.systems.in-a-box.cloud]
bootstrap | 
bootstrap | RUNNING HANDLER [osism.services.manager : Register that manager service was restarted] ***
bootstrap | ok: [manager.systems.in-a-box.cloud]
bootstrap | 
bootstrap | RUNNING HANDLER [osism.services.manager : Wait for an healthy manager service] ***
bootstrap | changed: [manager.systems.in-a-box.cloud]
bootstrap | 
bootstrap | RUNNING HANDLER [osism.services.manager : Copy osismclient bash completion script] ***
bootstrap | fatal: [manager.systems.in-a-box.cloud]: FAILED! => {"changed": true, "cmd": "INTERACTIVE=false osism complete > /etc/bash_completion.d/osism", "delta": "0:00:00.014676", "end": "2023-11-12 14:38:18.355874", "failed_when_result": true, "msg": "non-zero return code", "rc": 1, "start": "2023-11-12 14:38:18.341198", "stderr": "Error response from daemon: No such container: osismclient", "stderr_lines": ["Error response from daemon: No such container: osismclient"], "stdout": "", "stdout_lines": []}
bootstrap | 
bootstrap | PLAY RECAP *********************************************************************
bootstrap | manager.systems.in-a-box.cloud : ok=32   changed=17   unreachable=0    failed=1    skipped=3    rescued=0    ignored=0   
bootstrap | 
bootstrap | /opt/cloud-in-a-box
bootstrap | Checking if container 'osismclient' is running
bootstrap | 2/60 - Waiting for container 'osismclient' to be running
bootstrap | 3/60 - Waiting for container 'osismclient' to be running
bootstrap | 4/60 - Waiting for container 'osismclient' to be running
bootstrap | BOOTSTRAP FAILED

The script just died afterwards.
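When the wait loop gives up like this, it would help to report why the container is missing before printing BOOTSTRAP FAILED, to distinguish a container that was never created from one stuck in a crash loop. A minimal sketch; the `classify_status` helper is hypothetical, while the `docker ps`/`docker logs` invocations in the comment are standard Docker CLI:

```shell
#!/usr/bin/env bash
# Hypothetical post-mortem helper: classify a status string as produced by
#   docker ps -a --filter name=osismclient --format '{{.Status}}'
classify_status() {
    local status="$1"
    case "$status" in
        "")          echo "never-created" ;;
        Up*)         echo "running" ;;
        Restarting*) echo "restart-loop" ;;
        Exited*)     echo "exited" ;;
        *)           echo "unknown" ;;
    esac
}

# Assumed usage before aborting the bootstrap:
#   status=$(docker ps -a --filter name=osismclient --format '{{.Status}}')
#   echo "osismclient state: $(classify_status "$status")"
#   docker logs --tail 50 osismclient 2>&1 || true
```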

berendt commented 10 months ago

It depends on the environment. I am currently running a deployment here and it is working.

berendt commented 10 months ago

It probably depends on the Docker version. I just prepared two more PRs to try to make the deployment more stable.

scoopex commented 10 months ago

Fixed