tinkerbell / tink

Workflow Engine for provisioning Bare Metal
https://tinkerbell.org
Apache License 2.0
930 stars, 134 forks

The provisioner does not work after reboot #161

Closed: rgl closed this issue 4 years ago

rgl commented 4 years ago

After I reboot the provisioner machine, the provisioner no longer works.

Before the reboot, these containers are running (as displayed by docker ps):

deploy_boots_1
deploy_cacher_1
deploy_db_1
deploy_elasticsearch_1
deploy_fluentbit_1
deploy_hegel_1
deploy_kibana_1
deploy_nginx_1
deploy_registry_1
deploy_tink-cli_1
deploy_tink-server_1

After the reboot, only these are running:

deploy_elasticsearch_1
deploy_fluentbit_1
deploy_kibana_1
deploy_tink-cli_1

For reference, here is the current state of all the containers:

# docker ps -a --format 'table {{.Status}}\t{{.Names}}' | sort
STATUS                    NAMES
Exited (0) 2 days ago     deploy_certs_1
Exited (128) 2 days ago   deploy_db_1
Exited (128) 2 days ago   deploy_nginx_1
Exited (128) 2 days ago   deploy_tink-server_1
Exited (2) 2 days ago     deploy_boots_1
Exited (2) 2 days ago     deploy_cacher_1
Exited (2) 2 days ago     deploy_hegel_1
Exited (2) 2 days ago     deploy_registry_1
Up About an hour          deploy_elasticsearch_1
Up About an hour          deploy_fluentbit_1
Up About an hour          deploy_kibana_1
Up About an hour          deploy_tink-cli_1

To get things working again, I have to manually start the stopped containers, in creation order, with:

docker ps -a --format 'table {{.CreatedAt}}\t{{.Names}}\t{{.Status}}' | grep Exited | sort | awk '{print $5}' | xargs -I% docker start %
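
The one-liner above depends on the container name being the fifth whitespace-separated field, which in turn depends on how CreatedAt is printed. A minimal sketch of a variant that avoids counting fields is below. It assumes the `status=exited` filter of `docker ps` selects the same containers as `grep Exited`, and `start_exited` is a hypothetical helper name, not something shipped with tink:

```shell
#!/bin/sh
# start_exited: hypothetical helper (not part of tink) that starts every
# exited container, oldest first, like the one-liner above.
start_exited() {
    # Print "CreatedAt|Name" for exited containers only. The "|" separator
    # is an arbitrary choice that keeps the name in a fixed field.
    docker ps -a --filter status=exited --format '{{.CreatedAt}}|{{.Names}}' |
    sort |              # CreatedAt starts with YYYY-MM-DD HH:MM:SS, so this orders by creation time
    cut -d'|' -f2 |     # keep only the container name
    while IFS= read -r name; do
        docker start "$name"
    done
}
```

Calling `start_exited` once after the reboot should have the same effect as the one-liner. Like the one-liner, it also starts containers that exited cleanly, such as deploy_certs_1.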

gianarb commented 4 years ago

This was fixed in the new version of setup.sh, and I can no longer reproduce it. Closing for now.