goharbor / harbor

An open source trusted cloud native registry project that stores, signs, and scans content.
https://goharbor.io
Apache License 2.0

OpsMgr 'Apply Changes' completed but the harbor job on 'Status' tab shows failure #3718

Closed jessehu closed 6 years ago

jessehu commented 6 years ago

When deploying the Harbor Tile for the first time, OpsMgr 'Apply Changes' completed but the harbor job on the 'Status' tab showed failure. After some time, the harbor job on the 'Status' tab showed success.

jessehu commented 6 years ago

The expected behaviour is that the harbor job is ready to serve requests when 'Apply Changes' completes. The harbor job's start step needs to wait for the Harbor service to be ready before it exits. cc @steven-zou @reasonerjt

jessehu commented 6 years ago

I just found that "wait for harbor service to be ready" doesn't solve the problem. startHarbor() uses /sbin/start-stop-daemon to launch 'docker-compose up' and write its pid to harbor.pid. Ops Mgr then detects that the process in harbor.pid is alive and considers all jobs (docker and harbor in our case) finished, rather than waiting for 'ctl start' to exit, and ends the deployment with the success message 'Change Applied'. However, if 'docker-compose up' cannot start the Harbor service successfully for some reason (I hit this case because MySQL could not start, but I didn't find the cause), the Harbor service will not be able to accept requests, even though Ops Mgr has already reported that the harbor deployment succeeded.

To resolve this problem, we might let start-stop-daemon save the docker-compose pid to a temp file, and write the pid to the target harbor.pid file only after the Harbor service has started successfully. @reasonerjt @steven-zou
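
The idea could be sketched roughly as follows. This is only an illustration in Python, not the actual ctl script: the function name promote_pid_when_ready, the is_ready callback, and the timeout and interval values are all hypothetical. It assumes start-stop-daemon has already written the docker-compose pid to a temp file on the same filesystem as harbor.pid.

```python
import os
import time
from typing import Callable


def promote_pid_when_ready(
    tmp_pid_path: str,
    pid_path: str,
    is_ready: Callable[[], bool],
    timeout_s: float = 300.0,   # hypothetical limit, not taken from the patch
    interval_s: float = 5.0,    # hypothetical polling interval
) -> bool:
    """Move the temp pid file to the real pid file once the service is ready.

    Returns True if the pid file was promoted, False if the service did not
    become ready before the timeout. On False the real pid file is never
    created, so a pid-file based monitor does not see the job as running.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if is_ready():
            # os.replace is atomic when both paths are on the same filesystem,
            # so a monitor never observes a half-written pid file.
            os.replace(tmp_pid_path, pid_path)
            return True
        time.sleep(interval_s)
    return False
```

With something like this, the start step would exit non-zero when the function returns False, so a failed 'docker-compose up' is not reported as a successful deployment.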

jessehu commented 6 years ago

This issue is resolved by these patches. We're running CI to verify this fix.

Use status_check to check Harbor status in harbor job start
https://git.eng.vmware.com/harbor/habo/commit/d7d818ee93daf7cd8d2e7da2fb678247bcab8c3b

Let monit detect the real harbor pid until Harbor Service is running
https://git.eng.vmware.com/harbor/habo/commit/6faba57f524ed75efc27a39def2f42b8c33b218c
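
The status_check code itself is not shown in this thread. For readers unfamiliar with the pattern, a check of this kind might look roughly like the Python sketch below. The function name harbor_is_up and the default timeout are hypothetical, and the URL to probe is left to the caller because the endpoint used by the real patch is not stated here.

```python
import urllib.error
import urllib.request


def harbor_is_up(url: str, timeout_s: float = 5.0) -> bool:
    """Return True only if `url` answers with an HTTP 2xx status.

    A live process is not enough: the service must actually respond.
    """
    # Ignore any proxy settings from the environment; the check is meant
    # to reach the service directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout_s) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status: not ready.
        return False
```

The point of probing over HTTP is that it distinguishes "the docker-compose process exists" from "Harbor can accept requests", which is the gap described earlier in this issue.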

jessehu commented 6 years ago

Verified in CI.