ITISFoundation / osparc-simcore

🐼 osparc-simcore simulation framework
https://osparc.io
MIT License

🐛 Dy-Sidecar incident: Dy-Sidecar orphaned after long start-up time #3193

Closed. mrnicegyu11 closed this issue 2 years ago

mrnicegyu11 commented 2 years ago

f82ad5e3-f680-4e84-81b1: Orphaned state: sidecar running, idling, no attached dy-service-container.

What happened? Director-v2 fails service observation of this sidecar, with:

Approx. 23 minutes after this log-message, the dynamic sidecar starts, but stays idling forever. No caddy is found.

Actionable follow-ups:

@GitHK @pcrespov @sanderegg

dy-sidecar_f82ad5e3.txt oprhaned_dy_sidecar_aws_prod_1.csv

pcrespov commented 2 years ago

@GitHK please review these and organize a fix as agreed

GitHK commented 2 years ago

The behaviour is correct here.

Director-v2 could not determine the node on which the dy-sidecar was started, because no container was created within 60 seconds (DYNAMIC_SIDECAR_TIMEOUT_FETCH_DYNAMIC_SIDECAR_NODE_ID). It is OK for this to fail.

The following PR, https://github.com/ITISFoundation/osparc-simcore/pull/3272, has addressed the above issue. Services now get cleaned up after an error and no longer linger.
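
For readers unfamiliar with the flow, here is a minimal sketch of the behaviour described above: poll for the node ID until a timeout expires, and remove the service if the timeout is hit so that nothing is left orphaned. This is not the director-v2 code. `fetch_node_id`, `observe_sidecar`, `NodeIdTimeoutError` and the two callables `get_node_id` and `remove_service` are hypothetical names chosen for illustration; only the 60-second default mirrors the value quoted for DYNAMIC_SIDECAR_TIMEOUT_FETCH_DYNAMIC_SIDECAR_NODE_ID.

```python
import asyncio
import time
from typing import Awaitable, Callable, Optional


class NodeIdTimeoutError(Exception):
    """Raised when no node ID could be determined within the timeout."""


async def fetch_node_id(
    get_node_id: Callable[[], Awaitable[Optional[str]]],
    timeout_s: float,
    poll_interval_s: float = 1.0,
) -> str:
    # Poll the (hypothetical) lookup until it yields a node ID or time runs out.
    deadline = time.monotonic() + timeout_s
    while True:
        node_id = await get_node_id()
        if node_id is not None:
            return node_id
        if time.monotonic() >= deadline:
            raise NodeIdTimeoutError(f"no node ID after {timeout_s} seconds")
        await asyncio.sleep(poll_interval_s)


async def observe_sidecar(
    get_node_id: Callable[[], Awaitable[Optional[str]]],
    remove_service: Callable[[], Awaitable[None]],
    timeout_s: float = 60.0,  # assumed to mirror the 60 s setting quoted above
    poll_interval_s: float = 1.0,
) -> str:
    try:
        return await fetch_node_id(get_node_id, timeout_s, poll_interval_s)
    except NodeIdTimeoutError:
        # Failing is acceptable; the point is to clean up so the
        # sidecar cannot start later and idle without an owner.
        await remove_service()
        raise
```

The error is re-raised after cleanup so the caller still sees the failure; only the lingering service is removed.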