kube-HPC / hkube

🐟 High Performance Computing over Kubernetes - Core Repo 🎣
http://hkube.io
MIT License

algorithm <ALGO-NAME> has disconnected while in ready state, reason: 1000. #1903

Closed (ism55 closed 3 months ago)

ism55 commented 6 months ago

HKube micro-service

HKube 2.6.20, Worker pod

Describe the bug

During a 10-minute run of the sanity test, I received multiple reports of the following:

Error: algorithm has disconnected while in ready state, reason: 1000.

Expected behavior

To Reproduce

Steps to reproduce the behavior:

  1. Run the "simple" automated sanity test.

Screenshots

golanha commented 3 months ago

This warning appears if the algorithm container suddenly disconnects without sending a failure indication to the worker.
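
To illustrate the condition described above, here is a minimal sketch in Python. It is not HKube's actual worker code: the function `disconnect_warning`, its parameters, and the `CLOSE_CODES` table are hypothetical names introduced for illustration. It also assumes that the `reason` value in the message is a WebSocket close code, in which case 1000 means "Normal Closure" under RFC 6455; the thread does not confirm this.

```python
from typing import Optional

# A few standard WebSocket close codes from RFC 6455, section 7.4.1.
# Assumption: the "reason" in the HKube message is one of these codes.
CLOSE_CODES = {
    1000: "Normal Closure",
    1001: "Going Away",
    1006: "Abnormal Closure",
}


def disconnect_warning(algorithm_name: str, state: str, reason: int,
                       failure_reported: bool) -> Optional[str]:
    """Build the warning text for an unannounced disconnect.

    Hypothetical helper: returns None when the algorithm reported a
    failure before disconnecting, otherwise returns a message in the
    same shape as the one quoted in this issue.
    """
    if failure_reported:
        # The worker already knows why the algorithm went away.
        return None
    return (f"algorithm {algorithm_name} has disconnected while in "
            f"{state} state, reason: {reason}.")


def describe_reason(reason: int) -> str:
    """Translate a close code to its RFC 6455 name, if it is a known one."""
    return CLOSE_CODES.get(reason, "unknown code")


if __name__ == "__main__":
    print(disconnect_warning("my-algo", "ready", 1000, failure_reported=False))
    print(describe_reason(1000))
```

Under that assumption, a reason of 1000 would suggest the algorithm side closed the connection cleanly rather than crashing, so the algorithm container's own logs and exit status are a reasonable place to look next.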