kube-HPC / hkube

🐟 High Performance Computing over Kubernetes - Core Repo 🎣
http://hkube.io
MIT License

Hkube runs more than a single instance of a service (sporadic issue) #1651

Closed: ism55 closed 1 year ago

ism55 commented 1 year ago

HKube micro-service: Algo service

Describe the bug

  1. Upon running a new job (with no other jobs running in the background and the system idle), hkube sporadically runs 2-3 instances of an algo service.
  2. Another scenario: make sure that a single instance of a specific service is running at job start, then cause the service to crash. hkube re-runs the algo, but it sporadically starts 2 (or even more) instances of this algo service although the system is idle.

Expected behavior: A single instance of this service runs upon job start / service restart.

To Reproduce: Steps to reproduce the behavior: see the 2 scenarios above.

golanha commented 1 year ago

It's a matter of hkube configuration. We need to set an environment variable on hkube's task-manager deployment: CREATED_JOBS_TTL, with a large value. I checked, and 45000 was sufficient.
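
For anyone applying this, here is a minimal sketch of the change as a Kubernetes strategic-merge patch, built in Python. It only constructs the patch body and does not talk to a cluster. The container name `task-manager` is an assumption (it may differ in your deployment), and `build_env_patch` is a hypothetical helper, not part of hkube. Only the variable name CREATED_JOBS_TTL and the value 45000 come from the comment above.

```python
import json


def build_env_patch(container_name, env_name, env_value):
    """Return a strategic-merge patch that sets one env var on one container.

    Kubernetes requires env values to be strings, so the value is converted.
    """
    return {
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {
                            # Assumed container name; check your deployment spec.
                            "name": container_name,
                            "env": [{"name": env_name, "value": str(env_value)}],
                        }
                    ]
                }
            }
        }
    }


# Value taken from the comment above; container name is an assumption.
patch = build_env_patch("task-manager", "CREATED_JOBS_TTL", 45000)
patch_json = json.dumps(patch)
print(patch_json)
```

The printed JSON could then be passed to `kubectl patch deployment <deployment-name> --patch '<json>'` in the namespace where hkube is installed. Changing the pod template this way triggers a rollout of the deployment, so the new value takes effect once the pod restarts.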