kube-HPC / hkube

🐟 High Performance Computing over Kubernetes - Core Repo 🎣
http://hkube.io
MIT License

Algorithm queue fails to recover #1377

tamir321 closed 2 years ago

tamir321 commented 3 years ago

**HKube micro-service**
algorithm queue

**Describe the bug**
The algorithm queue fails to recover its state, and the pipeline gets stuck.

**To Reproduce**


{
    "name": "alg-17-pipe",
    "nodes": [
        {
            "nodeName": "alg",
            "algorithmName": "alg-17",
            "input": [
                "#[0...3000]"
            ],
            "kind": "algorithm"
        }
    ],
    "options": {
        "batchTolerance": 100,
        "progressVerbosityLevel": "debug",
        "ttl": 3600
    },
    "kind": "batch",
    "experimentName": "main",
    "priority": 3
}
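
For anyone who wants to script the reproduction, the descriptor above can be generated and serialized from Python. This is only a minimal sketch: the helper name `build_pipeline` and its `upper` parameter are hypothetical and not part of HKube, and how the resulting JSON is submitted (REST API, CLI, or dashboard) is deliberately left out because the issue does not say.

```python
import json


def build_pipeline(upper: int = 3000) -> dict:
    """Return the pipeline descriptor from this issue as a plain dict.

    `upper` (hypothetical parameter) only changes the number written into
    the batch input string, so a smaller batch can be tried first.
    """
    return {
        "name": "alg-17-pipe",
        "nodes": [
            {
                "nodeName": "alg",
                "algorithmName": "alg-17",
                # Batch input string copied from the issue; only the upper
                # bound is parameterized.
                "input": [f"#[0...{upper}]"],
                "kind": "algorithm",
            }
        ],
        "options": {
            "batchTolerance": 100,
            "progressVerbosityLevel": "debug",
            "ttl": 3600,
        },
        "kind": "batch",
        "experimentName": "main",
        "priority": 3,
    }


if __name__ == "__main__":
    # Print the descriptor so it can be pasted or piped into whatever
    # submission tool is in use.
    print(json.dumps(build_pipeline(), indent=4))
```

Calling `build_pipeline()` with no arguments yields the same descriptor as posted above; passing a smaller `upper` may help check whether the stall depends on batch size.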

tamir321 commented 2 years ago

Tested on systemVersion: "v2.1.88", fullSystemVersion: "v2.1.88-1631616917808".