kruize / autotune

Autonomous Performance Tuning for Kubernetes!

Parallel request to bulk service failed with a few jobs in IN_PROGRESS STATE #1374

Open chandrams opened 1 week ago


Describe the bug

The following issues were observed with parallel requests to the bulk service:

How to reproduce it

Expected behavior

Relevant logs

{"status":"IN_PROGRESS","total_experiments":1,"processed_experiments":1,"message":null,"job_id":"f918009a-2e3d-486a-ad22-aa9204ae8c0b","job_start_time":"2024-11-13T08:19:17.990Z","job_end_time":null}

Unprocessed experiments

{
    "status": "COMPLETED",
    "total_experiments": 1,
    "processed_experiments": 1,
    "notifications": null,
    "experiments": {
        "thanos|default|msc-0|tfb-qrh-sample-0(deployment)|tfb-0": {
            "recommendations": {
                "status": "UNPROCESSED",
                "notifications": null
            }
        }
    },
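The fragment above shows a job reported as `COMPLETED` whose only experiment still has a recommendation status of `UNPROCESSED`. A rough sketch for listing such experiments from a status payload follows. The function name `unprocessed_experiments` is hypothetical, and the sample dictionary only mirrors the fields visible in the truncated fragment; any fields that follow in the real response are not represented.

```python
from typing import List


def unprocessed_experiments(job: dict) -> List[str]:
    """Collect names of experiments whose recommendations are marked UNPROCESSED.

    Hypothetical helper; it tolerates missing or null sections in the payload.
    """
    names = []
    for name, details in (job.get("experiments") or {}).items():
        recommendations = (details or {}).get("recommendations") or {}
        if recommendations.get("status") == "UNPROCESSED":
            names.append(name)
    return names


# Sample built only from the fields shown in the fragment above.
job = {
    "status": "COMPLETED",
    "total_experiments": 1,
    "processed_experiments": 1,
    "notifications": None,
    "experiments": {
        "thanos|default|msc-0|tfb-qrh-sample-0(deployment)|tfb-0": {
            "recommendations": {"status": "UNPROCESSED", "notifications": None}
        }
    },
}

stuck = unprocessed_experiments(job)
print(len(stuck), "unprocessed experiment(s) in a job marked", job["status"])
```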

Environment:

Additional context