goharbor / harbor

An open source trusted cloud native registry project that stores, signs, and scans content.
https://goharbor.io
Apache License 2.0

Garbage collection job is constantly in pending status #20624

Closed: petr202261 closed this issue 5 months ago

petr202261 commented 5 months ago

When I check the Harbor Job Service Dashboard, I see that a garbage collection job is constantly in pending status. If I stop the job, a new job appears in pending status again within several hours. I have a very large database; the quota used is 10 TiB. I see the following errors related to the garbage collector:

[GARBAGE_COLLECTION] failed to batch clean candidates, error: failed to delete executions: ERROR: update or delete on table "execution" violates foreign key constraint "task_execution_id_fkey" on table "task" (SQLSTATE 23503)

[GARBAGE_COLLECTION] failed to run sweep, error: failed to delete executions: ERROR: update or delete on table "execution" violates foreign key constraint "task_execution_id_fkey" on table "task" (SQLSTATE 23503)
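
For readers unfamiliar with SQLSTATE 23503, the sketch below reproduces the same class of error in an in-memory SQLite database. It is only an illustration under stated assumptions: the two-column tables are hypothetical stand-ins that borrow the names "execution" and "task" from the log lines, not Harbor's real schema, and the child-first delete shown here is generic SQL behavior, not the fix in the linked PR or the linked workaround.

```python
import sqlite3


def demo_fk_violation():
    """Show that a parent row cannot be deleted while a child row references it."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    # Hypothetical minimal tables; names are borrowed from the error message only.
    conn.execute("CREATE TABLE execution (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE task ("
        " id INTEGER PRIMARY KEY,"
        " execution_id INTEGER NOT NULL REFERENCES execution(id))"
    )
    conn.execute("INSERT INTO execution (id) VALUES (1)")
    conn.execute("INSERT INTO task (id, execution_id) VALUES (10, 1)")
    conn.commit()

    try:
        # Deleting the parent first fails because task 10 still points at it.
        conn.execute("DELETE FROM execution WHERE id = 1")
        parent_first_ok = True
    except sqlite3.IntegrityError:
        parent_first_ok = False
        conn.rollback()

    # Deleting the child rows first lets the parent delete succeed.
    conn.execute("DELETE FROM task WHERE execution_id = 1")
    conn.execute("DELETE FROM execution WHERE id = 1")
    conn.commit()

    remaining = conn.execute("SELECT COUNT(*) FROM execution").fetchone()[0]
    conn.close()
    return parent_first_ok, remaining


if __name__ == "__main__":
    print(demo_fk_violation())
```

The function returns a pair: whether the parent-first delete succeeded, and how many execution rows remain after the child-first delete.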

These errors repeat every hour in the logs. The garbage collection schedule is "0 0 4 *". Harbor is running in EKS, and the database is AWS RDS.

I also see errors like these in the API responses:

{
  "creation_time": "2024-06-17T01:30:00.726Z",
  "id": 13477493,
  "job_kind": "SCHEDULE",
  "job_name": "GARBAGE_COLLECTION",
  "job_parameters": "{\"delete_untagged\":false,\"dry_run\":false,\"redis_url_reg\":\"redis://harbor.ck4irk.ng.0001.use2.cache.amazonaws.com:6379/2?idle_timeout_seconds=30\",\"time_window\":2,\"workers\":5}",
  "job_status": "Error",
  "schedule": {
    "next_scheduled_time": "0001-01-01T00:00:00.000Z",
    "type": "Schedule"
  },
  "update_time": "2024-06-18T02:05:09.000Z"
},
{
  "creation_time": "2024-06-14T01:30:00.931Z",
  "id": 13410933,
  "job_kind": "SCHEDULE",
  "job_name": "GARBAGE_COLLECTION",
  "job_parameters": "{\"delete_untagged\":false,\"dry_run\":false,\"freed_space\":2667567179994,\"purged_blobs\":46616,\"purged_manifests\":12189,\"redis_url_reg\":\"redis://harbor.ck4irk.ng.0001.use2.cache.amazonaws.com:6379/2?idle_timeout_seconds=30\",\"time_window\":2,\"workers\":5}",
  "job_status": "Error",
  "schedule": {
    "next_scheduled_time": "0001-01-01T00:00:00.000Z",
    "type": "Schedule"
  },
  "update_time": "2024-06-18T11:44:12.846Z"
},
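
Because "job_parameters" is itself a JSON string nested inside each record, the records above are easier to compare once that field is decoded. The following is a minimal sketch, assuming records shaped like the ones quoted above; the function name summarize_gc_history is hypothetical, and the sample data is trimmed (the "redis_url_reg" value is omitted for brevity).

```python
import json


def summarize_gc_history(records):
    """Decode the nested job_parameters string and pull out a few fields."""
    summary = []
    for rec in records:
        # job_parameters is a JSON document encoded as a string.
        params = json.loads(rec.get("job_parameters") or "{}")
        summary.append(
            {
                "id": rec["id"],
                "status": rec["job_status"],
                # These keys are absent when the run reported no results.
                "freed_bytes": params.get("freed_space"),
                "purged_blobs": params.get("purged_blobs"),
                "purged_manifests": params.get("purged_manifests"),
            }
        )
    return summary


# Trimmed copies of the two records quoted in the issue.
sample = [
    {
        "id": 13477493,
        "job_status": "Error",
        "job_parameters": '{"delete_untagged":false,"dry_run":false,'
        '"time_window":2,"workers":5}',
    },
    {
        "id": 13410933,
        "job_status": "Error",
        "job_parameters": '{"delete_untagged":false,"dry_run":false,'
        '"freed_space":2667567179994,"purged_blobs":46616,'
        '"purged_manifests":12189,"time_window":2,"workers":5}',
    },
]

if __name__ == "__main__":
    for row in summarize_gc_history(sample):
        print(row)
```

Decoded this way, the second record shows freed space and purged counts even though its status is "Error", while the first record carries no result fields at all.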

Please help me solve this issue.

MinerYang commented 5 months ago

This is a known issue that is fixed by this PR: https://github.com/goharbor/harbor/pull/20603 cc @chlins. Here is a workaround: https://github.com/goharbor/harbor/issues/19494#issuecomment-1803750308

petr202261 commented 5 months ago

Thank you very much.