goharbor / harbor

An open source trusted cloud native registry project that stores, signs, and scans content.
https://goharbor.io
Apache License 2.0

Harbor API showing inconsistent data for repositories, including pull/push issues #20725

Open geolney opened 1 month ago

geolney commented 1 month ago

We deployed Harbor with Helm chart 1.12.1 on an EKS 1.28 cluster. It uses a PostgreSQL database (postgresql-ha, Helm chart version 12.0.4) and S3 to store data and images, and is served via the harbor-nginx pod. For some time we have been seeing many image pull errors, so workloads take a long time to provision: a pod can only start once a pull finally succeeds.

ERROR: Job failed: prepare environment: waiting for pod running: pulling image "<harbor-instance>/project/repo:v1.1.7": image pull failed: rpc error: code = NotFound desc = failed to pull and unpack image "<harbor-instance>/project/repo:v1.1.7": failed to resolve reference "<harbor-instance>/project/repo:v1.1.7": <harbor-instance>/project/repo:v1.1.7: not found.

When we access the Harbor UI or call the API for the same project, we see inconsistencies that we believe are related. <harbor>/api/v2.0/projects/<repo>/repositories/fef/artifacts? returns different results at different times. For example, the image tags returned were:

1st call: v1.1.1, v1.2.1, v1.2.2, v1.2.7
2nd call: v1.1.1, v1.2.7, v1.2.8, v1.3.1, v1.3.2, v1.4.0

Some tags from the first call are missing from the second, and new ones appear.
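To make the flapping easier to report, the repeated calls can be scripted. The following is only a rough sketch, not a Harbor tool: `extract_tags`, `compare_runs`, and `poll` are hypothetical helper names, and it assumes each artifact in the JSON response carries a `tags` list of objects with a `name` field. It also assumes the endpoint is reachable without authentication (a private project would need an `Authorization` header) and that the URL carries a `page_size` large enough to cover the repository, since list responses are paginated.

```python
import json
import urllib.request


def extract_tags(artifacts):
    """Collect the tag names from one artifacts response (a list of dicts)."""
    tags = set()
    for artifact in artifacts:
        # Assumption: untagged artifacts have "tags" set to null or missing.
        for tag in artifact.get("tags") or []:
            tags.add(tag["name"])
    return tags


def compare_runs(runs):
    """Split tags into those seen in every run and those seen only sometimes."""
    seen = set().union(*runs)
    stable = set.intersection(*runs) if runs else set()
    return {"seen": seen, "stable": stable, "flapping": seen - stable}


def poll(url, attempts=10):
    """Call the same artifacts URL several times and compare the tag sets."""
    runs = []
    for _ in range(attempts):
        with urllib.request.urlopen(url, timeout=10) as resp:
            runs.append(extract_tags(json.load(resp)))
    return compare_runs(runs)
```

Calling `poll()` with the full artifacts URL and printing the sorted `flapping` set gives a list of tags that appeared in some responses but not all. An empty set would suggest the responses were consistent for that run.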

These images do not come through a proxy; they are built in a pipeline and then pushed via docker. Some of those pushes also fail with an "unknown blob" error, which may be related.

We are not entirely sure at which point this is failing. The images and data are clearly present in S3 and Postgres: after several refreshes we can eventually see all ~20 images, though not all at once. Does anyone have a suggestion, or has anyone else experienced this?

Happy to provide any info, logs, screenshots etc. Thanks in advance!

wy65701436 commented 1 month ago

It seems that the requests were served by different databases on your side. Can you confirm that the data is consistent across your database instances?