goharbor / harbor

An open source trusted cloud native registry project that stores, signs, and scans content.
https://goharbor.io
Apache License 2.0
23.75k stars 4.73k forks

Bi-directional push replication causes a replication loop #17754

Closed rci-kmccolm closed 1 year ago

rci-kmccolm commented 1 year ago

If you are reporting a problem, please make sure the following information is provided:

Expected behavior and actual behavior: We have two Harbor instances at different sites. We want to keep them in sync with each other and put them behind Anycast or GSLB-style load balancing and failover. When we create a push-based replication rule between the two Harbor sites, any change at one site is replicated to the remote site, but the remote site then replicates the same change back to the original site, and so on. This creates a continuous loop of replication traffic that drives unnecessary CPU, disk, and network utilization on the Harbor servers.

We expect that when a Harbor instance receives a replication event from a remote Harbor instance, it does not immediately push the change back to the instance that sent it, even if a replication rule exists to do so.

Perhaps this is not the intended architecture for Harbor replication. If so, please guide us.

Steps to reproduce the problem:

  1. Create two Harbor instances
  2. Create a registry endpoint on each Harbor instance, pointing to the other
  3. Create a push-based replication rule on each Harbor instance to push updates to the other
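
To make the reported behavior concrete, the setup in the steps above can be modeled as a toy simulation. This is only a sketch of the behavior described in this issue, not Harbor code: the site names, the artifact name, and the `simulate` function are hypothetical, and the model assumes that every received push fires the receiving site's own event-based rule.

```python
from collections import deque


def simulate(max_pushes=10):
    """Toy model of two sites that each push every change to the other.

    Assumption: a push received by a site is treated like a local upload,
    so it triggers that site's own replication rule.
    """
    peers = {"site-a": "site-b", "site-b": "site-a"}  # hypothetical names
    # One initial upload on site-a starts the chain.
    events = deque([("site-a", "mychart:1.0")])
    pushes = []
    # Without the cap, this loop would never terminate.
    while events and len(pushes) < max_pushes:
        site, artifact = events.popleft()
        dest = peers[site]
        pushes.append((site, dest, artifact))
        # The destination sees the push as a new event and replicates it back.
        events.append((dest, artifact))
    return pushes


if __name__ == "__main__":
    for src, dst, artifact in simulate(6):
        print(f"{src} -> {dst}: {artifact}")
```

In this model the event queue never drains, so the only thing that stops the run is the `max_pushes` cap.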

Versions: Please specify the versions of following systems.

Additional context:

zyyw commented 1 year ago

Please double check this on Harbor 2.6.1, @YangJiao0817. We will close it if it is not reproducible.

rci-kmccolm commented 1 year ago

Issue persists on Harbor v2.7.1-6015b3ef

rci-kmccolm commented 1 year ago

Some more details: this seems related to Helm chart replication. When a Helm chart is uploaded via the UI, Harbor creates thousands of replication tasks to all destinations for the same chart. All of them log a message similar to the one below and show a status of 'InProgress' in the UI:

'{"code":10010,"message":"object is not found","details":"c17e2256bf16f94358231b05"}'

rci-kmccolm commented 1 year ago

The above comment may be a red herring; it referred to a job that was manually deleted from the database. From what I can see, when using event-based replication with the 'override' option set to true, uploading a Helm chart via the UI sends replication into an endless loop, because both Harbor instances keep replicating the same chart back and forth to each other:

2023-04-21T14:07:43Z [WARNING] [/controller/replication/transfer/chart/transfer.go:153]: the same name chart <chart_name_redacted> exists on the destination registry and the "override" is set to true, continue...

The above log message appears on both Harbor instances, and each instance keeps creating new replication jobs for the same chart towards the other, without end.

I believe the chart replication job should be updated to check whether the incoming replication came from another Harbor, so that it is not replicated back to that same Harbor.
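
A minimal sketch of the check proposed above, as a toy model and not Harbor's implementation. It assumes each push event could carry the registry it was replicated from; the `origin` field, `PushEvent`, `should_replicate`, and the site names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PushEvent:
    artifact: str
    # Hypothetical field: the registry that replicated this artifact here,
    # or None when a user uploaded it directly.
    origin: Optional[str] = None


def should_replicate(event: PushEvent, destination: str) -> bool:
    """Skip the push when the destination is where the artifact came from."""
    return event.origin != destination


def run(max_pushes=10):
    peers = {"site-a": "site-b", "site-b": "site-a"}  # hypothetical names
    pending = [("site-a", PushEvent("mychart:1.0"))]  # direct upload on site-a
    pushes = []
    while pending and len(pushes) < max_pushes:
        site, event = pending.pop(0)
        dest = peers[site]
        if not should_replicate(event, dest):
            continue  # do not send the artifact back to its sender
        pushes.append((site, dest, event.artifact))
        # The destination records which site it received the artifact from.
        pending.append((dest, PushEvent(event.artifact, origin=site)))
    return pushes


if __name__ == "__main__":
    print(run())
```

With the origin recorded, the model performs a single push from site-a to site-b and then stops, because site-b sees that the artifact came from site-a.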

A valid workaround is to disable the 'override' option on event-based replication rules wherever replication is bi-directional. The downside is that if someone uploads a new artifact under an existing name/tag, it will not be replicated to the remote Harbor.
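
The trade-off of this workaround can be illustrated with a small model. This is a sketch under assumptions, not Harbor's logic: it assumes that with 'override' disabled, a push is skipped whenever the same name/tag already exists on the destination, as the log message and the workaround above suggest. The `replicate` and `demo` functions and the digest strings are hypothetical.

```python
def replicate(src, dst, key, override):
    """Copy src[key] to dst; return True if dst was written.

    Assumption: a write on dst is what would fire dst's own push event.
    """
    if key in dst and not override:
        return False  # same name/tag already present: skip, no new event
    dst[key] = src[key]
    return True


def demo():
    site_a, site_b = {}, {}
    key = "mychart:1.0"

    site_a[key] = "digest-v1"  # first upload on site A
    first = replicate(site_a, site_b, key, override=False)

    # Site B's rule fires, but site A already has the tag, so the loop ends.
    echo = replicate(site_b, site_a, key, override=False)

    site_a[key] = "digest-v2"  # re-upload under the same name/tag
    second = replicate(site_a, site_b, key, override=False)

    return first, echo, second, site_b[key]


if __name__ == "__main__":
    print(demo())
```

In this model the echo push is skipped, which breaks the loop, but the re-uploaded content is skipped as well, so site B keeps the older digest.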

github-actions[bot] commented 1 year ago

This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.

rci-kmccolm commented 1 year ago

Issue is still relevant.

YangJiao0817 commented 1 year ago

Chartmuseum was deprecated in v2.8.0. You can migrate Helm charts to OCI artifacts to solve this problem; refer to https://github.com/goharbor/harbor/wiki/Migrate-helm-chart-to-oci-registry-in-harbor