line / centraldogma

Highly-available version-controlled service configuration repository based on Git, ZooKeeper and HTTP/2
https://line.github.io/centraldogma/
Apache License 2.0

Unexpected exception while mirroring occurs after upgrade to 0.66.0 #968

Open earlbread opened 3 weeks ago

earlbread commented 3 weeks ago

Hi team. After upgrading to 0.66.0, I am getting the following error message. This issue seems to be preventing mirroring from working properly. Do you have any idea what is causing the issue and is there anything I can try to fix it?

Thank you.

2024-07-04 08:31:39.543 [WARN ](c.l.c.s.i.m.DefaultMirroringService) [mirroring-worker-11-663] Unexpected exception while mirroring: {schedule=every minute, direction=REMOTE_TO_LOCAL, localProj=xxx, localRepo=xxx, localPath=/, remoteRepo=git+ssh://xxx.git, remotePath=/xxx/ remoteBranch=main, credential=PublicKeyMirrorCredential{id=xxx, hostnamePatterns=[github.com$], username=git, publicKey=ecdsa-sha2-...}}
com.linecorp.centraldogma.server.MirrorException: org.eclipse.jgit.api.errors.TransportException: Expected ACK/NAK, got: shallow 7bbd530a19784634d0dced0acef3fcecf33f5bd5
        at com.linecorp.centraldogma.server.internal.mirror.AbstractMirror.mirror(AbstractMirror.java:171)
        at com.linecorp.centraldogma.server.internal.mirror.MirroringTask.lambda$run$0(MirroringTask.java:64)
        at io.micrometer.core.instrument.composite.CompositeTimer.record(CompositeTimer.java:141)
        at com.linecorp.centraldogma.server.internal.mirror.MirroringTask.run(MirroringTask.java:64)
        at com.linecorp.centraldogma.server.internal.mirror.DefaultMirroringService.run(DefaultMirroringService.java:243)
        at com.linecorp.centraldogma.server.internal.mirror.DefaultMirroringService.lambda$run$6(DefaultMirroringService.java:227)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
        at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
        at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:76)
        at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Unknown Source)
Caused by: org.eclipse.jgit.api.errors.TransportException: Expected ACK/NAK, got: shallow 7bbd530a19784634d0dced0acef3fcecf33f5bd5
        at org.eclipse.jgit.api.FetchCommand.call(FetchCommand.java:249)
        at com.linecorp.centraldogma.server.internal.mirror.AbstractGitMirror.fetchRemoteHeadAndGetCommitId(AbstractGitMirror.java:463)
        at com.linecorp.centraldogma.server.internal.mirror.AbstractGitMirror.mirrorRemoteToLocal(AbstractGitMirror.java:247)
        at com.linecorp.centraldogma.server.internal.mirror.SshGitMirror.mirrorRemoteToLocal(SshGitMirror.java:112)
        at com.linecorp.centraldogma.server.internal.mirror.AbstractMirror.mirror(AbstractMirror.java:162)
        ... 13 common frames omitted
Caused by: org.eclipse.jgit.errors.TransportException: Expected ACK/NAK, got: shallow 7bbd530a19784634d0dced0acef3fcecf33f5bd5
        at org.eclipse.jgit.transport.BasePackFetchConnection.doFetch(BasePackFetchConnection.java:458)
        at org.eclipse.jgit.transport.BasePackFetchConnection.fetch(BasePackFetchConnection.java:351)
        at org.eclipse.jgit.transport.BasePackFetchConnection.fetch(BasePackFetchConnection.java:343)
        at org.eclipse.jgit.transport.FetchProcess.fetchObjects(FetchProcess.java:290)
        at org.eclipse.jgit.transport.FetchProcess.executeImp(FetchProcess.java:182)
        at org.eclipse.jgit.transport.FetchProcess.execute(FetchProcess.java:105)
        at org.eclipse.jgit.transport.Transport.fetch(Transport.java:1482)
        at org.eclipse.jgit.api.FetchCommand.call(FetchCommand.java:238)
        ... 17 common frames omitted
Caused by: org.eclipse.jgit.errors.PackProtocolException: Expected ACK/NAK, got: shallow 7bbd530a19784634d0dced0acef3fcecf33f5bd5
        at org.eclipse.jgit.transport.PacketLineIn.readACK(PacketLineIn.java:163)
        at org.eclipse.jgit.transport.BasePackFetchConnection.negotiate(BasePackFetchConnection.java:945)
        at org.eclipse.jgit.transport.BasePackFetchConnection.doFetch(BasePackFetchConnection.java:447)
        ... 24 common frames omitted
ikhoon commented 3 weeks ago

Which version did you try to upgrade from?

earlbread commented 3 weeks ago

> Which version did you try to upgrade from?

The previous version is 0.58.1 @ikhoon

henry-ahn0 commented 3 weeks ago

I'm looking into it now, but I suspect there may be a bug in jGit itself.

However, I can't yet tell what distinguishes the cases that succeed from the ones that fail.

minwoox commented 3 weeks ago

I have no idea at the moment. @earlbread Does this still happen every time, even after you restart the replicas?

henry-ahn0 commented 3 weeks ago

@minwoox We are running Central Dogma in two different environments, alpha and prod, and the same thing is happening in both. 🤔

We'll try restarting in the alpha environment first.

ikhoon commented 3 weeks ago

Could you create a new mirror config? I want to know whether mirroring also fails for a newly created Git repository, or whether the problem only occurs after upgrading.

earlbread commented 3 weeks ago

Not every git repo has a mirror problem. We are experiencing this issue for a specific git repo.

earlbread commented 3 weeks ago

After restarting, the repos are still experiencing the same issue. @minwoox

henry-ahn0 commented 3 weeks ago

As far as I know, the shallow feature (FetchCommand) was added in this PR:

https://github.com/line/centraldogma/pull/808

The error did not occur in versions before shallow was added.
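For anyone reading along, the message in the stack trace says the reader wanted an ACK or NAK line at that point in the fetch negotiation but received a line starting with `shallow`. The sketch below only illustrates that kind of strict check. It is not JGit's code, and the class and method names (`NegotiationLine`, `classify`, `readAckOrNak`) are made up for the example.

```java
/** Hypothetical illustration only; this is not how JGit is implemented. */
final class NegotiationLine {
    enum Kind { ACK, NAK, SHALLOW, UNSHALLOW, OTHER }

    /** Classifies a single response line by its leading keyword. */
    static Kind classify(String line) {
        if (line.equals("NAK")) {
            return Kind.NAK;
        }
        if (line.startsWith("ACK ")) {
            return Kind.ACK;
        }
        if (line.startsWith("shallow ")) {
            return Kind.SHALLOW;
        }
        if (line.startsWith("unshallow ")) {
            return Kind.UNSHALLOW;
        }
        return Kind.OTHER;
    }

    /** A strict reader that accepts only ACK or NAK and rejects everything else. */
    static Kind readAckOrNak(String line) {
        Kind kind = classify(line);
        if (kind != Kind.ACK && kind != Kind.NAK) {
            throw new IllegalStateException("Expected ACK/NAK, got: " + line);
        }
        return kind;
    }

    public static void main(String[] args) {
        System.out.println(readAckOrNak("NAK"));
        try {
            readAckOrNak("shallow 7bbd530a19784634d0dced0acef3fcecf33f5bd5");
        } catch (IllegalStateException e) {
            // Same shape of message as the one in the report above.
            System.out.println(e.getMessage());
        }
    }
}
```

If the real reader behaves roughly like this, a `shallow` line arriving where it is not expected would be enough to abort the fetch, which would fit the error appearing only after shallow was introduced.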

ikhoon commented 3 weeks ago

If #808 was the cause, you can delete the _mirrors directory as a workaround. The folder will be created automatically if it is absent.
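If you would rather script the cleanup than remove the folder by hand, here is a minimal sketch. It assumes the server is stopped first and that you pass the full path of the `_mirrors` directory yourself, since its location depends on your data directory. `MirrorsCleanup` and `deleteMirrorsDir` are hypothetical names, not part of Central Dogma.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

/** Hypothetical helper; not part of Central Dogma. */
final class MirrorsCleanup {

    /**
     * Recursively deletes the given directory, but only if it is named "_mirrors".
     * Returns false when the directory does not exist, true when it was deleted.
     */
    static boolean deleteMirrorsDir(Path mirrorsDir) throws IOException {
        Path name = mirrorsDir.getFileName();
        if (name == null || !name.toString().equals("_mirrors")) {
            // Guard against deleting the wrong directory by accident.
            throw new IllegalArgumentException("Refusing to delete " + mirrorsDir);
        }
        if (!Files.isDirectory(mirrorsDir)) {
            return false;
        }
        try (Stream<Path> paths = Files.walk(mirrorsDir)) {
            // Reverse order visits children before their parent directories.
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: MirrorsCleanup <path-to-_mirrors>");
            return;
        }
        boolean deleted = deleteMirrorsDir(Path.of(args[0]));
        System.out.println(deleted ? "deleted" : "nothing to delete");
    }
}
```

The name check is only a safety net against passing the wrong path. Taking a backup of the data directory before running anything like this is still a good idea.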

henry-ahn0 commented 3 weeks ago

@ikhoon

OK, we'll delete the _mirrors directory and restart.

ikhoon commented 3 weeks ago

In addition to the workaround, we will handle this issue separately.

earlbread commented 3 weeks ago

The errors disappeared after deleting _mirrors. Thank you! @ikhoon

ikhoon commented 3 weeks ago

Glad to hear it. /metadata.json is located in the meta folder, which should not be affected by removing _mirrors.

henry-ahn0 commented 3 weeks ago

It seems there was a conflict with garbage data left in the _mirrors directory by past versions of Central Dogma. It works normally now! Thank you for your support!

earlbread commented 3 weeks ago

> Glad to hear it. /metadata.json is located in the meta folder, which should not be affected by removing _mirrors.

It was my mistake. I copied _mirrors after creating the /data/tmp directory, and I think that's what caused the problem. I deleted /data/tmp and backed up the data to /data/_tmp and the problem did not occur. Thank you!