apache / pulsar

Apache Pulsar - distributed pub-sub messaging system
https://pulsar.apache.org/
Apache License 2.0

Flaky-test: ClusterMigrationTest.testClusterMigrationWithReplicationBacklog #20375

Open lhotari opened 1 year ago

lhotari commented 1 year ago


Example failure

https://github.com/apache/pulsar/actions/runs/5056774399/jobs/9075132532?pr=20372#step:10:1379

Exception stacktrace

Error:  Tests run: 19, Failures: 1, Errors: 0, Skipped: 3, Time elapsed: 1,045.505 s <<< FAILURE! - in org.apache.pulsar.broker.service.ClusterMigrationTest
  Error:  org.apache.pulsar.broker.service.ClusterMigrationTest.testClusterMigrationWithReplicationBacklog[false, Key_Shared](18)  Time elapsed: 55.544 s  <<< FAILURE!
  java.lang.AssertionError: expected [2] but found [1]
    at org.testng.Assert.fail(Assert.java:110)
    at org.testng.Assert.failNotEquals(Assert.java:1577)
    at org.testng.Assert.assertEqualsImpl(Assert.java:149)
    at org.testng.Assert.assertEquals(Assert.java:131)
    at org.testng.Assert.assertEquals(Assert.java:1418)
    at org.testng.Assert.assertEquals(Assert.java:1382)
    at org.testng.Assert.assertEquals(Assert.java:1428)
    at org.apache.pulsar.broker.service.ClusterMigrationTest.testClusterMigrationWithReplicationBacklog(ClusterMigrationTest.java:463)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at org.testng.internal.invokers.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:139)
    at org.testng.internal.invokers.InvokeMethodRunnable.runOne(InvokeMethodRunnable.java:47)
    at org.testng.internal.invokers.InvokeMethodRunnable.call(InvokeMethodRunnable.java:76)
    at org.testng.internal.invokers.InvokeMethodRunnable.call(InvokeMethodRunnable.java:11)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:833)
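
An `expected [2] but found [1]` failure that only shows up intermittently often means the assertion read a value that was still converging. The sketch below shows one common way to harden such a check: poll until the value matches or a deadline passes. It is an illustration under assumptions, not the change made for this issue; `EventuallyAssert`, `awaitEquals`, and the `observed` counter are hypothetical names, and the asserted value in `ClusterMigrationTest` is not identified in this thread.

```java
import java.time.Duration;
import java.util.Objects;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical helper (not part of Pulsar or TestNG): retries a comparison
// instead of asserting once on a value that may still be changing.
final class EventuallyAssert {
    private EventuallyAssert() {}

    /**
     * Polls {@code actual} until it equals {@code expected} or {@code timeout} elapses.
     * Returns true on a match and false if the deadline passes first.
     */
    static <T> boolean awaitEquals(T expected, Supplier<T> actual,
                                   Duration timeout, Duration pollInterval)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (Objects.equals(expected, actual.get())) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(pollInterval.toMillis());
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a value updated asynchronously: it starts at 1
        // and reaches 2 only after a short delay.
        AtomicInteger observed = new AtomicInteger(1);
        Thread background = new Thread(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            observed.incrementAndGet();
        });
        background.start();

        // An immediate assertEquals(observed.get(), 2) would race with the
        // background thread; polling tolerates the delay.
        boolean matched = awaitEquals(2, observed::get,
                Duration.ofSeconds(5), Duration.ofMillis(20));
        background.join();
        if (!matched) {
            throw new AssertionError("expected [2] but found [" + observed.get() + "]");
        }
        System.out.println("value reached 2");
    }
}
```

In a real test a library such as Awaitility would normally replace the hand-written loop; the plain-Java version is used here only to keep the example dependency-free.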


vineeth1995 commented 1 year ago

Fix for this issue: https://github.com/apache/pulsar/pull/20379

lhotari commented 1 year ago

Another recent failure, from a build that contains the #20379 changes:

  Error:  Tests run: 25, Failures: 1, Errors: 0, Skipped: 11, Time elapsed: 928.429 s <<< FAILURE! - in org.apache.pulsar.broker.service.ClusterMigrationTest
  Error:  org.apache.pulsar.broker.service.ClusterMigrationTest.testClusterMigrationWithReplicationBacklog[false, Key_Shared](14)  Time elapsed: 92.04 s  <<< FAILURE!
  org.apache.pulsar.client.api.PulsarClientException$TimeoutException: The producer cluster1-1 can not send message to the topic persistent://pulsar/migrationNs/migrationTopic-086a8557-33f1-4e52-a04c-1cd285a00753 within given timeout : createdAt 30.002 seconds ago, firstSentAt 0.0 seconds ago, lastSentAt 0.0 seconds ago, retryCount 0
    at org.apache.pulsar.client.api.PulsarClientException.unwrap(PulsarClientException.java:1043)
    at org.apache.pulsar.client.impl.TypedMessageBuilderImpl.send(TypedMessageBuilderImpl.java:90)
    at org.apache.pulsar.client.impl.ProducerBase.send(ProducerBase.java:62)
    at org.apache.pulsar.broker.service.ClusterMigrationTest.testClusterMigrationWithReplicationBacklog(ClusterMigrationTest.java:461)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:568)
    at org.testng.internal.invokers.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:139)
    at org.testng.internal.invokers.InvokeMethodRunnable.runOne(InvokeMethodRunnable.java:47)
    at org.testng.internal.invokers.InvokeMethodRunnable.call(InvokeMethodRunnable.java:76)
    at org.testng.internal.invokers.InvokeMethodRunnable.call(InvokeMethodRunnable.java:11)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:833)

The detailed Surefire logs are at https://gist.github.com/lhotari/e0cef84147402309bb25d81c06522500

@vineeth1995, could you take another look at this flaky test? Thanks.

vineeth1995 commented 1 year ago

Hi,

I will fix it. Thanks.

Regards, Vineeth

(In reply to Lari Hotari's comment of Wed, Jun 7, 2023, 9:43 AM, quoted above.)

vineeth1995 commented 1 year ago

PR for the fix: https://github.com/apache/pulsar/pull/20546/files

github-actions[bot] commented 1 year ago

The issue had no activity for 30 days; marked with the Stale label.