streamnative / pulsar-flink

Elastic data processing with Apache Pulsar and Apache Flink
Apache License 2.0
278 stars 119 forks source link

[QUESTION] Error after running for few hours #117

Closed rec7y33 closed 2 years ago

rec7y33 commented 4 years ago

Hello, I am facing a problem on pulsar flink where this NoClassDefFoundError: org/apache/pulsar/shade/com/yahoo/sketches/quantiles/DoublesAuxiliary exception will throw after it had run for some time. The job will run again if I restarted it, but it will randomly stop after few hours again, with the same exception. Any idea what caused it?

Thanks!

2020-08-11 23:10:27,150 WARN  org.apache.pulsar.shade.io.netty.util.HashedWheelTimer        - An exception was thrown by TimerTask.
java.lang.NoClassDefFoundError: org/apache/pulsar/shade/com/yahoo/sketches/quantiles/DoublesAuxiliary
    at org.apache.pulsar.shade.com.yahoo.sketches.quantiles.DoublesSketch.constructAuxiliary(DoublesSketch.java:607)
    at org.apache.pulsar.shade.com.yahoo.sketches.quantiles.DoublesSketch.getQuantiles(DoublesSketch.java:233)
    at org.apache.pulsar.client.impl.ProducerStatsRecorderImpl.lambda$init$0(ProducerStatsRecorderImpl.java:129)
    at org.apache.pulsar.shade.io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:672)
    at org.apache.pulsar.shade.io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:747)
    at org.apache.pulsar.shade.io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:472)
    at org.apache.pulsar.shade.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.lang.Thread.run(Thread.java:748)
2020-08-11 23:10:27,150 INFO  org.apache.pulsar.client.impl.ConnectionHandler               - [persistent://public/flink_u/occupancy-data-branch-19467] [pulsar-cluster-1-360-22497] Reconnecting after timeout
2020-08-11 23:10:27,150 WARN  org.apache.pulsar.client.impl.ConnectionHandler               - [persistent://public/flink_u/occupancy-data-branch-18499] [pulsar-cluster-1-362-42067] Error connecting to broker: org.apache.pulsar.client.api.PulsarClientException: Connection already closed
2020-08-11 23:10:27,150 INFO  org.apache.pulsar.client.impl.ConnectionHandler               - [persistent://public/flink_u/occupancy-data-branch-20293] [pulsar-cluster-1-314-36923] Reconnecting after connection was closed
2020-08-11 23:10:27,150 INFO  org.apache.pulsar.client.impl.ConnectionHandler               - [persistent://public/flink_u/occupancy-data-branch-18464] [pulsar-cluster-1-360-22500] Reconnecting after timeout
2020-08-11 23:10:27,150 WARN  org.apache.pulsar.client.impl.ConnectionHandler               - [persistent://public/flink_u/occupancy-data-branch-20293] [pulsar-cluster-1-314-36923] Exception thrown while getting connection: 
java.lang.NoClassDefFoundError: org/apache/pulsar/shade/com/google/common/cache/RemovalCause
    at org.apache.pulsar.shade.com.google.common.cache.LocalCache$Segment.expireEntries(LocalCache.java:2593)
    at org.apache.pulsar.shade.com.google.common.cache.LocalCache$Segment.tryExpireEntries(LocalCache.java:2574)
    at org.apache.pulsar.shade.com.google.common.cache.LocalCache$Segment.getLiveValue(LocalCache.java:2714)
    at org.apache.pulsar.shade.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2030)
    at org.apache.pulsar.shade.com.google.common.cache.LocalCache.get(LocalCache.java:3951)
    at org.apache.pulsar.shade.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3973)
    at org.apache.pulsar.shade.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4957)
    at org.apache.pulsar.common.naming.TopicName.get(TopicName.java:88)
    at org.apache.pulsar.client.impl.PulsarClientImpl.getConnection(PulsarClientImpl.java:606)
    at org.apache.pulsar.client.impl.ConnectionHandler.grabCnx(ConnectionHandler.java:67)
    at org.apache.pulsar.client.impl.ConnectionHandler.lambda$reconnectLater$1(ConnectionHandler.java:101)
    at org.apache.pulsar.shade.io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:672)
    at org.apache.pulsar.shade.io.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:747)
    at org.apache.pulsar.shade.io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:472)
    at org.apache.pulsar.shade.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.lang.Thread.run(Thread.java:748)
jianyun8023 commented 4 years ago

Please provide the following version information.

rec7y33 commented 4 years ago

do you mean the pulsar-flink connector version?

        <dependency>
            <groupId>io.streamnative.connectors</groupId>
            <artifactId>pulsar-flink-connector_2.11</artifactId>
            <version>2.4.29-SNAPSHOT</version>
        </dependency>
jianyun8023 commented 4 years ago

Well, what is the version of pulsar and flink you are using?

rec7y33 commented 4 years ago

Pulsar server 2.5.2 flink server 1.10.1

then the dependencies in jar file:

        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-streaming-java_2.12</artifactId>
            <version>1.10.1</version>
            <type>jar</type>
        </dependency>

        <dependency>
            <groupId>org.apache.pulsar</groupId>
            <artifactId>pulsar-client</artifactId>
            <version>2.5.2</version>
            <type>jar</type>
        </dependency>
syhily commented 2 years ago

@rec7y33 This issue should be solved after upgrading to the latest connector. Can you confirm it?