fsprojects / pulsar-client-dotnet

Apache Pulsar native client for .NET (C#/F#/VB)
MIT License
303 stars 48 forks source link

Reconnection loop after server PersistenceError #284

Open adrianotr opened 15 hours ago

adrianotr commented 15 hours ago

Hello there! I think I might have found an issue.

The client connection seems to enter a loop trying to reconnect after we receive a PersistenceError from the server, as seen in the first line of the log.

Logs:

"clientCnx(1, LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650)" Received send error from server: PersistenceError : "Cannot determine whether the message is a duplicate at this time"
"clientCnx(1, LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650)" Socket was disconnected exceptionally on writing Send
Connection backgroundTask "LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650" removed
"clientCnx(1, LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650)" requestsMb mailbox has stopped normally
"clientCnx(1, LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650)" operationsMb mailbox has stopped normally
"clientCnx(1, LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650)" sendMb mailbox has stopped normally
"producer(0, , -1) ConnectionHandler" Closed connection to "persistent://stg-auction/auction-pvt/phase-action-submitted-v1" Current state "Ready   clientCnx(1, LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650)" -- Will try again in 100ms
"producer(0, nonprod-use1-191-16442, -1)" not connected, skipping send
"producer(0, nonprod-use1-191-16442, -1)" not connected, skipping send
"producer(0, nonprod-use1-191-16442, -1)" not connected, skipping send
"producer(0, nonprod-use1-191-16442, -1)" not connected, skipping send
Connecting to "Broker (LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650, PhysicalAddress Unspecified/pulsar-proxy.apps-01.npa.us-east-1.kar-services.net:6651)" with maxMessageSize: 5242880
Connected to "Broker (LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650, PhysicalAddress Unspecified/pulsar-proxy.apps-01.npa.us-east-1.kar-services.net:6651)"
"clientCnx(3, LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650)" Connected ProtocolVersion: 19 ServerVersion: "Pulsar Server3.0.5" MaxMessageSize: 5242880
"producer(0, nonprod-use1-191-16442, -1)" starting register to topic "persistent://stg-auction/auction-pvt/phase-action-submitted-v1"
"producer(0, nonprod-use1-191-16442, -1)" registered
"producer(0, nonprod-use1-191-16442, -1)" resending 8 pending messages
"clientCnx(3, LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650)" Socket was disconnected exceptionally on writing Send
Connection backgroundTask "LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650" removed
"clientCnx(3, LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650)" operationsMb mailbox has stopped normally
"clientCnx(3, LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650)" sendMb mailbox has stopped normally
"producer(0, , -1) ConnectionHandler" Closed connection to "persistent://stg-auction/auction-pvt/phase-action-submitted-v1" Current state "Ready   clientCnx(3, LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650)" -- Will try again in 100ms
"clientCnx(3, LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650)" requestsMb mailbox has stopped normally
"producer(0, nonprod-use1-191-16442, -1)" not connected, skipping send
"producer(0, nonprod-use1-191-16442, -1)" not connected, skipping send
"producer(0, nonprod-use1-191-16442, -1)" not connected, skipping send
Connecting to "Broker (LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650, PhysicalAddress Unspecified/pulsar-proxy.apps-01.npa.us-east-1.kar-services.net:6651)" with maxMessageSize: 5242880
Connected to "Broker (LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650, PhysicalAddress Unspecified/pulsar-proxy.apps-01.npa.us-east-1.kar-services.net:6651)"
"producer(0, nonprod-use1-191-16442, -1)" not connected, skipping send
"clientCnx(4, LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650)" Connected ProtocolVersion: 19 ServerVersion: "Pulsar Server3.0.5" MaxMessageSize: 5242880
"producer(0, nonprod-use1-191-16442, -1)" starting register to topic "persistent://stg-auction/auction-pvt/phase-action-submitted-v1"
"producer(0, nonprod-use1-191-16442, -1)" registered
"producer(0, nonprod-use1-191-16442, -1)" resending 12 pending messages
"clientCnx(4, LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650)" Socket was disconnected exceptionally on writing Send
Connection backgroundTask "LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650" removed
"producer(0, , -1) ConnectionHandler" Closed connection to "persistent://stg-auction/auction-pvt/phase-action-submitted-v1" Current state "Ready   clientCnx(4, LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650)" -- Will try again in 100ms
"clientCnx(4, LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650)" requestsMb mailbox has stopped normally
"clientCnx(4, LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650)" sendMb mailbox has stopped normally
"clientCnx(4, LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650)" operationsMb mailbox has stopped normally
"producer(0, nonprod-use1-191-16442, -1)" not connected, skipping send
Connecting to "Broker (LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650, PhysicalAddress Unspecified/pulsar-proxy.apps-01.npa.us-east-1.kar-services.net:6651)" with maxMessageSize: 5242880
"producer(0, nonprod-use1-191-16442, -1)" not connected, skipping send
Connected to "Broker (LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650, PhysicalAddress Unspecified/pulsar-proxy.apps-01.npa.us-east-1.kar-services.net:6651)"
"clientCnx(5, LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650)" Connected ProtocolVersion: 19 ServerVersion: "Pulsar Server3.0.5" MaxMessageSize: 5242880
"producer(0, nonprod-use1-191-16442, -1)" starting register to topic "persistent://stg-auction/auction-pvt/phase-action-submitted-v1"
"producer(0, nonprod-use1-191-16442, -1)" registered
"producer(0, nonprod-use1-191-16442, -1)" resending 14 pending messages
"clientCnx(5, LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650)" Socket was disconnected exceptionally on writing Send
Connection backgroundTask "LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650" removed
"producer(0, , -1) ConnectionHandler" Closed connection to "persistent://stg-auction/auction-pvt/phase-action-submitted-v1" Current state "Ready   clientCnx(5, LogicalAddres Unspecified/pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local:6650)" -- Will try again in 100ms

and the reconnection attempts continue.

Eventually the producer times out.

Lanayx commented 14 hours ago

@adrianotr It seems like the exception is missing, so it's not clear what goes wrong, can you please check why exception is not logged? Here is the logging line

This is the recommended logging setup (change Warning level to Debug):

https://github.com/fsprojects/pulsar-client-dotnet/blob/18c68d4aad5d810e2abfa4b706bb555bd6df6d62/tests/IntegrationTests/Common.fs#L36-L50