[X] I searched in the issues and found nothing similar.
Version
pulsar-2.10, 2.11
Minimal reproduce step
Sometime Broker tries to recover the topic cursor and couldn't create subscription successfully which causes consumer creation stuck at broker and client application times out and fails to subscribe on topic.
Broker log
05:50:28.411 [pulsar-io-4-32] INFO org.apache.pulsar.broker.service.ServerCnx - [/1.1.1.1:38126] Subscribing on topic persistent://pulsar/use1/myns/topic-1 / my-sub
05:50:28.411 [pulsar-io-4-32] INFO org.apache.bookkeeper.mledger.impl.ManagedCursorImpl - [pulsar/use1/myns/persistent/topic-1] Cursor my-sub recovered to position 52505149:179
05:50:28.412 [pulsar-io-4-32] INFO org.apache.bookkeeper.mledger.impl.ManagedCursorImpl - [pulsar/use1/myns/persistent/topic-1] Consumer my-sub cursor initialized with counters: consumed 0 mdPos 52505149:179 rdPos 52505149:180
05:50:28.412 [pulsar-io-4-32] INFO org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [pulsar/use1/myns/persistent/topic-1] Creating ledger, metadata: {component=[109, 97, 110, 97, 103, 101, 100, 45, 108, 101, 100, 103, 101, 114], pulsar/managed-ledger=[112, 117, 108, 115, 97, 114, 47, 112, 115, 116, 103, 45, 103, 113, 49, 47, 112, 115, 116, 103, 49, 45, 98, 114, 111, 107, 101, 114, 50, 46, 109, 101, 115, 115, 97, 103, 105, 110, 103, 46, 103, 113, 49, 46, 121, 97, 104, 111, 111, 46, 99, 111, 109, 58, 52, 48, 56, 48, 47, 112, 101, 114, 115, 105, 115, 116, 101, 110, 116, 47, 115, 97, 110, 105, 116, 121, 84, 101, 115, 116, 84, 111, 112, 105, 99, 45, 49], pulsar/cursor=[115, 97, 110, 105, 116, 121, 83, 117, 98], application=[112, 117, 108, 115, 97, 114]} - metadata ops timeout : 60 seconds
05:50:28.421 [main-EventThread] INFO org.apache.bookkeeper.mledger.impl.ManagedCursorImpl - [pulsar/use1/myns/persistent/topic-1] Created ledger 52506173 for cursor my-sub
05:50:29.860 [bookkeeper-ml-scheduler-OrderedScheduler-5-0] INFO org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [pulsar/use1/myns/persistent/topic-1] Ledger already created when timeout task is triggered
05:51:28.412 [bookkeeper-ml-scheduler-OrderedScheduler-2-0] INFO org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [pulsar/use1/myns/persistent/topic-1] Ledger already created when timeout task is triggered
05:51:40.443 [pulsar-io-4-2] INFO org.apache.pulsar.broker.service.ServerCnx - [/1.1.1.1:38174] Subscribing on topic persistent://pulsar/use1/myns/topic-1 / my-sub
05:53:47.732 [pulsar-io-4-5] INFO org.apache.pulsar.broker.service.ServerCnx - [/1.1.1.1:38206] Subscribing on topic persistent://pulsar/use1/myns/topic-1 / my-sub
05:55:55.171 [pulsar-io-4-8] INFO org.apache.pulsar.broker.service.ServerCnx - [/1.1.1.1:38238] Subscribing on topic persistent://pulsar/use1/myns/topic-1 / my-sub
05:58:02.655 [pulsar-io-4-11] INFO org.apache.pulsar.broker.service.ServerCnx - [/1.1.1.1:38316] Subscribing on topic persistent://pulsar/use1/myns/topic-1 / my-sub
06:00:10.991 [pulsar-io-4-13] INFO org.apache.pulsar.broker.service.ServerCnx - [/1.1.1.1:38348] Subscribing on topic persistent://pulsar/use1/myns/topic-1 / my-sub
client logs while creating subscription
2023-05-19 05:51:40 [pulsar-client-io-8-1] INFO o.a.pulsar.client.impl.ConsumerImpl - [persistent://pulsar/use1/myns/topic-1][my-sub] Subscribing to topic on cnx
[id: 0xdcb4fbde, L:/1.1.1.1:38174 - R:broker2.use1.pulsar.com/1.1.1.1:6651], consumerId 0
2023-05-19 05:52:10 [pulsar-client-io-8-1] DEBUG o.a.p.common.protocol.PulsarDecoder - [broker2.use1.pulsar.com/1.1.1.1:6651] Received cmd PING
2023-05-19 05:52:10 [pulsar-client-io-8-1] DEBUG o.a.p.common.protocol.PulsarHandler - [[id: 0xdcb4fbde, L:/1.1.1.1:38174 - R:broker2.use1.pulsar.com/1.1.1.1:6651]] Replying back to ping mess
age
2023-05-19 05:52:10 [pulsar-client-io-8-1] DEBUG o.a.p.common.protocol.PulsarHandler - [[id: 0xdcb4fbde, L:/1.1.1.1:38174 - R:broker2.use1.pulsar.com/1.1.1.1:6651]] Sending ping message
2023-05-19 05:52:10 [pulsar-client-io-8-1] DEBUG o.a.p.common.protocol.PulsarDecoder - [broker2.use1.pulsar.com/1.1.1.1:6651] Received cmd PONG
2023-05-19 05:52:40 [pulsar-client-io-8-1] DEBUG o.a.p.common.protocol.PulsarDecoder - [broker2.use1.pulsar.com/1.1.1.1:6651] Received cmd PING
2023-05-19 05:52:40 [pulsar-client-io-8-1] DEBUG o.a.p.common.protocol.PulsarHandler - [[id: 0xdcb4fbde, L:/1.1.1.1:38174 - R:broker2.use1.pulsar.com/1.1.1.1:6651]] Replying back to ping mess
age
2023-05-19 05:52:40 [pulsar-client-io-8-1] DEBUG o.a.p.common.protocol.PulsarHandler - [[id: 0xdcb4fbde, L:/1.1.1.1:38174 - R:broker2.use1.pulsar.com/1.1.1.1:6651]] Sending ping message
2023-05-19 05:52:40 [pulsar-client-io-8-1] WARN o.a.pulsar.client.impl.ConsumerImpl - [persistent://pulsar/use1/myns/topic-1][my-sub] Failed to subscribe to topi
c on broker2.use1.pulsar.com/1.1.1.1:6651
2023-05-19 05:52:40 [pulsar-client-io-8-1] WARN o.a.pulsar.client.impl.ClientCnx - [id: 0xdcb4fbde, L:/1.1.1.1:38174 - R:broker2.use1.pulsar.com/1.1.1.1:6651] request timeout {'durationMs':
'30000', 'reqId':'3312998569082859124', 'remote':'broker2.use1.pulsar.com/1.1.1.1:6651', 'local':'/1.1.1.1:38174'}
What did you expect to see?
sometime broker stucks to create subscription and it can be only fixed by unloading the topic
What did you see instead?
broker should be resilient to handle failure and should make sure to recover curosr and create subscription successfully without impacting availability of the topic.
Search before asking
Version
Minimal reproduce step
Sometime Broker tries to recover the topic cursor and couldn't create subscription successfully which causes consumer creation stuck at broker and client application times out and fails to subscribe on topic.
Broker log
client logs while creating subscription
What did you expect to see?
sometime broker stucks to create subscription and it can be only fixed by unloading the topic
What did you see instead?
broker should be resilient to handle failure and should make sure to recover curosr and create subscription successfully without impacting availability of the topic.
Anything else?
No response
Are you willing to submit a PR?