datastax / pulsar-helm-chart

Apache Pulsar Helm chart
Apache License 2.0

Autorecovery: configure Pulsar's rack awareness integration #214

Closed michaeljmarshall closed 2 years ago

michaeljmarshall commented 2 years ago

Motivation

By default, Pulsar's rack awareness solution relies on state stored in ZooKeeper. When autorecovery runs, its BookKeeper client needs this metadata in order to follow the placement policy.

This change could technically break deployments that expect the default DNS Resolver: ScriptBasedMapping.

Note: one benefit of this PR is that we'll get rid of this exception, which is currently seen on bookkeeper and autorecovery startup.

17:26:33.864 [main] ERROR org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to initialize DNS Resolver org.apache.bookkeeper.net.ScriptBasedMapping, used default subnet resolver
java.lang.RuntimeException: No network topology script is found when using script based DNS resolver.
    at org.apache.bookkeeper.net.ScriptBasedMapping$RawScriptBasedMapping.validateConf(ScriptBasedMapping.java:163) ~[com.datastax.oss-bookkeeper-server-4.14.4.1.0.0.jar:4.14.4.1.0.0]
    at org.apache.bookkeeper.net.AbstractDNSToSwitchMapping.setConf(AbstractDNSToSwitchMapping.java:81) ~[com.datastax.oss-bookkeeper-server-4.14.4.1.0.0.jar:4.14.4.1.0.0]
    at org.apache.bookkeeper.net.ScriptBasedMapping.setConf(ScriptBasedMapping.java:123) ~[com.datastax.oss-bookkeeper-server-4.14.4.1.0.0.jar:4.14.4.1.0.0]
    at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.initialize(RackawareEnsemblePlacementPolicyImpl.java:265) [com.datastax.oss-bookkeeper-server-4.14.4.1.0.0.jar:4.14.4.1.0.0]
    at org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.initialize(RackawareEnsemblePlacementPolicyImpl.java:80) [com.datastax.oss-bookkeeper-server-4.14.4.1.0.0.jar:4.14.4.1.0.0]
    at org.apache.bookkeeper.client.BookKeeper.initializeEnsemblePlacementPolicy(BookKeeper.java:581) [com.datastax.oss-bookkeeper-server-4.14.4.1.0.0.jar:4.14.4.1.0.0]
    at org.apache.bookkeeper.client.BookKeeper.<init>(BookKeeper.java:505) [com.datastax.oss-bookkeeper-server-4.14.4.1.0.0.jar:4.14.4.1.0.0]
    at org.apache.bookkeeper.client.BookKeeper$Builder.build(BookKeeper.java:306) [com.datastax.oss-bookkeeper-server-4.14.4.1.0.0.jar:4.14.4.1.0.0]
    at org.apache.bookkeeper.replication.Auditor.createBookKeeperClient(Auditor.java:280) [com.datastax.oss-bookkeeper-server-4.14.4.1.0.0.jar:4.14.4.1.0.0]
    at org.apache.bookkeeper.replication.AutoRecoveryMain.<init>(AutoRecoveryMain.java:95) [com.datastax.oss-bookkeeper-server-4.14.4.1.0.0.jar:4.14.4.1.0.0]
    at org.apache.bookkeeper.server.service.AutoRecoveryService.<init>(AutoRecoveryService.java:41) [com.datastax.oss-bookkeeper-server-4.14.4.1.0.0.jar:4.14.4.1.0.0]
    at org.apache.bookkeeper.replication.AutoRecoveryMain.buildAutoRecoveryServer(AutoRecoveryMain.java:358) [com.datastax.oss-bookkeeper-server-4.14.4.1.0.0.jar:4.14.4.1.0.0]
    at org.apache.bookkeeper.replication.AutoRecoveryMain.doMain(AutoRecoveryMain.java:326) [com.datastax.oss-bookkeeper-server-4.14.4.1.0.0.jar:4.14.4.1.0.0]
    at org.apache.bookkeeper.replication.AutoRecoveryMain.main(AutoRecoveryMain.java:308) [com.datastax.oss-bookkeeper-server-4.14.4.1.0.0.jar:4.14.4.1.0.0]

eolivelli commented 2 years ago

Did you do any manual testing? The patch looks good, but I am not sure that class is meant to run inside the autorecovery daemon. My concern is more about how it plugs into the rest of the Pulsar runtime.

We should do some manual testing and ensure that it is working properly. It won't be easy.

Maybe there is already a unit test or integration test in the Pulsar repo.

michaeljmarshall commented 2 years ago

@eolivelli - I manually verified that the autorecovery pod and the bookkeeper pod start. I haven't tested the autorecovery code path yet, but I can. Without this configuration change, the autorecovery process won't have the rack information that is configured in ZooKeeper.

michaeljmarshall commented 2 years ago

@eolivelli - I did some additional validation tonight, and everything appears to work correctly. However, I am not an expert on autorecovery, so please let me know if I've missed an important case. In the test, I set up 3 racks, 4 bookies, and a topic with E=2, Qw=2, and Qa=2. The test shows that the autorecovery pod correctly discovers racks and then identifies when an ensemble is not following the rack placement policy after two bookies are removed. Here are the racks:

pulsar@pulsar-broker-74959d97cd-q7f8j:/pulsar$ bin/pulsar-admin bookies racks-placement
"default    {pulsar-bookkeeper-2.pulsar-bookkeeper.default.svc.cluster.local:3181=BookieInfoImpl(rack=rack2, hostname=null), pulsar-bookkeeper-0.pulsar-bookkeeper.default.svc.cluster.local:3181=BookieInfoImpl(rack=rack0, hostname=null), pulsar-bookkeeper-3.pulsar-bookkeeper.default.svc.cluster.local:3181=BookieInfoImpl(rack=rack3, hostname=null), pulsar-bookkeeper-1.pulsar-bookkeeper.default.svc.cluster.local:3181=BookieInfoImpl(rack=rack0, hostname=null)}"

The autorecovery pod logged the following after I completed configuring the racks:

01:56:25.498 [main-EventThread] INFO  org.apache.pulsar.zookeeper.ZooKeeperDataCache - [State:CONNECTED Timeout:30000 sessionid:0x1000030d905001f local:/172.17.0.5:43494 remoteserver:pulsar-zookeeper-ca.default.svc.cluster.local/10.97.165.234:2181 lastZxid:242 xid:4 sent:17 recv:20 queuedpkts:0 pendingresp:0 queuedevents:0] Received ZooKeeper watch event: WatchedEvent state:SyncConnected type:NodeDataChanged path:/bookies
01:56:25.500 [AuditorBookie-172.17.0.5:3181-EventThread] INFO  org.apache.pulsar.zookeeper.ZooKeeperDataCache - [State:CONNECTED Timeout:30000 sessionid:0x1000030d9050021 local:/172.17.0.5:43504 remoteserver:pulsar-zookeeper-ca.default.svc.cluster.local/10.97.165.234:2181 lastZxid:242 xid:4 sent:17 recv:20 queuedpkts:0 pendingresp:0 queuedevents:0] Received ZooKeeper watch event: WatchedEvent state:SyncConnected type:NodeDataChanged path:/bookies
01:56:25.511 [AuditorBookie-172.17.0.5:3181-EventThread] INFO  org.apache.pulsar.zookeeper.ZkBookieRackAffinityMapping - Reloading the bookie rack affinity mapping cache.
01:56:25.512 [main-EventThread] INFO  org.apache.pulsar.zookeeper.ZkBookieRackAffinityMapping - Reloading the bookie rack affinity mapping cache.
01:56:25.516 [ForkJoinPool.commonPool-worker-5] INFO  org.apache.pulsar.zookeeper.ZkBookieRackAffinityMapping - Bookie rack info updated to {default={pulsar-bookkeeper-1.pulsar-bookkeeper.default.svc.cluster.local:3181=BookieInfoImpl(rack=rack1, hostname=null), pulsar-bookkeeper-2.pulsar-bookkeeper.default.svc.cluster.local:3181=BookieInfoImpl(rack=rack2, hostname=null), pulsar-bookkeeper-0.pulsar-bookkeeper.default.svc.cluster.local:3181=BookieInfoImpl(rack=rack0, hostname=null)}}. Notifying rackaware policy.
01:56:25.518 [ForkJoinPool.commonPool-worker-3] INFO  org.apache.pulsar.zookeeper.ZkBookieRackAffinityMapping - Bookie rack info updated to {default={pulsar-bookkeeper-1.pulsar-bookkeeper.default.svc.cluster.local:3181=BookieInfoImpl(rack=rack1, hostname=null), pulsar-bookkeeper-2.pulsar-bookkeeper.default.svc.cluster.local:3181=BookieInfoImpl(rack=rack2, hostname=null), pulsar-bookkeeper-0.pulsar-bookkeeper.default.svc.cluster.local:3181=BookieInfoImpl(rack=rack0, hostname=null)}}. Notifying rackaware policy.
01:56:25.528 [ForkJoinPool.commonPool-worker-5] INFO  org.apache.bookkeeper.net.NetworkTopologyImpl - Removing a node: /rack1/pulsar-bookkeeper-1.pulsar-bookkeeper.default.svc.cluster.local:3181
01:56:25.530 [ForkJoinPool.commonPool-worker-3] INFO  org.apache.bookkeeper.net.NetworkTopologyImpl - Removing a node: /rack1/pulsar-bookkeeper-1.pulsar-bookkeeper.default.svc.cluster.local:3181
01:56:25.531 [ForkJoinPool.commonPool-worker-3] INFO  org.apache.bookkeeper.net.NetworkTopologyImpl - Adding a new node: /rack1/pulsar-bookkeeper-1.pulsar-bookkeeper.default.svc.cluster.local:3181
01:56:25.530 [ForkJoinPool.commonPool-worker-5] INFO  org.apache.bookkeeper.net.NetworkTopologyImpl - Adding a new node: /rack1/pulsar-bookkeeper-1.pulsar-bookkeeper.default.svc.cluster.local:3181
01:56:25.539 [ForkJoinPool.commonPool-worker-3] INFO  org.apache.bookkeeper.net.NetworkTopologyImpl - Removing a node: /rack2/pulsar-bookkeeper-2.pulsar-bookkeeper.default.svc.cluster.local:3181
01:56:25.539 [ForkJoinPool.commonPool-worker-3] INFO  org.apache.bookkeeper.net.NetworkTopologyImpl - Adding a new node: /rack2/pulsar-bookkeeper-2.pulsar-bookkeeper.default.svc.cluster.local:3181
01:56:25.539 [ForkJoinPool.commonPool-worker-5] INFO  org.apache.bookkeeper.net.NetworkTopologyImpl - Removing a node: /rack2/pulsar-bookkeeper-2.pulsar-bookkeeper.default.svc.cluster.local:3181
01:56:25.540 [ForkJoinPool.commonPool-worker-5] INFO  org.apache.bookkeeper.net.NetworkTopologyImpl - Adding a new node: /rack2/pulsar-bookkeeper-2.pulsar-bookkeeper.default.svc.cluster.local:3181
01:56:25.541 [ForkJoinPool.commonPool-worker-5] INFO  org.apache.bookkeeper.net.NetworkTopologyImpl - Removing a node: /default-rack/pulsar-bookkeeper-0.pulsar-bookkeeper.default.svc.cluster.local:3181
01:56:25.542 [ForkJoinPool.commonPool-worker-5] INFO  org.apache.bookkeeper.net.NetworkTopologyImpl - Adding a new node: /rack0/pulsar-bookkeeper-0.pulsar-bookkeeper.default.svc.cluster.local:3181
01:56:25.542 [ForkJoinPool.commonPool-worker-3] INFO  org.apache.bookkeeper.net.NetworkTopologyImpl - Removing a node: /default-rack/pulsar-bookkeeper-0.pulsar-bookkeeper.default.svc.cluster.local:3181
01:56:25.543 [ForkJoinPool.commonPool-worker-3] INFO  org.apache.bookkeeper.net.NetworkTopologyImpl - Adding a new node: /rack0/pulsar-bookkeeper-0.pulsar-bookkeeper.default.svc.cluster.local:3181

Here are the relevant logs from autorecovery when I removed bookies 2 and 3:

02:09:24.067 [ReplicationWorker] WARN  org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to find 1 bookies : excludeBookies [<Bookie:pulsar-bookkeeper-0.pulsar-bookkeeper.default.svc.cluster.local:3181>, <Bookie:pulsar-bookkeeper-3.pulsar-bookkeeper.default.svc.cluster.local:3181>, <Bookie:pulsar-bookkeeper-1.pulsar-bookkeeper.default.svc.cluster.local:3181>], allBookies [<Bookie:pulsar-bookkeeper-1.pulsar-bookkeeper.default.svc.cluster.local:3181>, <Bookie:pulsar-bookkeeper-0.pulsar-bookkeeper.default.svc.cluster.local:3181>].
02:09:24.068 [ReplicationWorker] WARN  org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl - Failed to choose a bookie: excluded [<Bookie:pulsar-bookkeeper-0.pulsar-bookkeeper.default.svc.cluster.local:3181>, <Bookie:pulsar-bookkeeper-3.pulsar-bookkeeper.default.svc.cluster.local:3181>], fallback to choose bookie randomly from the cluster.
02:09:24.068 [ReplicationWorker] INFO  org.apache.bookkeeper.client.LedgerFragmentReplicator - Replicating fragment Fragment(LedgerID: 28, FirstEntryID: 0[0], LastKnownEntryID: 1[1], Host: [pulsar-bookkeeper-3.pulsar-bookkeeper.default.svc.cluster.local:3181], Closed: true) in 1 sub fragments.
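
To make the warnings above easier to read, here is a minimal Python sketch of the check being described: does an ensemble span enough racks? It is not BookKeeper's implementation (the real logic lives in `RackawareEnsemblePlacementPolicyImpl` and is considerably more involved). The function names, the `min_racks=2` threshold, and the shortened host names are assumptions made for illustration. The rack assignments are copied from the `racks-placement` output earlier in this comment.

```python
from typing import Dict, List, Set

# Rack name that appears in the logs for a bookie with no configured mapping.
DEFAULT_RACK = "default-rack"


def racks_covered(ensemble: List[str], rack_of: Dict[str, str]) -> Set[str]:
    """Return the set of racks occupied by the bookies in the ensemble."""
    return {rack_of.get(bookie, DEFAULT_RACK) for bookie in ensemble}


def follows_rack_policy(
    ensemble: List[str], rack_of: Dict[str, str], min_racks: int = 2
) -> bool:
    """Simplified check: the ensemble must span at least `min_racks` racks.

    `min_racks` is a hypothetical parameter for this sketch. It is capped at
    the ensemble size, because an ensemble cannot span more racks than it
    has bookies.
    """
    required = min(min_racks, len(ensemble))
    return len(racks_covered(ensemble, rack_of)) >= required


# Rack assignments as reported by `bin/pulsar-admin bookies racks-placement`,
# with host names shortened for readability.
RACKS = {
    "pulsar-bookkeeper-0": "rack0",
    "pulsar-bookkeeper-1": "rack0",
    "pulsar-bookkeeper-2": "rack2",
    "pulsar-bookkeeper-3": "rack3",
}

if __name__ == "__main__":
    # Bookies 0 and 2 sit in different racks, so the check passes.
    print(follows_rack_policy(["pulsar-bookkeeper-0", "pulsar-bookkeeper-2"], RACKS))
    # After bookies 2 and 3 are removed, only bookies 0 and 1 remain, and
    # both are in rack0, so no rack-aware ensemble with E=2 is possible.
    print(follows_rack_policy(["pulsar-bookkeeper-0", "pulsar-bookkeeper-1"], RACKS))
```

Under these assumptions the second call returns `False`, which corresponds to the situation in the logs: the policy cannot find a bookie in another rack and falls back to choosing one randomly from the cluster.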