Open jihwong opened 11 months ago
I configured useHostNameAsBookieID=true
, but it seems to be unrelated to the issue.
Assign a separate /bookkeeper/data directory for each k8s stat
such: /bookkeeper/bookie-sts-0-data /bookkeeper/bookie-sts-1-data /bookkeeper/bookie-sts-2-data
AFAIK InstanceId is only set when a new cluster is instantiated, not during restarts.
So you either have an installation which has an existing instance id in the registration manager, or your k8s init command is attempting to reinitialize the cluster.
Can you please check the logs for this line Initializing ZooKeeper metadata for new cluster
.
It should indicate a new cluster id is being set.
cc @eolivelli
closed by no updates, feel free to reopen it if you have new informations or way to reproduce.
2024-09-05T11:48:46,431+0800 [main] INFO org.apache.zookeeper.common.X509Util - Setting -D jdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated TLS renegotiation
2024-09-05T11:48:46,597+0800 [main] INFO org.apache.zookeeper.common.X509Util - Default TLS protocol is TLSv1.3, supported TLS protocols are [TLSv1.3, TLSv1.2, TLSv1.1, TLSv1, SSLv3, SSLv2Hello]
2024-09-05T11:48:46,604+0800 [main] INFO org.apache.zookeeper.ClientCnxnSocket - jute.maxbuffer value is 10485760 Bytes
2024-09-05T11:48:46,611+0800 [main] INFO org.apache.zookeeper.ClientCnxn - zookeeper.request.timeout value is 0. feature enabled=false
2024-09-05T11:48:46,619+0800 [main-SendThread(shanghai:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server shanghai/172.21.0.1:2181.
2024-09-05T11:48:46,619+0800 [main-SendThread(shanghai:2181)] INFO org.apache.zookeeper.ClientCnxn - SASL config status: Will not attempt to authenticate using SASL (unknown error)
2024-09-05T11:48:46,627+0800 [main-SendThread(shanghai:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established, initiating session, client: /172.21.0.4:55006, server: shanghai/172.21.0.1:2181
2024-09-05T11:48:46,643+0800 [main-SendThread(shanghai:2181)] INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on server shanghai/172.21.0.1:2181, session id = 0x10003bc3feb0040, negotiated timeout = 30000
2024-09-05T11:48:46,647+0800 [main-EventThread] INFO org.apache.bookkeeper.zookeeper.ZooKeeperWatcherBase - ZooKeeper client is connected now.
2024-09-05T11:48:46,718+0800 [main] INFO org.apache.bookkeeper.bookie.UncleanShutdownDetectionImpl - Dirty files exist on boot-up indicating an unclean shutdown. Dirty files: /data/bookkeeper/ledgers/current/DIRTY
2024-09-05T11:48:46,718+0800 [main] INFO org.apache.bookkeeper.server.EmbeddedServer$Builder - Unclean shutdown detected. The bookie did not register a graceful shutdown prior to this boot.
2024-09-05T11:48:46,855+0800 [main] ERROR org.apache.bookkeeper.server.Main - Failed to build bookie server
org.apache.bookkeeper.bookie.BookieException$InvalidCookieException: instanceId 84167942-9ab9-4c54-a8fd-8e3179417f97 is not matching with null
at org.apache.bookkeeper.bookie.Cookie.verifyInternal(Cookie.java:168) ~[org.apache.bookkeeper-bookkeeper-server-4.16.6.jar:4.16.6]
at org.apache.bookkeeper.bookie.Cookie.verify(Cookie.java:173) ~[org.apache.bookkeeper-bookkeeper-server-4.16.6.jar:4.16.6]
at org.apache.bookkeeper.bookie.LegacyCookieValidation.readAndVerifyCookieFromRegistrationManager(LegacyCookieValidation.java:179) ~[org.apache.bookkeeper-bookkeeper-server-4.16.6.jar:4.16.6]
at org.apache.bookkeeper.bookie.LegacyCookieValidation.checkCookies(LegacyCookieValidation.java:72) ~[org.apache.bookkeeper-bookkeeper-server-4.16.6.jar:4.16.6]
at org.apache.bookkeeper.server.EmbeddedServer$Builder.build(EmbeddedServer.java:408) ~[org.apache.bookkeeper-bookkeeper-server-4.16.6.jar:4.16.6]
at org.apache.bookkeeper.server.Main.buildBookieServer(Main.java:277) ~[org.apache.bookkeeper-bookkeeper-server-4.16.6.jar:4.16.6]
at org.apache.bookkeeper.server.Main.doMain(Main.java:216) ~[org.apache.bookkeeper-bookkeeper-server-4.16.6.jar:4.16.6]
at org.apache.bookkeeper.server.Main.main(Main.java:199) ~[org.apache.bookkeeper-bookkeeper-server-4.16.6.jar:4.16.6]
I encounter the same issue here, I'm using Pulsar LTS 3.0.6
BUG REPORT
Describe the bug
Error reported after restarting the bookkeeper node. Due to being a Kubernetes environment, it was successfully started after being restarted x times. Why is there a new instanceId
df26468b-558a-41fc-8705-3d7dc21d2817
log
zookeeper
data dir
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Screenshots
Additional context