coheigea / testcases

Various testcases for some open source projects
68 stars 91 forks source link

ZOOKEEPER: cnxn object did not initialize its saslServer properly #15

Closed natedogs911 closed 7 years ago

natedogs911 commented 7 years ago

Hello,

I'm following this post: http://coheigea.blogspot.com/2017/05/securing-apache-kafka-with-kerberos.html

and have a kdc running on my local machine (Mac OS high sierra)

the scenario is as follows:

  1. the KDC is up and I can authenticate with client
  2. launch zookeeper without error
  3. launch kafka and get an authentication error to zookeeper

NOTE: some logs captured at different times, but same error messages regardless

[2017-10-24 12:49:53,819] INFO Socket connection established to localhost/0:0:0:0:0:0:0:1:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2017-10-24 12:49:53,839] INFO Session establishment complete on server localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x15f4fece1660000, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)
[2017-10-24 12:49:53,840] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient)
[2017-10-24 12:49:53,892] ERROR SASL authentication failed using login context 'Client'. (org.apache.zookeeper.client.ZooKeeperSaslClient)
[2017-10-24 12:49:53,892] INFO zookeeper state changed (AuthFailed) (org.I0Itec.zkclient.ZkClient)
[2017-10-24 12:49:53,893] INFO Terminate ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
[2017-10-24 12:49:53,894] FATAL Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
org.I0Itec.zkclient.exception.ZkAuthFailedException: Authentication failure
    at org.I0Itec.zkclient.ZkClient.waitForKeeperState(ZkClient.java:947)
    at org.I0Itec.zkclient.ZkClient.waitUntilConnected(ZkClient.java:924)
    at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:1231)
    at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:157)
    at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:131)
    at kafka.utils.ZkUtils$.createZkClientAndConnection(ZkUtils.scala:103)
    at kafka.utils.ZkUtils$.apply(ZkUtils.scala:85)
    at kafka.server.KafkaServer.initZk(KafkaServer.scala:338)
    at kafka.server.KafkaServer.startup(KafkaServer.scala:191)
    at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:38)
    at kafka.Kafka$.main(Kafka.scala:65)
    at kafka.Kafka.main(Kafka.scala)
[2017-10-24 12:49:53,896] INFO shutting down (kafka.server.KafkaServer)
[2017-10-24 12:49:53,900] INFO shut down completed (kafka.server.KafkaServer)
[2017-10-24 12:49:53,900] FATAL Exiting Kafka. (kafka.server.KafkaServerStartable)
[2017-10-24 12:49:53,902] INFO shutting down (kafka.server.KafkaServer)

zookeeper:

[2017-10-24 12:51:20,294] INFO Established session 0x15f4fece1660001 with negotiated timeout 30000 for client /0:0:0:0:0:0:0:1:55895 (org.apache.zookeeper.server.ZooKeeperServer)
[2017-10-24 12:51:20,361] ERROR cnxn.saslServer is null: cnxn object did not initialize its saslServer properly. (org.apache.zookeeper.server.ZooKeeperServer)
[2017-10-24 12:51:20,719] WARN caught end of stream exception (org.apache.zookeeper.server.NIOServerCnxn)
EndOfStreamException: Unable to read additional data from client sessionid 0x15f4fece1660001, likely client has closed socket
    at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:239)
    at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:203)
    at java.lang.Thread.run(Thread.java:748)
[2017-10-24 12:51:20,720] INFO Closed socket connection for client /0:0:0:0:0:0:0:1:55895 which had sessionid 0x15f4fece1660001 (org.apache.zookeeper.server.NIOServerCnxn)
[2017-10-24 12:51:51,003] INFO Expiring session 0x15f4fece1660001, timeout of 30000ms exceeded (org.apache.zookeeper.server.ZooKeeperServer)
[2017-10-24 12:51:51,003] INFO Processed session termination for sessionid: 0x15f4fece1660001 (org.apache.zookeeper.server.PrepRequestProcessor)
natedogs911 commented 7 years ago

Also, I'm working on some scripts to automate the launch of the KDC, Kafka/zookeeper instance to be used in a set of docker images etc. not sure if you are interested in a PR if I can get it all working smoothly

coheigea commented 7 years ago

Can you see if Kafka is hitting Kerby to get a ticket? You should be able to see some output on the Kerby console when Kafka is started.

natedogs911 commented 7 years ago

here's an example KDC log,

Running org.apache.coheigea.bigdata.kerberos.kafka.KafkaKerbyTest
KDC ready on port: 56632
Oct 24, 2017 2:15:43 PM org.apache.kerby.kerberos.kerb.server.request.KdcRequest preauth
INFO: The preauth data is empty.
Oct 24, 2017 2:15:43 PM org.apache.kerby.kerberos.kerb.server.KdcHandler handleRecoverableException
INFO: KRB error occurred while processing request:Additional pre-authentication required
Oct 24, 2017 2:15:44 PM org.apache.kerby.kerberos.kerb.server.request.AsRequest issueTicket
INFO: AS_REQ ISSUE: authtime 1508879744127,client@spot.local for krbtgt/spot.local@spot.local
Oct 24, 2017 2:15:44 PM org.apache.kerby.kerberos.kerb.server.request.TgsRequest issueTicket
INFO: TGS_REQ ISSUE: authtime 1508879744227,client for zookeeper/localhost@spot.local
Oct 24, 2017 2:34:55 PM org.apache.kerby.kerberos.kerb.server.impl.DefaultInternalKdcServerImpl doStop
INFO: Default Internal kdc server stopped.
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1,200.324 sec
Running org.apache.coheigea.bigdata.kerberos.hadoop.HadoopKerbyTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0 sec
Running org.apache.coheigea.bigdata.kerberos.hive.HiveKerbyTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0 sec
Running org.apache.coheigea.bigdata.kerberos.storm.StormKerbyTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 0 sec
coheigea commented 7 years ago

What I meant was look at the Kerby window + start the Kafka broker and then see if there is some output in the Kerby window to indicate that it issued a token successfully.

natedogs911 commented 7 years ago

from the log it seems that zookeeper authenticates, but only when the Kafka instance fires up. I don't see service tickets when running klist, only krbtgt/local.

the sequence of events are:

  1. start kdc

  2. able to kinit with client

  3. start zookeeper instance last log from zk:

    [2017-10-25 23:08:01,850] INFO binding to port 0.0.0.0/0.0.0.0:2181 (org.apache.zookeeper.server.NIOServerCnxnFactory)

    everything is stable at this point

  4. start kafka, after a few seconds fatal error:

    [2017-10-25 23:10:33,685] FATAL Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
    org.I0Itec.zkclient.exception.ZkAuthFailedException: Authentication failure
    at org.I0Itec.zkclient.ZkClient.waitForKeeperState(ZkClient.java:947)
    at org.I0Itec.zkclient.ZkClient.waitUntilConnected(ZkClient.java:924)
    at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:1231)
    at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:157)
    at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:131)
    at kafka.utils.ZkUtils$.createZkClientAndConnection(ZkUtils.scala:103)
    at kafka.utils.ZkUtils$.apply(ZkUtils.scala:85)
    at kafka.server.KafkaServer.initZk(KafkaServer.scala:338)
    at kafka.server.KafkaServer.startup(KafkaServer.scala:191)
    at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:38)
    at kafka.Kafka$.main(Kafka.scala:65)
    at kafka.Kafka.main(Kafka.scala)
  5. log from zookeeper at that time

    [2017-10-25 23:10:33,658] INFO Accepted socket connection from /127.0.0.1:52559 (org.apache.zookeeper.server.NIOServerCnxnFactory)
    [2017-10-25 23:10:33,670] INFO Client attempting to establish new session at /127.0.0.1:52559 (org.apache.zookeeper.server.ZooKeeperServer)
    [2017-10-25 23:10:33,671] INFO Creating new log file: log.46 (org.apache.zookeeper.server.persistence.FileTxnLog)
    [2017-10-25 23:10:33,678] INFO Established session 0x15f574b6d700000 with negotiated timeout 6000 for client /127.0.0.1:52559 (org.apache.zookeeper.server.ZooKeeperServer)
    [2017-10-25 23:10:33,681] INFO Processed session termination for sessionid: 0x15f574b6d700000 (org.apache.zookeeper.server.PrepRequestProcessor)
    [2017-10-25 23:10:33,684] INFO Closed socket connection for client /127.0.0.1:52559 which had sessionid 0x15f574b6d700000 (org.apache.zookeeper.server.NIOServerCnxn)
coheigea commented 7 years ago

When I start Zookeeper, I see the following output in the Kerby console:

INFO: The preauth data is empty. Oct 26, 2017 12:37:47 PM org.apache.kerby.kerberos.kerb.server.KdcHandler handleRecoverableException INFO: KRB error occurred while processing request:Additional pre-authentication required Oct 26, 2017 12:37:48 PM org.apache.kerby.kerberos.kerb.server.request.AsRequest issueTicket INFO: AS_REQ ISSUE: authtime 1509017868123,zookeeper/localhost@kafka.apache.org for krbtgt/kafka.apache.org@kafka.apache.org

and in the Zookeeper console:

Client Principal = zookeeper/localhost@kafka.apache.org Server Principal = krbtgt/kafka.apache.org@kafka.apache.org Session Key = EncryptionKey: keyType=17 keyBytes (hex dump)= 0000: AD 3B F2 15 82 50 86 20 23 ED 9F 29 82 69 9B C8 .;...P. #..).i..

Forwardable Ticket false Forwarded Ticket false Proxiable Ticket false Proxy Ticket false Postdated Ticket false Renewable Ticket false Initial Ticket false Auth Time = Thu Oct 26 12:37:48 IST 2017 Start Time = Thu Oct 26 12:37:48 IST 2017 End Time = Fri Oct 27 12:37:48 IST 2017 Renew Till = null Client Addresses Null . (org.apache.zookeeper.Login) [2017-10-26 12:37:48,155] INFO TGT valid starting at: Thu Oct 26 12:37:48 IST 2017 (org.apache.zookeeper.Login) [2017-10-26 12:37:48,155] INFO TGT expires: Fri Oct 27 12:37:48 IST 2017 (org.apache.zookeeper.Login)

Are you sure that you see no output from Zookeeper as per above?

natedogs911 commented 7 years ago

Thank you, I found the error in my zookeeper client configuration. I had been over the config 4-5 times to verify I had everything correct, however I had missed the krb5.conf variable for zookeeper was pointing to the wrong directory.

now its just to testing the client configuration at application level.