python-zk / kazoo

Kazoo is a high-level Python library that makes it easier to use Apache Zookeeper.
https://kazoo.readthedocs.io
Apache License 2.0

Kazoo 2.9.0 throws "Connection dropped: socket connection error: The handle is invalid" #679

Closed fafairuz closed 8 months ago

fafairuz commented 1 year ago

Kazoo 2.9.0 throws "Connection dropped: socket connection error: The handle is invalid", and on the server I get "EndOfStreamException: Unable to read additional data from client, it probably closed the socket: address = /10.1.17.34:3664, session = 0x0".

I tried the previous version (2.8.0) and it works fine.

Expected Behavior

KazooClient starts successfully when zk.start() is called.

Actual Behavior

Kazoo repeatedly logs "Connection dropped: socket connection error: The handle is invalid" after zk.start() is called, and then raises KazooTimeoutError.

Snippet to Reproduce the Problem

Python 3.8.10 (tags/v3.8.10:3d8993a, May  3 2021, 11:48:03) [MSC v.1928 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from kazoo.client import KazooClient
>>> zk_hosts    = "zookeeper_host_1:2181"
>>> zk = KazooClient(hosts=zk_hosts)
>>> zk.start()
Connection dropped: socket connection error: The handle is invalid
Connection dropped: socket connection error: The handle is invalid
Connection dropped: socket connection error: The handle is invalid
Connection dropped: socket connection error: The handle is invalid
Connection dropped: socket connection error: The handle is invalid
Connection dropped: socket connection error: The handle is invalid
Connection dropped: socket connection error: The handle is invalid
Connection dropped: socket connection error: The handle is invalid
Connection dropped: socket connection error: The handle is invalid
Failed connecting to Zookeeper within the connection retry policy.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python38\lib\site-packages\kazoo\client.py", line 635, in start
    raise self.handler.timeout_exception("Connection time-out")
kazoo.handlers.threading.KazooTimeoutError: Connection time-out
>>>

Logs with logging in DEBUG mode

These are the logs from the ZooKeeper host. The same error repeats many times.

Nov 03 17:19:39 nsl-zookeeper-02 zkServer.sh[1452]: EndOfStreamException: Unable to read additional data from client, it probably closed the socket: address = /my_client_IP:4089, session = 0x0
Nov 03 17:19:39 nsl-zookeeper-02 zkServer.sh[1452]:         at org.apache.zookeeper.server.NIOServerCnxn.handleFailedRead(NIOServerCnxn.java:163)
Nov 03 17:19:39 nsl-zookeeper-02 zkServer.sh[1452]:         at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:326)
Nov 03 17:19:39 nsl-zookeeper-02 zkServer.sh[1452]:         at org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:522)
Nov 03 17:19:39 nsl-zookeeper-02 zkServer.sh[1452]:         at org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154)
Nov 03 17:19:39 nsl-zookeeper-02 zkServer.sh[1452]:         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
Nov 03 17:19:39 nsl-zookeeper-02 zkServer.sh[1452]:         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
Nov 03 17:19:39 nsl-zookeeper-02 zkServer.sh[1452]:         at java.lang.Thread.run(Thread.java:750)
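
The log above comes from the server side. For the client side, a minimal sketch of turning on DEBUG output might look like the following. It assumes kazoo logs through the standard `logging` module under logger names that start with `kazoo`; the helper name `enable_kazoo_debug_logging` is hypothetical and not part of kazoo.

```python
import logging


def enable_kazoo_debug_logging(stream=None):
    """Attach a DEBUG-level stream handler to the "kazoo" logger namespace.

    Assumption: kazoo's modules log under names such as "kazoo.client",
    so configuring the parent "kazoo" logger covers all of them.
    """
    logger = logging.getLogger("kazoo")
    logger.setLevel(logging.DEBUG)
    handler = logging.StreamHandler(stream)  # defaults to stderr when stream is None
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger
```

Calling this once before `zk.start()` should make the connection attempts visible with timestamps and logger names, which is easier to attach to a report than the bare "Connection dropped" lines.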

Specifications

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/path/to/data/zookeeper
clientPort=2181
metricsProvider.className=org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider
metricsProvider.httpPort=7000
metricsProvider.exportJvmInfo=true
4lw.commands.whitelist=mntr,conf,ruok
server.1=...:2888:3888
server.2=...:2888:3888
server.3=...:2888:3888
- Java OpenJDK, `openjdk version "1.8.0_352"`

Thank you

StephenSorriaux commented 1 year ago

Hello,

Thanks for reporting this issue.

Thoughts: can it be linked to https://github.com/python-zk/kazoo/pull/656 that may not be working as expected on Windows?

yanchunhuo commented 1 year ago

I have the same problem~

StephenSorriaux commented 1 year ago

Hello,

FWIW, I believe I have a fix in https://github.com/python-zk/kazoo/pull/681

ArneBachmannDLR commented 1 year ago

This issue broke my production server, as I didn't freeze the dependency to 2.8.0... Upgrading ZooKeeper didn't fix the problem either and after thinking/investigating a few days it became clear it's neither client nor server, but the Python library :-)

Workaround: pip install -U "kazoo==2.8.0"
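
Besides pinning `kazoo==2.8.0`, an application could check at startup whether it is running the combination reported in this thread. The following is only a sketch: the function names are hypothetical, and the check covers just what is reported here (2.9.0 failing on Windows, 2.8.0 working). Other versions or platforms may behave differently.

```python
import sys
from importlib import metadata

# Assumption: the only failing version named in this thread.
REPORTED_BROKEN_VERSION = "2.9.0"


def is_reported_broken(kazoo_version, platform=sys.platform):
    """Return True for the combination reported here: kazoo 2.9.0 on Windows."""
    return platform.startswith("win") and kazoo_version == REPORTED_BROKEN_VERSION


def installed_kazoo_version():
    """Return the installed kazoo version string, or None if kazoo is absent."""
    try:
        return metadata.version("kazoo")
    except metadata.PackageNotFoundError:
        return None
```

A startup script could log a warning when `is_reported_broken(installed_kazoo_version() or "")` is true, so that an unpinned upgrade is noticed before the client times out.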

ceache commented 1 year ago

Hi there,

Just to confirm, on what platform was your zookeeper client running?

Thanks!

ArneBachmannDLR commented 1 year ago

The client was on Windows 10 21H2 19044.2364 in an Enterprise setup. The ZooKeeper installation was the same, on a different machine. Access via other tools (like ZKUI) worked fine, but access from Python no longer did. Downgrading to 2.8.0 resolved the issue for now.
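
One way to confirm from Python that the server is reachable independently of kazoo is to send a ZooKeeper four-letter command over a plain TCP socket. The sketch below uses only the standard library. It assumes `ruok` is whitelisted on the server (the configuration posted in this issue lists it), and the helper name `zk_four_letter` is hypothetical.

```python
import socket


def zk_four_letter(host, port=2181, command=b"ruok", timeout=5.0):
    """Send a four-letter command to a ZooKeeper server and return its reply.

    The server answers and then closes the connection, so we read until EOF.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # empty read: the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


# Example (host name taken from the issue's snippet):
#   zk_four_letter("zookeeper_host_1")  # a healthy server replies "imok"
```

If this call succeeds from the same Windows machine while `zk.start()` still times out, that points at the client library rather than the network or the server, which matches what was observed here.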

jeffwidman commented 1 year ago

@fafairuz @yanchunhuo are you both running on Windows as well?

yanchunhuo commented 1 year ago

yes

jeffwidman commented 8 months ago

@tcalmant @fafairuz @yanchunhuo can one of you test master now that #681 is merged to confirm this is resolved?

We would like to cut a release with the fix to unbreak folks, but first want to confirm this is resolved for you.

tcalmant commented 8 months ago

Hi @jeffwidman, the master branch works in my use case