ConnectionError: Error 32 while writing to socket. Broken pipe

NadavkOptimalQ commented 4 years ago

Hi, we are running Python 2.7 on AKS (Azure Kubernetes Service) and have a Redis cluster in the same AKS cluster, deployed with KubeDB. Because of an issue between the Azure load balancer and Confluent Kafka, we added an init container that does the following:

sysctl -w net.ipv4.tcp_keepalive_time=60
sysctl -w net.ipv4.tcp_keepalive_intvl=30
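For reference, these two sysctls set the kernel-wide defaults that apply to any TCP socket with SO_KEEPALIVE enabled. A minimal Python 3 sketch of the per-socket equivalent follows; the helper name enable_keepalive is hypothetical, and TCP_KEEPIDLE / TCP_KEEPINTVL are only exposed on some platforms (Linux among them), hence the hasattr guards.

```python
import socket


def enable_keepalive(sock, idle_seconds=60, interval_seconds=30):
    """Turn on TCP keepalive for one socket (hypothetical helper).

    idle_seconds mirrors net.ipv4.tcp_keepalive_time and
    interval_seconds mirrors net.ipv4.tcp_keepalive_intvl.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These constants are platform dependent, so only set them if present.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_seconds)
    return sock
```

Setting the values per socket leaves other processes in the same network namespace on the system defaults, which may or may not be what you want here.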

Since introducing the init container, we get this error every few minutes:

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/rediscluster/client.py", line 625, in _execute_command
    connection.send_command(*args)
  File "/usr/local/lib/python2.7/dist-packages/redis/connection.py", line 619, in send_command
    self.send_packed_command(self.pack_command(*args))
  File "/usr/local/lib/python2.7/dist-packages/redis/connection.py", line 612, in send_packed_command
    (errno, errmsg))
ConnectionError: Error 32 while writing to socket. Broken pipe.

As far as I can see, this error is contained within the library and doesn't actually cause any issues with our system; please correct me if I'm wrong. We have a few other services running in that Kubernetes cluster that work with the Redis cluster and are not experiencing this issue. The only thing different about the service that is experiencing it is that it runs with multiprocessing.

Any help in resolving this issue would be much appreciated. Please let me know if there is more information I can provide.

Grokzen commented 4 years ago

@NadavkOptimalQ Based on previous issues in redis-py itself, the fix might be to pass the config option health_check_interval to the RedisCluster instance. If that doesn't work, you will have to investigate any other keepalive settings or timeouts that are hardcoded inside redis-py and change them to stop this error from happening. My guess is that your system terminated the socket at the OS level without informing the Python level, so by the time you try to write to the socket it has already been closed.

Also note that this error does not really originate in this lib, since this lib only wraps redis-py, which performs the actual socket handling; you can see in your stack trace that it happens inside redis/connection.py. So unless you find a solution by your own tinkering, I advise you to report this error to the redis-py repo so it can be solved there, or to ask on the Google Groups list or Stack Overflow for assistance.
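The "socket already closed underneath you" scenario described above can be reproduced without Redis at all. This is only a minimal Python 3 sketch using a local socket pair, not the code path redis-py takes; the function name write_after_peer_close is hypothetical.

```python
import errno
import socket


def write_after_peer_close():
    """Write to a socket whose other end is already closed.

    Returns the errno of the resulting OSError, or None if the
    write unexpectedly succeeded.
    """
    local, peer = socket.socketpair()
    try:
        peer.close()  # the other side goes away without our side noticing
        try:
            local.sendall(b"PING\r\n")
        except OSError as exc:
            return exc.errno
        return None
    finally:
        local.close()


if __name__ == "__main__":
    print(write_after_peer_close() == errno.EPIPE)
```

errno.EPIPE is 32 on Linux, which matches the "Error 32 ... Broken pipe" in the traceback: the failure only shows up on the next write, not at the moment the connection is torn down.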

NadavkOptimalQ commented 4 years ago

Thank you. Can you tell me how I can set health_check_interval using redis-py-cluster?

Grokzen commented 4 years ago

@NadavkOptimalQ RedisCluster(health_check_interval=10)
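To illustrate the general idea behind an idle-based health check, here is a toy Python 3 sketch. It is a simplification under my own assumptions and not redis-py's actual implementation; the class IdleCheckedConnection and its ping, reconnect, write and clock callables are all hypothetical.

```python
import time


class IdleCheckedConnection:
    """Toy connection that pings before a write if it has been idle too long."""

    def __init__(self, health_check_interval, ping, reconnect, clock=time.monotonic):
        self.health_check_interval = health_check_interval
        self.ping = ping            # callable that raises OSError on a dead socket
        self.reconnect = reconnect  # callable that opens a fresh socket
        self.clock = clock
        # Treat the connection as freshly used at construction time.
        self.next_health_check = clock() + health_check_interval

    def send(self, write, payload):
        # Only probe when the connection sat idle longer than the interval.
        if self.health_check_interval and self.clock() > self.next_health_check:
            try:
                self.ping()
            except OSError:
                self.reconnect()
        write(payload)
        self.next_health_check = self.clock() + self.health_check_interval
```

The point of the design is that a stale socket is detected by a cheap probe and replaced before the real command is written, instead of the command itself failing with a broken pipe.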

NadavkOptimalQ commented 4 years ago

I ran this:

from rediscluster import RedisCluster
nodes =  [
      {
        "host": "127.0.0.1",
        "port": "7000"
      },
      {
        "host": "127.0.0.1",
        "port": "7001"
      },
      {
        "host": "127.0.0.1",
        "port": "7002"
      }
    ]

r = RedisCluster(startup_nodes=nodes, health_check_interval=10)

And I got this error:

  File "/snap/pycharm-community/214/plugins/python-ce/helpers/pydev/pydevd.py", line 1448, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/snap/pycharm-community/214/plugins/python-ce/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home/nadav/workspace/sandbox/main.py", line 60, in <module>
    r = RedisCluster(startup_nodes=nodes, health_check_interval=10)
  File "/home/nadav/workspace/sandbox/venv/lib/python3.6/site-packages/rediscluster/client.py", line 217, in __init__
    super(RedisCluster, self).__init__(connection_pool=pool, **kwargs)
TypeError: __init__() got an unexpected keyword argument 'health_check_interval'

Running Ubuntu 18.04, Python 3.6, with hiredis 1.0.1, redis 3.0.1, redis-py-cluster 2.1.0.

Grokzen commented 4 years ago

@NadavkOptimalQ That parameter was only added in redis-py 3.3.0, so you need to run at least that version to have access to it. https://github.com/andymccurdy/redis-py/blob/b80d423cc531c5db972d35ba72424cf9cf8772ff/CHANGES#L169
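If the same code has to run against both older and newer installs, one option is to pass the keyword only when the installed version supports it. A minimal Python 3 sketch; the helpers parse_version and cluster_kwargs are hypothetical, and the default interval of 20 is just an example value.

```python
import re


def parse_version(text):
    """Turn a string like '3.0.1' into a comparable tuple like (3, 0, 1)."""
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)  # tolerate suffixes such as '0b1'
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def cluster_kwargs(redis_py_version, interval=20):
    """Build extra keyword arguments for the client constructor."""
    kwargs = {}
    # health_check_interval is only accepted from redis-py 3.3.0 onwards.
    if parse_version(redis_py_version) >= (3, 3, 0):
        kwargs["health_check_interval"] = interval
    return kwargs
```

Assuming redis-py is installed, this could be used as RedisCluster(startup_nodes=nodes, **cluster_kwargs(redis.__version__)), which avoids the TypeError on versions older than 3.3.0 at the cost of silently running without the health check there.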

NadavkOptimalQ commented 4 years ago

This fixed our issue. Thank you very much!

If anyone runs into the same issue: we upgraded to a newer version of the redis package and set health_check_interval=20.

Grokzen / redis-py-cluster, issue #407