jonathanslenders / asyncio-redis

Redis client for Python asyncio (PEP 3156)
http://asyncio-redis.readthedocs.org/

File descriptor leak when can't connect to host #29

Open tumb1er opened 10 years ago

tumb1er commented 10 years ago

There is a small problem with the BaseEventLoop.create_connection() method: it doesn't have a connection timeout. For example, this code never finishes, because the non-blocking client socket keeps waiting for a connection that never completes:

import asyncio
from asyncio_redis import Connection

@asyncio.coroutine
def test():
    # 22.0.0.22 is an unreachable address, so the connect never completes
    # and create_connection() has no timeout of its own.
    c = yield from Connection.create('22.0.0.22')
    print("Connection created")

asyncio.get_event_loop().run_until_complete(test())

The next step to fix this problem was to use asyncio.wait_for to add a connection timeout. The program now continues with asyncio.futures.TimeoutError, but it leaves an open file descriptor for the socket created inside BaseEventLoop.create_connection():

asyncio.get_event_loop().run_until_complete(
    asyncio.wait_for(test(), timeout=1))
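
To make the leak visible, here is a small self-contained sketch (assuming a Linux system, since it reads /proc/self/fd, and assuming 22.0.0.22 stays unreachable); after the handled TimeoutError the descriptor count typically grows by one, which is exactly the leak reported here. The open_fds helper is just for illustration.

import asyncio
import os
from asyncio_redis import Connection

def open_fds():
    # Number of file descriptors currently open in this process (Linux only).
    return len(os.listdir('/proc/self/fd'))

@asyncio.coroutine
def test():
    # 22.0.0.22 is assumed unreachable, so this never completes on its own.
    yield from Connection.create('22.0.0.22')

loop = asyncio.get_event_loop()
before = open_fds()
try:
    loop.run_until_complete(asyncio.wait_for(test(), timeout=1))
except asyncio.TimeoutError:
    pass
print("descriptors leaked:", open_fds() - before)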

But (thanks to the asyncio authors) create_connection has an optional sock argument. I solved my connection problem by creating the socket myself and passing it to create_connection instead of a host and port.

@asyncio.coroutine
def connect(host):
    sock = ...  # here is a lot of code copied from create_connection: resolve
                # the host to `address`, open a non-blocking socket, handle exceptions
    try:
        yield from asyncio.wait_for(loop.sock_connect(sock, address),
                                    timeout=1)
        yield from asyncio.wait_for(loop.create_connection(
            protocol_factory, sock=sock), timeout=1)
    except Exception:
        # On timeout (or any other error), make sure the socket doesn't leak.
        try:
            loop.remove_writer(sock.fileno())
            sock.close()
        except Exception:
            pass
        raise
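
For reference, a minimal sketch of what the elided part above could look like. The name connect_with_timeout and its host, port and protocol_factory parameters are illustrations, not part of asyncio or asyncio-redis, and the error handling mirrors the cleanup from the snippet above rather than everything the real create_connection does.

import asyncio
import socket

@asyncio.coroutine
def connect_with_timeout(loop, protocol_factory, host, port, timeout=1):
    # Resolve the host ourselves (loop.getaddrinfo would be the non-blocking
    # variant) and open a non-blocking socket for the first result.
    family, type_, proto, _, address = socket.getaddrinfo(
        host, port, 0, socket.SOCK_STREAM)[0]
    sock = socket.socket(family, type_, proto)
    sock.setblocking(False)
    try:
        # Connect with an application-level timeout, then hand the already
        # connected socket to create_connection via its sock argument.
        yield from asyncio.wait_for(loop.sock_connect(sock, address),
                                    timeout=timeout)
        return (yield from loop.create_connection(protocol_factory, sock=sock))
    except Exception:
        # Mirror the cleanup above: drop any writer callback that sock_connect
        # may have left behind, then close the socket and re-raise.
        try:
            loop.remove_writer(sock.fileno())
        except Exception:
            pass
        sock.close()
        raise
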
jonathanslenders commented 10 years ago

Is there any other clean way to solve this issue? I mean, without copying half of the code of create_connection? We're probably not the only ones who need timeouts in create_connection...

It's certainly an issue we have to fix.

tumb1er commented 10 years ago

The socket opened in create_connection is a local variable, so without hacking you can't find out which file descriptor to close. Other possible solutions are:

1) add a timeout to asyncio in the Python standard library, if Guido will accept it :) - but what about Python 3.3 and 3.4?
2) subclass BaseEventLoop (or add a mixin) - not very usable
3) find a way to locate the correct file descriptor and close it - not sure that is possible, because create_connection is itself an async method, so multiple sockets could be opened while it is executing.

I've chosen to copy-paste the first half of create_connection as the shortest way to fix this, but I'm not sure that's the right approach for asyncio-redis.

jonathanslenders commented 10 years ago

Why would we ever have to hard-code a timeout for the connection creation? I'm not saying we shouldn't, but I don't completely understand it yet. When would it otherwise block forever?

What I could do is extend the signature of Connection.create to accept a socket or a socket factory. Or maybe create a Connection.create_from_socket instead, although I'm not sure about the reconnect logic in that case either. (Passing a socket factory as a parameter would allow us to reconnect.)
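
Purely to make that proposal concrete, a hypothetical sketch of what the call sites could look like; neither create_from_socket nor a socket_factory parameter exists in asyncio-redis today, and make_socket is only an illustration:

import asyncio
import socket
from asyncio_redis import Connection

def make_socket():
    # Illustration only: the caller owns socket creation, so it can apply its
    # own timeout and cleanup policy, and the library could call this factory
    # again whenever it needs to reconnect.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    return sock

@asyncio.coroutine
def example():
    # Hypothetical variant 1: hand over a ready-made socket.
    connection = yield from Connection.create_from_socket(make_socket())
    # Hypothetical variant 2: hand over the factory itself, so reconnects work.
    connection = yield from Connection.create(socket_factory=make_socket)
    return connection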

For now, I prefer not to change too much until I completely understand the issue and I know what parts people expect the library to handle, and which parts of timeouts and socket creation people usually prefer to do themselves.

jonathanslenders commented 10 years ago

See: https://groups.google.com/forum/#!topic/python-tulip/iwpA-8IK7NQ

tumb1er commented 10 years ago

Why would we ever have to hard-code a timeout for the connection creation? When does it happen that it would otherwise block forever?

For example, a closed port behind a remote firewall that silently drops packets will cause socket.connect to wait for the SYN-ACK for about two minutes (it's an operating system setting, I suppose). When your async HTTP server is under high load, you'll accumulate all these clients (with growing memory and socket usage) and force them all to wait. And there is nothing you can do about it without a connection timeout at the application level.
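
As a rough, Linux-only illustration of where those two minutes come from: the kernel retransmits the SYN with exponential backoff, so with the common default tcp_syn_retries=6 a connect attempt only gives up after roughly 1+2+4+8+16+32+64 = 127 seconds. The file path and the arithmetic below are assumptions about typical Linux behaviour, not something asyncio controls.

def syn_retry_budget():
    # Read the kernel's SYN retransmission count (Linux only).
    with open('/proc/sys/net/ipv4/tcp_syn_retries') as f:
        retries = int(f.read())
    # The initial timeout is about 1 second and doubles after every retry.
    return sum(2 ** attempt for attempt in range(retries + 1))

print("worst-case connect() wait: about %d seconds" % syn_retry_budget())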