leporo / tornado-redis

Asynchronous Redis client that works within Tornado IO loop.
666 stars 162 forks source link

Client can't be shared on concurrent conditions #83

Open kaiix opened 9 years ago

kaiix commented 9 years ago

When tornado-redis used as a shared client, it will sometimes crash in process_data

Test code

from tornado.ioloop import IOLoop
from tornado.web import RequestHandler, Application, url, asynchronous
from tornado.gen import engine, Task
from tornadoredis import Client

rds = Client()

class RedisTestHandler(RequestHandler):

    @asynchronous
    @engine
    def get(self):
        yield Task(rds.get, 'foo')
        self.finish()

if __name__ == '__main__':
    from redis import Redis
    Redis().set('foo', 'bar')

    app = Application([
        url(r'/', RedisTestHandler),
    ], debug=True)
    app.listen(9000)
    IOLoop.instance().start()

Test it with apache-bench

ab -c2 -n2 http://localhost:9000/

the crash log:

[I 2015-01-09 14:14:34,317 web:1811] 200 GET / (127.0.0.1) 6.19ms
[I 2015-01-09 14:14:34,319 web:1811] 200 GET / (127.0.0.1) 4.72ms
[I 2015-01-09 14:14:34,823 web:1811] 200 GET / (127.0.0.1) 3.03ms
[I 2015-01-09 14:14:34,825 web:1811] 200 GET / (127.0.0.1) 4.30ms
[I 2015-01-09 14:14:35,775 web:1811] 200 GET / (127.0.0.1) 2.74ms
[I 2015-01-09 14:14:35,777 web:1811] 200 GET / (127.0.0.1) 3.84ms
[I 2015-01-09 14:14:36,365 web:1811] 200 GET / (127.0.0.1) 2.38ms
[I 2015-01-09 14:14:36,367 web:1811] 200 GET / (127.0.0.1) 3.17ms
[I 2015-01-09 14:14:36,924 web:1811] 200 GET / (127.0.0.1) 3.16ms
[I 2015-01-09 14:14:36,926 web:1811] 200 GET / (127.0.0.1) 4.37ms
[I 2015-01-09 14:14:37,409 web:1811] 200 GET / (127.0.0.1) 2.62ms
[I 2015-01-09 14:14:37,412 web:1811] 200 GET / (127.0.0.1) 4.56ms
[I 2015-01-09 14:14:37,891 web:1811] 200 GET / (127.0.0.1) 2.56ms
[I 2015-01-09 14:14:37,893 web:1811] 200 GET / (127.0.0.1) 3.69ms
[E 2015-01-09 14:14:38,368 web:1407] Uncaught exception GET / (127.0.0.1)
HTTPServerRequest(protocol='http', host='localhost:9000', method='GET', uri='/', version='HTTP/1.0', remote_ip='127.0.0.1', headers={'Host': 'localhost:9000', 'Accept': '*/*', 'User-Agent': 'ApacheBench/2.3'})
Traceback (most recent call last):
  File "/srv/venv/dev/local/lib/python2.7/site-packages/tornado/web.py", line 1288, in _stack_context_handle_exception
    raise_exc_info((type, value, traceback))
  File "/srv/venv/dev/local/lib/python2.7/site-packages/tornado/stack_context.py", line 314, in wrapped
    ret = fn(*args, **kwargs)
  File "/srv/venv/dev/local/lib/python2.7/site-packages/tornado/gen.py", line 708, in <lambda>
    self.future, lambda f: self.run())
  File "/srv/venv/dev/local/lib/python2.7/site-packages/tornado/gen.py", line 657, in run
    self.result_future.set_exc_info(sys.exc_info())
  File "/srv/venv/dev/local/lib/python2.7/site-packages/tornado/concurrent.py", line 167, in set_exc_info
    self.set_exception(exc_info[1])
  File "/srv/venv/dev/local/lib/python2.7/site-packages/tornado/concurrent.py", line 150, in set_exception
    self._set_done()
  File "/srv/venv/dev/local/lib/python2.7/site-packages/tornado/concurrent.py", line 177, in _set_done
    cb(self)
  File "/srv/venv/dev/local/lib/python2.7/site-packages/tornado/gen.py", line 108, in final_callback
    if future.result() is not None:
  File "/srv/venv/dev/local/lib/python2.7/site-packages/tornado/concurrent.py", line 109, in result
    raise_exc_info(self._exc_info)
  File "/srv/venv/dev/local/lib/python2.7/site-packages/tornado/gen.py", line 631, in run
    yielded = self.gen.throw(*sys.exc_info())
  File "redis_test.py", line 22, in get
    yield Task(rds.get, 'foo')
  File "/srv/venv/dev/local/lib/python2.7/site-packages/tornado/gen.py", line 628, in run
    value = future.result()
  File "/srv/venv/dev/local/lib/python2.7/site-packages/tornado/concurrent.py", line 109, in result
    raise_exc_info(self._exc_info)
  File "/srv/venv/dev/local/lib/python2.7/site-packages/tornado/stack_context.py", line 314, in wrapped
    ret = fn(*args, **kwargs)
  File "/srv/venv/dev/local/lib/python2.7/site-packages/tornado/gen.py", line 708, in <lambda>
    self.future, lambda f: self.run())
  File "/srv/venv/dev/local/lib/python2.7/site-packages/tornado/gen.py", line 657, in run
    self.result_future.set_exc_info(sys.exc_info())
  File "/srv/venv/dev/local/lib/python2.7/site-packages/tornado/concurrent.py", line 167, in set_exc_info
    self.set_exception(exc_info[1])
  File "/srv/venv/dev/local/lib/python2.7/site-packages/tornado/concurrent.py", line 150, in set_exception
    self._set_done()
  File "/srv/venv/dev/local/lib/python2.7/site-packages/tornado/concurrent.py", line 177, in _set_done
    cb(self)
  File "/srv/venv/dev/local/lib/python2.7/site-packages/tornado/gen.py", line 108, in final_callback
    if future.result() is not None:
  File "/srv/venv/dev/local/lib/python2.7/site-packages/tornado/concurrent.py", line 109, in result
    raise_exc_info(self._exc_info)
  File "/srv/venv/dev/local/lib/python2.7/site-packages/tornado/gen.py", line 633, in run
    yielded = self.gen.send(value)
  File "/srv/venv/dev/local/lib/python2.7/site-packages/tornadoredis/client.py", line 441, in execute_command
    resp = self.process_data(data, cmd_line)
  File "/srv/venv/dev/local/lib/python2.7/site-packages/tornadoredis/client.py", line 490, in process_data
    cmd_line)
ResponseError: ResponseError (on GET [('foo',), {}]): Unknown response type b
[E 2015-01-09 14:14:38,377 web:1811] 500 GET / (127.0.0.1) 10.02ms
[I 2015-01-09 14:14:38,378 web:1811] 200 GET / (127.0.0.1) 11.67ms

Environment: tornado==4.0.2 tornado-redis==2.4.18

I think the problem is, when parsing the redis response, especially Bulk Strings, execute_command splits the reading operation into two separate tasks, one for the response type and length (eg. "$5\r\n"), and a second task to get the rest data. But tornado's IOLoop may take task-switch between two reading tasks, under this condition the shared connection may be read twice both for the response type, and the second task may consume wrong response type.

Generally, a get command will execute like this:

write command (W) -> get response type and length -> (RC) -> get rest response data (RD)

in concurrent condition, 2 concurrent requests for example, a connection may be accessed in this order

W1 -> W2 -> RC1 -> RC2 (crash, bad response type) ...

ydaniv commented 9 years ago

I'm getting similar errors using tornado==4.0.2 and tornado-redis==2.4.18 and all the way to 2.4.10.

Traceback (most recent call last):
      File "/home/hamus/envs/game-server/lib/python3.4/site-packages/tornado/web.py", line 1334, in _execute
        result = yield result
      File "/home/hamus/envs/game-server/lib/python3.4/site-packages/tornado/gen.py", line 628, in run
        value = future.result()
      File "/home/hamus/envs/game-server/lib/python3.4/site-packages/tornado/concurrent.py", line 109, in result
        raise_exc_info(self._exc_info)
      File "<string>", line 3, in raise_exc_info
      File "/home/hamus/envs/game-server/lib/python3.4/site-packages/tornado/gen.py", line 631, in run
        yielded = self.gen.throw(*sys.exc_info())
      File "/home/hamus/dev/game-server/handlers/v1/games.py", line 252, in get
        pool = yield Task(self.redis_client.hgetall, "pool:{0}".format(game['id']))
      File "/home/hamus/envs/game-server/lib/python3.4/site-packages/tornado/gen.py", line 628, in run
        value = future.result()
      File "/home/hamus/envs/game-server/lib/python3.4/site-packages/tornado/concurrent.py", line 109, in result
        raise_exc_info(self._exc_info)
      File "<string>", line 3, in raise_exc_info
      File "/home/hamus/envs/game-server/lib/python3.4/site-packages/tornado/stack_context.py", line 314, in wrapped
        ret = fn(*args, **kwargs)
      File "/home/hamus/envs/game-server/lib/python3.4/site-packages/tornado/gen.py", line 708, in <lambda>
        self.future, lambda f: self.run())
      File "/home/hamus/envs/game-server/lib/python3.4/site-packages/tornado/gen.py", line 657, in run
        self.result_future.set_exc_info(sys.exc_info())
      File "/home/hamus/envs/game-server/lib/python3.4/site-packages/tornado/concurrent.py", line 167, in set_exc_info
        self.set_exception(exc_info[1])
      File "/home/hamus/envs/game-server/lib/python3.4/site-packages/tornado/concurrent.py", line 150, in set_exception
        self._set_done()
      File "/home/hamus/envs/game-server/lib/python3.4/site-packages/tornado/concurrent.py", line 177, in _set_done
        cb(self)
      File "/home/hamus/envs/game-server/lib/python3.4/site-packages/tornado/gen.py", line 108, in final_callback
        if future.result() is not None:
      File "/home/hamus/envs/game-server/lib/python3.4/site-packages/tornado/concurrent.py", line 109, in result
        raise_exc_info(self._exc_info)
      File "<string>", line 3, in raise_exc_info
      File "/home/hamus/envs/game-server/lib/python3.4/site-packages/tornado/gen.py", line 631, in run
        yielded = self.gen.throw(*sys.exc_info())
      File "/home/hamus/envs/game-server/lib/python3.4/site-packages/tornadoredis/client.py", line 425, in execute_command
        resp = yield gen.Task(resp)
      File "/home/hamus/envs/game-server/lib/python3.4/site-packages/tornado/gen.py", line 628, in run
        value = future.result()
      File "/home/hamus/envs/game-server/lib/python3.4/site-packages/tornado/concurrent.py", line 109, in result
        raise_exc_info(self._exc_info)
      File "<string>", line 3, in raise_exc_info
      File "/home/hamus/envs/game-server/lib/python3.4/site-packages/tornado/stack_context.py", line 314, in wrapped
        ret = fn(*args, **kwargs)
      File "/home/hamus/envs/game-server/lib/python3.4/site-packages/tornado/gen.py", line 708, in <lambda>
        self.future, lambda f: self.run())
      File "/home/hamus/envs/game-server/lib/python3.4/site-packages/tornado/gen.py", line 657, in run
        self.result_future.set_exc_info(sys.exc_info())
      File "/home/hamus/envs/game-server/lib/python3.4/site-packages/tornado/concurrent.py", line 167, in set_exc_info
        self.set_exception(exc_info[1])
      File "/home/hamus/envs/game-server/lib/python3.4/site-packages/tornado/concurrent.py", line 150, in set_exception
        self._set_done()
      File "/home/hamus/envs/game-server/lib/python3.4/site-packages/tornado/concurrent.py", line 177, in _set_done
        cb(self)
      File "/home/hamus/envs/game-server/lib/python3.4/site-packages/tornado/gen.py", line 108, in final_callback
        if future.result() is not None:
      File "/home/hamus/envs/game-server/lib/python3.4/site-packages/tornado/concurrent.py", line 109, in result
        raise_exc_info(self._exc_info)
      File "<string>", line 3, in raise_exc_info
      File "/home/hamus/envs/game-server/lib/python3.4/site-packages/tornado/gen.py", line 633, in run
        yielded = self.gen.send(value)
      File "/home/hamus/envs/game-server/lib/python3.4/site-packages/tornadoredis/client.py", line 485, in consume_multibulk
        token = self.process_data(data, cmd_line)
      File "/home/hamus/envs/game-server/lib/python3.4/site-packages/tornadoredis/client.py", line 472, in process_data
        cmd_line)
    tornadoredis.exceptions.ResponseError: ResponseError (on HGETALL [('pool:cc5000b6-d4ec-44b7-8c87-86942c1fd82a',), {}]): Unknown response type s

It seems that once I try to send more than 1 concurrent request from a client to tornado it somehow messes up the things and tornaod doesn't seem to be able to query that data properly from that point on.

Sending a single request to tornado works fine.

desertkun commented 9 years ago

Having the same as well. tornado-redis (2.4.18) / tornado (4.2.1)

gitchs commented 8 years ago

workaround soluction: https://github.com/leporo/tornado-redis#connection-pool-support

zjjott commented 8 years ago

@gitchs I used connection-poll but have the same question. It look like tornado-redis always read begin/end char like $20 or p0