benoitc / gunicorn

gunicorn 'Green Unicorn' is a WSGI HTTP Server for UNIX, fast clients and sleepy applications.
http://www.gunicorn.org

"This event is already used by another greenlet" with mysql connector #779

Closed by expntly 10 years ago

expntly commented 10 years ago

I use gunicorn 17.5 with worker_class="gevent", workers=3, that I start with gunicorn -c config.py my:app. The load is virtually zero. I'm seeing the following:

2014-06-11 06:16:12 [9939] [ERROR] Error handling request
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/gunicorn/workers/async.py", line 45, in handle
    self.handle_request(listener, req, client, addr)
  File "/usr/lib/python2.6/site-packages/gunicorn/workers/ggevent.py", line 119, in handle_request
    super(GeventWorker, self).handle_request(*args)
  File "/usr/lib/python2.6/site-packages/gunicorn/workers/async.py", line 93, in handle_request
    respiter = self.wsgi(environ, resp.start_response)
  File "/usr/lib/python2.6/site-packages/beaker/middleware.py", line 155, in __call__
    return self.wrap_app(environ, session_start_response)
[...]
  File "/usr/lib/python2.6/site-packages/mysql/connector/connection.py", line 1083, in cursor
    if not self.is_connected():
  File "/usr/lib/python2.6/site-packages/mysql/connector/connection.py", line 696, in is_connected
    self.cmd_ping()
  File "/usr/lib/python2.6/site-packages/mysql/connector/connection.py", line 665, in cmd_ping
    return self._handle_ok(self._send_cmd(ServerCmd.PING))
  File "/usr/lib/python2.6/site-packages/mysql/connector/connection.py", line 378, in _send_cmd
    return self._socket.recv()
  File "/usr/lib/python2.6/site-packages/mysql/connector/network.py", line 170, in recv_plain
    packet = self.sock.recv(1)
  File "/usr/lib64/python2.6/site-packages/gevent/socket.py", line 432, in recv
    wait_read(sock.fileno(), timeout=self.timeout, event=self._read_event)
  File "/usr/lib64/python2.6/site-packages/gevent/socket.py", line 165, in wait_read
    assert event.arg is None, 'This event is already used by another greenlet: %r' % (event.arg, )
AssertionError: This event is already used by another greenlet: (<Greenlet at 0x1f58870: <functools.partial object at 0xd14788>(<socket at 0x1f5c450 fileno=16 sock=10.192.29.17:, ('173.171.26.210', 16162))>, timeout('timed out',))

I'm going to assume the above message means the same socket to the mysql server is being used by more than one greenlet at the same time?

A little bit of searching seems to suggest that I might be using some modules that are not monkey patched before being imported or not greenlet compatible. However the mysql connector is pure python and so that seems to imply it is compatible. And worker_class=gevent seems to call monkey.patch_all() when initialized. To be sure, I added import gevent.monkey; gevent.monkey.patch_all() on the first line of my app, but it doesn't help.

In case it helps: my app opens a bunch of mysql connections when it starts, but I make no guarantee that they are used in a thread-safe way. I don't have a connection pool with acquire/release semantics; the connections are just picked at random by incoming client requests, i.e. different workers. Would it help to have some sort of thread-safe connection pool? If yes, how do I make sure it works nicely with gevent/greenlets?

tilgovi commented 10 years ago

How are you starting gunicorn? What version of gevent are you using?

expntly commented 10 years ago

Start is described at the beginning of the description. Are you looking for anything else in particular?

As for the gevent version, I'll check in a bit.

tilgovi commented 10 years ago

Right. Sorry.

What about the config.py file? Can you share that? Or at least list any imports and whether you're setting the app to preload.

expntly commented 10 years ago

gevent version 0.13.8

config.py is trivial:

bind = "0.0.0.0:8080"
worker_class = "gevent"
workers = 3
loglevel = "debug"
accesslog = "/dev/null"
timeout = 3600

imports:

import gevent.monkey
gevent.monkey.patch_all()

import cgi
import httplib
import os.path
import random
import re
import simplejson as json
import sys
import tempfile
import urllib2

from beaker.middleware import SessionMiddleware
from boto.ec2 import cloudwatch
from Cookie import SimpleCookie
from datetime import datetime as dt
from oauth2client.client import OAuth2WebServerFlow

from common import db   # this module imports mysql.connector

Not sure what you mean by "preload".

The more I think about the problem, the more I'm convinced this is because I'm not using a connection pool?
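
For context on the "preload" question: gunicorn has a `preload_app` setting (also available as the `--preload` command-line flag) that loads the application in the master process before the workers are forked. Below is a minimal sketch of the config above with that setting made explicit, plus a `post_fork` server hook as one possible place to open per-worker resources. The hook body and the `open_db_connections` name are hypothetical and not taken from this thread.

```python
# config.py (sketch): settings from this thread plus an explicit preload_app
bind = "0.0.0.0:8080"
worker_class = "gevent"
workers = 3
timeout = 3600

# False is gunicorn's default: each worker imports the app after the fork,
# so connections opened at import time are not shared between worker processes.
preload_app = False


def post_fork(server, worker):
    # gunicorn server hook, called in each worker just after it is forked.
    # Hypothetical: open per-worker database connections here, e.g.
    # open_db_connections()
    server.log.info("Worker spawned (pid: %s)", worker.pid)
```

With `preload_app = True`, sockets opened at import time would be inherited by every forked worker, which is presumably why the question was asked.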

benoitc commented 10 years ago

@expntly are you reproducing it with latest gunicorn?

expntly commented 10 years ago

I can give it a try. Again though, isn't this supposed to happen when threads share the same socket?

benoitc commented 10 years ago

@expntly are you using the green version of the mysql driver? (Or anything like this with gevent)

expntly commented 10 years ago

I'm not sure how to tell whether I'm using a green version of the mysql driver.

Here's some more info about the mysql driver specifically:

% pip show mysql-connector-python
---
Name: mysql-connector-python
Version: 1.0.9
Location: /usr/lib/python2.6/site-packages
Requires:

% python
Python 2.6.9 (unknown, Mar 28 2014, 00:06:37)
[GCC 4.8.2 20131212 (Red Hat 4.8.2-7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from mysql import connector
>>> help(connector)
Help on package mysql.connector in mysql:

NAME
    mysql.connector - MySQL Connector/Python - MySQL drive written in Python

FILE
    /usr/lib/python2.6/site-packages/mysql/connector/__init__.py

[...]

DATA
    BINARY = <mysql.connector.dbapi._DBAPITypeObject instance>
    DATETIME = <mysql.connector.dbapi._DBAPITypeObject instance>
    NUMBER = <mysql.connector.dbapi._DBAPITypeObject instance>
    ROWID = <mysql.connector.dbapi._DBAPITypeObject instance>
    STRING = <mysql.connector.dbapi._DBAPITypeObject instance>
    __all__ = ['MySQLConnection', 'Connect', 'custom_error_exception', 'Fi...
    apilevel = '2.0'
    paramstyle = 'pyformat'
    threadsafety = 1
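
The `threadsafety = 1` at the end of that output is worth a second look. In the Python DB-API 2.0 specification (PEP 249), level 1 means threads may share the module but not connections. By analogy, that fits the suspicion that a single connection is being used by two greenlets at once. Here is a small sketch of how to read that constant. `describe_threadsafety` is a hypothetical helper, and `sqlite3` is only a stand-in driver so the snippet runs without a MySQL install.

```python
import sqlite3  # stand-in DB-API 2.0 module; mysql.connector exposes the same attribute

# Meaning of the DB-API 2.0 (PEP 249) module-level `threadsafety` constant.
THREADSAFETY_LEVELS = {
    0: "threads may not share the module",
    1: "threads may share the module, but not connections",
    2: "threads may share the module and connections",
    3: "threads may share the module, connections and cursors",
}


def describe_threadsafety(dbapi_module):
    """Return (level, description) for a driver's threadsafety constant."""
    level = dbapi_module.threadsafety
    return level, THREADSAFETY_LEVELS[level]


level, meaning = describe_threadsafety(sqlite3)
print("sqlite3 threadsafety=%d: %s" % (level, meaning))
```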

benoitc commented 10 years ago

I doubt you can use the mysql driver as is. You probably need a gevent-compatible driver, such as ultramysql or something like it ( @tilgovi ? )

expntly commented 10 years ago

How do you tell whether it's compatible or not?

Also, I updated my gunicorn to the latest release and I haven't seen the error again. I wouldn't say it's fixed though, since I don't have a way to reproduce.

expntly commented 10 years ago

I took a look at umysql. This is scary (warnings at compile time, some tests commented out, very little documentation, no data binding, etc.). Anyway, this is a little off topic.

I'll take a look at connection pools, it seems like mysql connector 1.1.1 and up has support. They say it is thread safe. I understand that running gunicorn with worker_class="gevent" means I'm using greenlets and not threads. Any reason I should be careful with using a thread-safe module with code that uses greenlets? I'm thinking greenlets are micro-threads and so it should work, but I have very little experience with greenlets or even threading in Python.

edit: I also just upgraded mysql connector from 1.0.9 to 1.2.2, greenlet from 0.4.1 to 0.4.2, and gevent from 0.13.8 to 1.0.1.

tilgovi commented 10 years ago

Thread safe won't necessarily mean greenlet safe.

Generally, you need a pure python library because if there is C binding code then the library might block in C and block all the greenlets.

It's okay if the library does protocol parsing or serialization in C but not input/output.

If I remember correctly, pymysql works well, but I haven't used mysql in a few years.

expntly commented 10 years ago

Why not [ thread safe => greenlet safe ] ?

How will pymysql (or any greenlet safe mysql module) help when two greenlets use the same socket connection to the db?

tilgovi commented 10 years ago

> Why not [ thread safe => greenlet safe ]?

It's not likely to be unsafe, just not ideal. It might reduce concurrency a lot.

Thread safe means that each call is using local memory or using locking primitives to control concurrent access to shared memory. With more than one thread, the operating system is scheduling preemptively between threads even if they make blocking calls to input or output on a socket or file.

Greenlets are all one thread as far as C code is concerned. Gevent cannot patch the system calls the library might make for input and output. So if one greenlet calls the library, the whole application might block until it is done.

It's not "unsafe" but it's not necessarily greenlet friendly.

> How will pymysql (or any greenlet safe mysql module) help when two greenlets use the same socket connection to the db?

It won't, by itself. You'll need to use a connection pool for that, regardless.

expntly commented 10 years ago

FYI -- confirmed that this is because of the same connection socket to the db being reused by multiple greenlets.

Repro:

import gevent
from gevent import monkey
monkey.patch_all()

from common import db
d = db.DB()
assert d.Connect()   # creates *one* connection handle

a = gevent.spawn(d.Select, "select benchmark(10000000, md5('test'));")
b = gevent.spawn(d.Select, "select 0;")
gevent.joinall([a,b])

% python x.py
Traceback (most recent call last):
  File "/usr/lib64/python2.6/site-packages/gevent/greenlet.py", line 327, in run
    result = self._run(*self.args, **self.kwargs)
[..]
  File "/usr/lib/python2.6/site-packages/mysql/connector/connection.py", line 1327, in cursor
    if not self.is_connected():
  File "/usr/lib/python2.6/site-packages/mysql/connector/connection.py", line 877, in is_connected
    self.cmd_ping()
  File "/usr/lib/python2.6/site-packages/mysql/connector/connection.py", line 831, in cmd_ping
    return self._handle_ok(self._send_cmd(ServerCmd.PING))
  File "/usr/lib/python2.6/site-packages/mysql/connector/connection.py", line 475, in _send_cmd
    return self._socket.recv()
  File "/usr/lib/python2.6/site-packages/mysql/connector/network.py", line 190, in recv_plain
    packet = self.sock.recv(1)
  File "/usr/lib64/python2.6/site-packages/gevent/socket.py", line 392, in recv
    self._wait(self._read_event)
  File "/usr/lib64/python2.6/site-packages/gevent/socket.py", line 292, in _wait
    assert watcher.callback is None, 'This socket is already used by another greenlet: %r' % (watcher.callback, )
AssertionError: This socket is already used by another greenlet: <bound method Waiter.switch of <gevent.hub.Waiter object at 0x22a6fa0>>
<Greenlet at 0x2235eb0: <bound method DB.Select of <src.common.db.DB object at 0x7fed379ebe50>>('select 0;')> failed with AssertionError

I'm pretty clear now on what's happening, I can fix it with a simple connection pool as suggested. Thanks for all your help @tilgovi and @benoitc !
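
A simple pool along those lines might look like the sketch below. It is not the code that was actually used here. It uses the standard-library queue and assumes `gevent.monkey.patch_all()` has already run, so that a blocked `get()` yields to other greenlets; `gevent.queue.Queue` offers the same `put`/`get` interface if you prefer to be explicit. `ConnectionPool` and its `connect` argument are hypothetical names.

```python
import contextlib
import queue  # named Queue on Python 2


class ConnectionPool(object):
    """Fixed-size pool: each connection is held by one caller at a time."""

    def __init__(self, connect, size):
        # `connect` is any zero-argument callable returning a new connection,
        # e.g. lambda: mysql.connector.connect(**settings) (assumed, not run here).
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextlib.contextmanager
    def connection(self, timeout=None):
        # Blocks until a connection is free; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always hand the connection back, even if the query raised.
            self._idle.put(conn)


# Usage with a stand-in connection factory so the sketch runs without MySQL.
pool = ConnectionPool(connect=object, size=2)
with pool.connection() as conn:
    pass  # e.g. cursor = conn.cursor(); cursor.execute("select 0;")
```

Because a connection is checked out for the duration of the `with` block, two greenlets can no longer wait on the same socket. The sketch does not detect or replace connections that have gone stale; a real pool would need to handle that.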