FSX / momoko

Wraps (asynchronous) Psycopg2 for Tornado.
http://momoko.61924.nl/
Other
363 stars 73 forks source link

Bug fix for large queries #29

Closed stevearc closed 11 years ago

stevearc commented 12 years ago

Commit comments say it all. There's two commits here, one to fix the broken behavior on large queries, and one to handle database connections going down temporarily. If you don't like the implementation feel free to decline the pull request and make the changes however you see fit, but I thought I'd pass on this fix in the hope of preventing other people from running into the same issue.

FSX commented 12 years ago

I have something similar implemented in the rewrite in connection.py#L248 and for reconnectiong at connection.py#L171.

What happens when the connection is closed, is the callback still executed?

stevearc commented 12 years ago

Cool! Wish I'd seen that earlier. In my hackish solution for the connection issue, the callbacks are not called. So if we get a tornado request while the db is down it just hangs and timeouts. Which is...acceptable for now, but I'm looking forward to the rewrite!

Looking at the rewrite, I'm not certain that the issue with the large query hang has been fixed. The original problem was that _io_callback was never being called by tornado. I would attempt an execute, which would call AsyncConnection.cursor() and that would do cursor.execute() as well as call ioloop.update_handler(fd, IOLoop.READ). And in the case of a large query, psycopg2 sets its internal async_status as WRITE to indicate that it requires additional calls to poll() to flush the query to the database. So after the call to cursor(), _io_callback is never called, which means cursor.poll() is never called, and the query is never flushed to the database.

With the code in the rewrite, it seems like inside of _do_op you're calling execute and then calling set_callbacks, which sets the update_handler to IOLoop.READ just like before.

Again, my knowledge of how epoll works is pretty shallow, so I could be totally wrong about how this is working. Note that if you want to test this you will need queries in excess of 60000 characters AND you will need the database to be on a different machine. I was never able to reproduce it locally.

FSX commented 12 years ago

60000 characters? Another machine is not a problem for me, but I can't imagina a query of 60000 characters. What kind of queries are you running?

stevearc commented 12 years ago

Yeah...we're uh...we're inserting a lot of json into the db. I have a simple repro of the bug. Create a basic table in Postgresql with a text field.

CREATE TABLE test_table (my_text TEXT);

Then

from tornado.ioloop import IOLoop
from tornado import gen
import momoko

QUERY_SIZE = 60000

@gen.engine
def do_query():
    rwdb = momoko.AsyncClient({
        "host": "my-host",
        "port": 5432,
        "user": "my-user",
        "password": "my-pass",
        "database": "my-db",
        "min_conn": 1,
        "max_conn": 1,
        "cleanup_timeout": 10
    })
    rwdb.execute("INSERT INTO test_table (my_text) VALUES (%s)", ("A" * QUERY_SIZE,),\
                callback=(yield gen.Callback('test_insert')))
    cur = yield gen.Wait('test_insert')
    print cur.statusmessage
    IOLoop.instance().stop()

def main():
    ioloop = IOLoop.instance()
    ioloop.add_callback(do_query)
    ioloop.start()

if __name__ == "__main__":
    main()

That will hang forever. You can tweak the QUERY_SIZE to see that it functions normally at smaller values.

FSX commented 12 years ago

Thanks! I'll test it with the current Momoko and with the rewrite.

I'll leave the pull request open until I've confirmed it.

FSX commented 11 years ago

Sorry for the delay. I fixed it in ee354c032954598aa8acda9232faa6b79e1ea2d9, but only in the rewrite. I', focusing on that, because I need to use it in an internship project in some weeks.

I did not use a database on an other machine, but it kept hanging like you described when the query exceeded 60000 characters. Can you confirm it works? If you want to try out the rewrite of course. :)

stevearc commented 11 years ago

Tested the rewrite, and it looks like it's working!