niwinz / djorm-ext-core

Core module of django orm extensions package. (Collection of third party plugins build in one unified package)
Other
27 stars 8 forks source link

Server side cursors should be used only for SELECT queries #5

Closed scibi closed 10 years ago

scibi commented 10 years ago

Hi!

Right now within with server_side_cursors() block all queries are made with server side cursors. Unfortunately they are valid only for SELECTs (see PostgreSQL documentation).

So something like this doesn't work:

from djorm_core.postgresql import server_side_cursors

with server_side_cursors():
    for item in Model.objects.all():
        print item.value

        m=AnotherModel(sth=item.x)
        m.save()

It will produce queries like this:

DECLARE "curc51db20dcc4e4816abcbc8ad143a4d10" CURSOR WITHOUT HOLD FOR SELECT [..];
DECLARE "cur1c304e46a434453f9be72cf259403dab" CURSOR WITHOUT HOLD FOR INSERT INTO [..];

which will not work - PostgreSQL will report syntax error.

I've fixed it by saving original client side cursor and choosing correct cursor to use based on query. I also added withhold parameter which allows me to make commits inside for loop.

Notes:

scibi commented 10 years ago

This solution works but unfortunately it doesn't solve all issues.

Right now when you have code like this:

from djorm_core.postgresql import server_side_cursors

with server_side_cursors():
    for item in Model.objects.iterator():
        print item.value

        if something:
            m = AnotherModel.objects.get(...)

Server side cursor is created for every get call. As Django doesn't close cursors when they are no longer needed you have a resource leak which causes PostgreSQL to run out of memory. With pull request #6 you can simply specify once = True and it will work fine. Doing so will also address the issue of using server side cursor for data modification queries inside the loops.

niwinz commented 10 years ago

Hi! Thanks for this work in any case.

Server side cursors should be used in transaction, without transaction block, as you said preoviously causes postgresql run out of memory.

This decorator is an "experiment" or "hack" for process large amount of data on manage commands for... do this with large amount of data. Is not tested for active use in views and similar.

SqlAlchemy has better approach for using server side cursors.

scibi commented 10 years ago

This is exactly my use case. I process really big table in management command and do some actions for some rows fetched - create new row in another table or update it.

You can use server side cursors outside transactions but not when you create huge amounts of them ;)

There is a Django bug #16614 I'm going to work on when/if I have some free time. It would be good to have this in core code instead of hack like this.

niwinz commented 10 years ago

Totally right! It would be awesome having this in core.