sqlalchemy / sqlalchemy

The Database Toolkit for Python
https://www.sqlalchemy.org
MIT License
8.9k stars 1.34k forks source link

Memory leak on querying orm mapped objects #1308

Closed sqlalchemy-bot closed 15 years ago

sqlalchemy-bot commented 15 years ago

Migrated issue, originally created by Anonymous

I have noticed a memory leak in sqlalchemy queries. Consider the following code:

# PREPERATIONS
import sqlalchemy
print sqlalchemy.__version__ (prints 0.5.2)

from sqlalchemy import *
from sqlalchemy.orm import *
engine = create_engine(...) (Oracle engine)
metadata = Metadata()
metadata.bind = engine
Session = scoped_session(sessionmaker(bind = engine, autoflush = True, autocommit = False))

# TABLES
runnables_table = Table("runnables", metadata, autoload = True, schema = ...)

# POJOS
class Runnable(object):
 pass

# MAPPERS
mapper(Runnable, runnables_table)

def select_a_lot(count, session):
 for i in xrange(count):
  session.query(Runnable).first()

# MAIN

s = Session()
select_a_lot(1000, s)

# MAKE SURE EVERYTHING IS COLLECTED
import gc
print gc.collect() (prints 675)
print gc.collect() (prints 0)
print gc.collect() (prints 0)
print gc.collect() (prints 0)

list_of_runnables = [for o in gc.get_objects() if type(o) == Runnable](o)
print len(list_of_runnables) (prints 0, as it should, because of the weakref change in behaviour in 0.4.x)

# BUG?
list_of_selects = [for o in gc.get_objects() if type(o) == sqlalchemy.sql.expression.Select](o) (prints 2000!!!!!!)

Why aren't all the Select objects collected? I don't need them, they are irrelevant. And besides, they were generated out of my scope.

I also tried clearing the session, and closing the session, but both didn't fix my problem. I did collect (after setting the earlier list to None), and it still didn't collect those objects.

I think it's a real, problematic memory leak, and I'm suffering from it (in my real program) because I'm selecting a lot. Please help me.

sqlalchemy-bot commented 15 years ago

Anonymous wrote:

BTW, I'm quite sure that there are a lot of objects that stays in the memory that are connected somehow to the query - but I focused on "Select" objects, because it's easier to see it in those.

For example, I did a lot of selects in my real program, and saw that the "Column" objects count rise. But I believe it's caused by the same memory leak.

sqlalchemy-bot commented 15 years ago

Michael Bayer (@zzzeek) wrote:

Suspecting a memory leak is not something we take casually. But you'll have to provide a actual steps to reproduce - as it stands I've no idea what code you've actually tested with since the test case you've posted doesn't even run (you've misspelled "MetaData" for example), nor could it runnable by anyone other than yourself since you're autoloading a table that only you have access to. We in fact have test cases for memory leaks in our unit test suite.

Here's my fix of your test case, no memory leak:

import sqlalchemy
print sqlalchemy.__version__

from sqlalchemy import *
from sqlalchemy.orm import *
engine = create_engine('sqlite://')
metadata = MetaData()
metadata.bind = engine

Session = scoped_session(sessionmaker(bind = engine, autoflush = True,
autocommit = False))

# TABLES
runnables_table = Table("runnables", metadata, 
    Column('id', Integer, primary_key=True),
    Column('name', String(50))
)

metadata.create_all()
engine.execute(runnables_table.insert(), [% x} for x in xrange(1000)]({'name':'%d'))

# POJOS
class Runnable(object):
    pass

# MAPPERS
mapper(Runnable, runnables_table)

def select_a_lot(count, session):
    for i in xrange(count):
        session.query(Runnable).first()

# MAIN

s = Session()
select_a_lot(1000, s)

# MAKE SURE EVERYTHING IS COLLECTED
import gc
print gc.collect()
print gc.collect()
print gc.collect()
print gc.collect()

list_of_runnables = [for o in gc.get_objects() if type(o) == Runnable](o)
print len(list_of_runnables)

# BUG?
list_of_selects = [for o in gc.get_objects() if type(o) ==sqlalchemy.sql.expression.Select](o) 
print len(list_of_selects)

output:

z-eeks-Computer-3:sqlalchemy classic$ python test.py
0.5.3
656
0
0
0
0
0
sqlalchemy-bot commented 15 years ago

Michael Bayer (@zzzeek) wrote:

verified no leak with the given script on oracle as well, 0.5.2 as well as trunk. have you located your issue ?

sqlalchemy-bot commented 15 years ago

Anonymous wrote:

Two weeks ago I found out that this is a memory leak bug in python2.4.

It only appears in linux when using sqlalchemy and cx_Oracle (with no relation to the sqlalchemy version or the cx_Oracle version).

Also, the memory leak problem was fixed between 2.4.1 and 2.4.2. This is not relevant, I suppose - supporting python2.4.1 or less.

sqlalchemy-bot commented 15 years ago

Anonymous wrote:

kobipe3@gmail.com

sqlalchemy-bot commented 15 years ago

Changes by Michael Bayer (@zzzeek):