The following code results in a memory leak in the transitfeed library when processing multiple GTFS feeds (200+, but it's noticeable after just a few) in a loop. A full script to exhibit the behavior is attached.
    import os
    import transitfeed

    for f in os.listdir("./gtfs/"):
        loader = transitfeed.Loader(feed_path="./gtfs/" + f, memory_db=False)
        schedule = loader.Load()
        # del schedule
        # del loader
The following things have been attempted unsuccessfully to eliminate the leak:
Explicitly "del"-ing the objects (uncommenting the two "del" lines in the code above).
Creating the loader object with "memory_db=False".
Using the "map" function instead of a loop.
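To put numbers on the growth, something like the following could be wrapped around the loop. This is only a minimal sketch: measure_growth and process_one are hypothetical names of my own, not part of transitfeed, and the resource module is Unix-only. For this report, process_one would build a transitfeed.Loader for one feed path and call Load().

```python
import gc
import resource


def peak_rss_kb():
    """Return this process's peak resident set size (kilobytes on Linux)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def measure_growth(process_one, items):
    """Call process_one(item) for each item and record peak RSS after each call.

    process_one is a hypothetical callback supplied by the caller.
    """
    readings = []
    for item in items:
        process_one(item)
        gc.collect()  # rule out garbage that is merely awaiting collection
        readings.append(peak_rss_kb())
    return readings
```

Peak RSS is a high-water mark, so the readings never decrease. A plateau after the largest feed is consistent with no leak, while readings that keep climbing across feeds of similar size point to memory that is not being released.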
Since a schedule object creates a sqlite3 database and connection (in memory or in a tempfile), my guess is that the connection is not being closed properly. However, the __del__ method of Schedule does seem to attempt to do this...

What version of the product are you using? On what operating system?

I have tested this with transitfeed 1.2.12 (installed both from SVN and via pip). I am using Ubuntu 12.10 64-bit with Python 2.7.3 and libsqlite3 3.7.13.
From liquidsu...@gmail.com on November 30, 2012 09:48:57
Original issue: http://code.google.com/p/googletransitdatafeed/issues/detail?id=354