google / transitfeed

A Python library for reading, validating, and writing transit schedule information in the GTFS format.
https://github.com/google/transitfeed/wiki
Apache License 2.0

Memory Leak in transitfeed Python module #354

Open bdferris opened 9 years ago

bdferris commented 9 years ago

From liquidsu...@gmail.com on November 30, 2012 09:48:57

The following code leaks memory in the transitfeed library when it loads multiple GTFS feeds in a loop (200+ in my case, but the growth is noticeable after just a few). A full script that reproduces the behavior is attached.

  for f in os.listdir("./gtfs/"):
      loader = transitfeed.Loader(feed_path="./gtfs/"+f, memory_db=False)
      schedule = loader.Load()
      #del schedule
      #del loader

The following things have been attempted unsuccessfully to eliminate the leak:

  1. Explicitly "del"-ing the objects (uncomment above code).
  2. Creating the loader object with "memory_db=False"
  3. Using the "map" function instead of a loop.
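
One way to check whether attempt 1 really frees the schedule is to hold only a weak reference to it and see whether that reference dies after a forced collection. The sketch below is a generic diagnostic, not part of transitfeed; the helper name `is_released` is hypothetical. On Python versions before 3.4, an object that defines `__del__` and takes part in a reference cycle is never collected, so a `False` result here would be consistent with that explanation, though it would not prove it.

```python
import gc
import weakref


def is_released(factory):
    """Return True if the object built by factory() is freed once dropped.

    `factory` is any zero-argument callable that returns a new object
    supporting weak references (ordinary class instances do).
    """
    obj = factory()
    ref = weakref.ref(obj)  # does not keep the object alive
    del obj                 # drop the only strong reference we hold
    gc.collect()            # give the cycle collector a chance to run
    return ref() is None    # None means the object was actually freed


# Hypothetical usage with the loader call from the report
# (assumes transitfeed is installed and the feed path exists):
#
#   import transitfeed
#   print(is_released(
#       lambda: transitfeed.Loader(feed_path="./gtfs/feed.zip",
#                                  memory_db=False).Load()))
```

If this prints `False`, something still holds the schedule after `del`, and `gc.get_referrers()` or `gc.garbage` may help locate it.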

Since a schedule object creates a sqlite3 database and connection (in memory or in a temp file), my guess is that the connection is not being closed properly. However, the __del__ method for Schedule does seem to attempt to do this...

What version of the product are you using? On what operating system?

- I have tested this with transitfeed 1.2.12 (both from SVN and pip).

Original issue: http://code.google.com/p/googletransitdatafeed/issues/detail?id=354

bdferris commented 9 years ago

From bdfer...@google.com on September 26, 2014 09:51:53

(No comment was entered for this change.)

Labels: Language-Python