arup-group / elara

Command line utility for processing MATSim events output files.
MIT License

late memory deaths #68

Closed fredshone closed 3 years ago

fredshone commented 3 years ago

The handler finalise methods build pandas (and geopandas) dataframes from some big numpy arrays. At this step there is therefore i) duplication of the same information in the arrays and the tables, and ii) potentially inefficient storage of that information in pandas.

This is sometimes causing late-process memory failures, e.g.:

11-30 12:19 elara.event_handlers INFO Finalising <class 'elara.event_handlers.VolumeCounts'>
Killed

Likely three actions: i) delete obsolete data immediately, ii) force some pandas data types to reduce memory, and iii) consider removing pandas altogether (although it is super convenient).
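
For reference, a minimal sketch of action ii). It is not elara code: the array shape, the link IDs and the `mode` column are made up for illustration, and it assumes the counts are non-negative and small enough to fit in `uint16` (the sketch checks this before downcasting).

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a handler's counts array: links x hours, int64.
rng = np.random.default_rng(0)
n_links, n_hours = 1000, 24
counts = rng.integers(0, 500, size=(n_links, n_hours), dtype=np.int64)
link_ids = [f"link_{i}" for i in range(n_links)]
modes = rng.choice(["car", "bus", "rail"], size=n_links)

# Naive table: int64 counts plus a plain string column.
naive = pd.DataFrame(counts, index=link_ids)
naive["mode"] = modes

# Compact table: downcast only if every value fits, and store the
# repeated mode labels as a categorical instead of strings.
assert counts.min() >= 0 and counts.max() <= np.iinfo(np.uint16).max
compact = pd.DataFrame(counts.astype(np.uint16), index=link_ids)
compact["mode"] = pd.Categorical(modes)

# deep=True includes the memory held by string objects.
naive_bytes = int(naive.memory_usage(deep=True).sum())
compact_bytes = int(compact.memory_usage(deep=True).sum())
print(f"naive: {naive_bytes} bytes, compact: {compact_bytes} bytes")
```

The saving depends entirely on the real value ranges, so the range check matters: downcasting without it would silently wrap large counts.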

fredshone commented 3 years ago

Some effort was made to force deletions; however, this hasn't helped (Python was garbage collecting fine on its own).
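
A small sketch of why explicit deletion may add little, assuming CPython (which frees an object as soon as its last reference goes away). The `finalise` function and array here are hypothetical, not the elara handlers.

```python
import gc
import weakref

import numpy as np


def finalise(counts):
    # Hypothetical finalise step: returns a new array, not a view of counts.
    return counts.sum(axis=1)


counts = np.zeros((1000, 24), dtype=np.int64)
ref = weakref.ref(counts)  # lets us observe when the array is freed

totals = finalise(counts)
del counts  # drops the last reference to the big array

# On CPython the array is already gone here, before any gc.collect() call.
released_before_gc = ref() is None
gc.collect()
print("released before gc.collect():", released_before_gc)
```

If the array is still referenced elsewhere (for example as a handler attribute, or through a dataframe that shares its buffer), `del` on one name will not free it, and the peak during the array-to-table conversion is unaffected either way.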

Happily using bigger boxes now, so closing.