UDST / synthpop

Synthetic populations from census data
BSD 3-Clause "New" or "Revised" License

Memory Error - when executing demos/synthesize.py #44

Closed: wniroshan closed this issue 6 years ago

wniroshan commented 6 years ago

Hello,

I am trying to run demos/synthesize.py for one county, but the program fails with a MemoryError. My computer has 8 GB of RAM. Is there a minimum memory requirement?

Below is the stack trace.

synthpop-master/demos$ python synthesize.py "CA" "Santa Clara County"

Synthesizing at geog level: 'block_group' (number of geographies is 1075)
Synthesizing geog id:
state              06
county            085
tract          500100
block group         1
dtype: object
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/frame.py", line 5708, in _reduce
    values = self.values
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/generic.py", line 3811, in values
    return self.as_matrix()
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/generic.py", line 3790, in as_matrix
    self._consolidate_inplace()
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/generic.py", line 3677, in _consolidate_inplace
    self._protect_consolidate(f)
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/generic.py", line 3666, in _protect_consolidate
    result = f()
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/generic.py", line 3675, in f
    self._data = self._data.consolidate()
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/internals.py", line 3826, in consolidate
    bm._consolidate_inplace()
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/internals.py", line 3831, in _consolidate_inplace
    self.blocks = tuple(_consolidate(self.blocks))
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/internals.py", line 4853, in _consolidate
    _can_consolidate=_can_consolidate)
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/internals.py", line 4876, in _merge_blocks
    new_values = new_values[argsort]
MemoryError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "synthesize.py", line 24, in <module>
    households, people, fit_quality = synthesize_all(starter, indexes=indexes)
  File "/usr/local/lib/python3.5/dist-packages/SynthPop-0.1.dev0-py3.5.egg/synthpop/synthesizer.py", line 142, in synthesize_all
    hh_index_start=hh_index_start)
  File "/usr/local/lib/python3.5/dist-packages/SynthPop-0.1.dev0-py3.5.egg/synthpop/synthesizer.py", line 64, in synthesize
    h_jd.cat_id)
  File "/usr/local/lib/python3.5/dist-packages/SynthPop-0.1.dev0-py3.5.egg/synthpop/categorizer.py", line 116, in frequency_tables
    household_cat_ids)
  File "/usr/local/lib/python3.5/dist-packages/SynthPop-0.1.dev0-py3.5.egg/synthpop/categorizer.py", line 103, in _frequency_table
    assert df.sum().sum() == len(sample_df)
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/generic.py", line 7295, in stat_func
    numeric_only=numeric_only, min_count=min_count)
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/frame.py", line 5733, in _reduce
    data = self._get_numeric_data()
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/generic.py", line 3745, in _get_numeric_data
    self._data.get_numeric_data()).__finalize__(self)
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/internals.py", line 3587, in get_numeric_data
    self._consolidate_inplace()
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/internals.py", line 3831, in _consolidate_inplace
    self.blocks = tuple(_consolidate(self.blocks))
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/internals.py", line 4853, in _consolidate
    _can_consolidate=_can_consolidate)
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/internals.py", line 4873, in _merge_blocks
    new_values = _vstack([b.values for b in blocks], dtype)
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/internals.py", line 4919, in _vstack
    return np.vstack(to_stack)
  File "/usr/lib/python3/dist-packages/numpy/core/shape_base.py", line 230, in vstack
    return _nx.concatenate([atleast_2d(_m) for _m in tup], 0)
MemoryError

Thanks for the help

cvanoli commented 6 years ago

Hello, could you tell me which Python version you are using? Thanks.

wniroshan commented 6 years ago

Hi, I was able to free up some memory and it's running now. Thanks.

cvanoli commented 6 years ago

Great! We will look into it and try to solve the memory issues.
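
For anyone who hits the same traceback: both MemoryErrors are raised while pandas consolidates the frame's internal blocks during `df.sum().sum()` in `categorizer.py`. Below is a minimal sketch, not a fix from the synthpop maintainers, of two things that may help with diagnosis: measuring how much memory a DataFrame occupies, and totalling its cells one column at a time. The helper names `frame_megabytes` and `columnwise_total` and the `demo` frame are hypothetical, and whether the column-wise sum avoids the large allocation in this particular pandas version is an assumption that has not been checked against synthpop.

```python
import numpy as np
import pandas as pd


def frame_megabytes(df):
    """Return the approximate in-memory size of a DataFrame in MiB."""
    # deep=True also counts the contents of object columns (e.g. strings).
    return df.memory_usage(index=True, deep=True).sum() / 2**20


def columnwise_total(df):
    """Sum every cell one column at a time instead of calling df.sum().sum()."""
    return sum(df[col].sum() for col in df.columns)


# Hypothetical stand-in for a frequency table: 1,000 rows by 50 integer columns.
demo = pd.DataFrame(np.ones((1000, 50), dtype=np.int64))

print("size in MiB:", round(frame_megabytes(demo), 2))
print("total of all cells:", columnwise_total(demo))
```

Printing `frame_megabytes` for the sample and frequency tables just before the failing assertion would show whether the frame itself is close to the machine's 8 GB, or whether the temporary copy made during consolidation is what pushes it over.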