Closed: chrisnatali closed this issue 9 years ago
If this is a priority, we can probably reduce memory consumption by using a kdtree for nearest neighbor lookup rather than a distance matrix.
fixed in release 0.0.3
Still an issue with large input networks. Need to reproduce. May want to re-implement nearest-neighbor lookup via kdtree as suggested above.
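A minimal sketch of what the kdtree approach could look like, assuming the inputs are (lon, lat) pairs in degrees and that SciPy's `cKDTree` is available. The function names `to_unit_sphere` and `nearest_neighbors` and the Earth-radius constant are hypothetical and are not taken from the sequencer code. Points are mapped onto the unit sphere so that Euclidean (chord) distance ranks neighbors the same way great-circle distance does, and memory stays roughly linear in the number of nodes instead of quadratic:

```python
import numpy as np
from scipy.spatial import cKDTree

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def to_unit_sphere(lon_lat_deg):
    """Map (lon, lat) pairs in degrees to 3D points on the unit sphere."""
    lon, lat = np.radians(np.asarray(lon_lat_deg, dtype=float)).T
    return np.column_stack((np.cos(lat) * np.cos(lon),
                            np.cos(lat) * np.sin(lon),
                            np.sin(lat)))


def nearest_neighbors(lon_lat_deg):
    """Return (index, distance_km) of each point's nearest other point.

    No n x n distance matrix is ever built.
    """
    xyz = to_unit_sphere(lon_lat_deg)
    tree = cKDTree(xyz)
    # k=2 because the closest hit for each query point is the point itself
    # (this assumes there are no duplicate coordinates in the input).
    chord, idx = tree.query(xyz, k=2)
    chord = np.clip(chord[:, 1], 0.0, 2.0)
    # Convert chord length on the unit sphere to great-circle distance.
    dist_km = 2.0 * EARTH_RADIUS_KM * np.arcsin(chord / 2.0)
    return idx[:, 1], dist_km


if __name__ == "__main__":
    coords = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 10.5)]
    nearest, km = nearest_neighbors(coords)
    print(nearest, km)
```

Whether this fits depends on how the distance matrix is used downstream: if only nearest-neighbor queries are needed, the tree can replace it outright; if arbitrary pairwise distances are needed, those would have to be computed on demand.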
Traceback from run on modelrunner:
2015-08-04 12:47:32,737 : Sequencer [INFO] : Traversing The Input Network and Computing Decision Frontier
Traceback (most recent call last):
File "/home/mr/modelrunner/models/mvmax_sequencer.py", line 27, in
The culprit appears to be here:
Expanding the adjacency matrix from sparse to a dense numpy ndarray consumes too much memory. This code path is hit quite often, so the conversion likely also hurts performance.
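As an illustration of avoiding the dense expansion, here is a sketch that walks a network directly on a SciPy CSR adjacency matrix. It assumes the adjacency is (or can be converted to) `scipy.sparse.csr_matrix`; the helpers `neighbors` and `bfs_order` are hypothetical names, not the sequencer's actual traversal code:

```python
from collections import deque

from scipy.sparse import csr_matrix


def neighbors(adj, node):
    """Column indices of the nonzero entries in row `node` of a CSR matrix."""
    start, end = adj.indptr[node], adj.indptr[node + 1]
    return adj.indices[start:end]


def bfs_order(adj, root):
    """Breadth-first visit order from `root`, never calling .toarray()."""
    adj = csr_matrix(adj)  # effectively a no-op if already CSR
    seen = {root}
    order = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in neighbors(adj, node):
            nxt = int(nxt)
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


if __name__ == "__main__":
    # Small directed network: 0 -> 1, 0 -> 2, 1 -> 3
    rows, cols, data = [0, 0, 1], [1, 2, 3], [1, 1, 1]
    adj = csr_matrix((data, (rows, cols)), shape=(4, 4))
    print(bfs_order(adj, 0))
```

Row lookups through `indptr` and `indices` touch only the stored edges, so memory scales with the number of edges rather than with the square of the number of nodes.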
fixed in v0.0.5
Copying from https://github.com/SEL-Columbia/modelrunner/issues/23
From @Naigege
When the kedco-all zip file is run on model runner, it fails with the memory error shown below. The uploaded zip file is 10 MB.
reading input from /home/mr/model_runner/worker_data/011a8c4b-2f0a-439d-bffb-1faa030b5b60/input
discarding /home/mr/miniconda/envs/model_runner/bin from PATH
prepending /home/mr/miniconda/envs/sequencer/bin to PATH
2015-04-30 20:40:38,011 : NetworkPlan [INFO] : Asserting Input Projections Match
2015-04-30 20:40:41,108 : NetworkPlan [INFO] : Aligning Network Nodes With Input Metrics
/home/mr/miniconda/envs/sequencer/lib/python2.7/site-packages/sequencer/Utils.py:61: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  fake_nodes['m_coords'] = fake_nodes['m_coords'].apply(lambda x: ())
2015-04-30 20:41:02,613 : NetworkPlan [INFO] : Computing Pairwise Distances
2015-04-30 20:41:02,613 : NetworkPlan [INFO] : Using haversine Distance
Traceback (most recent call last):
  File "/home/mr/model_runner/scripts/mvmax_sequencer.py", line 19, in <module>
    nwp = NetworkPlan(shp_file, csv_file, prioritize='Population')
  File "/home/mr/miniconda/envs/sequencer/lib/python2.7/site-packages/sequencer/NetworkPlan.py", line 58, in __init__
    self.distance_matrix = self._distance_matrix()
  File "/home/mr/miniconda/envs/sequencer/lib/python2.7/site-packages/sequencer/NetworkPlan.py", line 97, in _distance_matrix
    return np.vstack(map(haversine, coords))
  File "/home/mr/miniconda/envs/sequencer/lib/python2.7/site-packages/numpy/core/shape_base.py", line 228, in vstack
    return _nx.concatenate([atleast_2d(_m) for _m in tup], 0)
MemoryError