zhafen / linefinder

A tool for finding and classifying the worldlines of Lagrangian parcels of mass, in the context of hydrodynamic simulations of galaxy formation.
https://zhafen.github.io/linefinder
MIT License

Unexpected Memory Usage #1

Closed zhafen closed 6 years ago

zhafen commented 7 years ago

Originally reported by Zachary Hafen (Bitbucket: zhafen, GitHub: zhafen)


Running galaxy finding uses a large amount of memory, sometimes enough that a single instance uses more than 11 GB. The failure usually happens at the step shown in the stack trace below.

Recent versions use more memory than earlier versions, which used <10 GB.

Traceback (most recent call last):
  File "./run_worldline.py", line 113, in <module>
    particle_track_gal_finder.find_galaxies_for_particle_tracks()
  File "/home1/03057/zhafen/repos/worldline/worldline/galaxy_find.py", line 100, in find_galaxies_for_particle_tracks
    galaxy_and_halo_ids = galaxy_finder.find_ids()
  File "/home1/03057/zhafen/repos/worldline/worldline/galaxy_find.py", line 287, in find_ids
    galaxy_and_halo_ids['host_gal_id'] = self.find_host_id( self.kwargs['galaxy_cut'] )
  File "/home1/03057/zhafen/repos/worldline/worldline/galaxy_find.py", line 314, in find_host_id
    halo_id = self.find_halo_id( radial_cut_fraction )
  File "/home1/03057/zhafen/repos/worldline/worldline/galaxy_find.py", line 389, in find_halo_id
    halo_id = arg_extremum_fn( tiled_m_vir_ma, axis=1 )
  File "/home1/03057/zhafen/.local/lib/python2.7/site-packages/numpy/core/fromnumeric.py", line 1019, in argmin
    return _wrapfunc(a, 'argmin', axis=axis, out=out)
  File "/home1/03057/zhafen/.local/lib/python2.7/site-packages/numpy/core/fromnumeric.py", line 57, in _wrapfunc
    return getattr(obj, method)(*args, **kwds)
  File "/home1/03057/zhafen/.local/lib/python2.7/site-packages/numpy/ma/core.py", line 5322, in argmin
    d = self.filled(fill_value).view(ndarray)
  File "/home1/03057/zhafen/.local/lib/python2.7/site-packages/numpy/ma/core.py", line 3698, in filled
    result = self._data.copy('K')
MemoryError
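
The traceback ends inside `numpy.ma`'s `argmin`, which calls `self.filled(fill_value)` and therefore copies the array's data before reducing it. For a large tiled array, that temporary copy comes on top of the array itself. Below is a minimal sketch, assuming the array is 2D and reduced along `axis=1` as in the traceback, of reducing blocks of rows so that only a block-sized copy exists at any time. `chunked_masked_argmin` and `chunk_size` are hypothetical names, not part of the code shown in the traceback.

```python
import numpy as np

def chunked_masked_argmin(masked_arr, chunk_size=100000):
    """Row-wise argmin of a 2D masked array, one block of rows at a time.

    Each argmin call copies only the block it is given, so the temporary
    copy is bounded by chunk_size rows instead of the whole array.
    """
    n_rows = masked_arr.shape[0]
    result = np.empty(n_rows, dtype=np.intp)
    for start in range(0, n_rows, chunk_size):
        stop = min(start + chunk_size, n_rows)
        # Slicing a masked array gives a view, not a copy.
        result[start:stop] = masked_arr[start:stop].argmin(axis=1)
    return result
```

This gives the same indices as calling `argmin(axis=1)` on the whole array; it only lowers the peak size of the temporary copy and does not shrink the tiled array itself.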

zhafen commented 7 years ago

Original comment by Zachary Hafen (Bitbucket: zhafen, GitHub: zhafen)


Issue #5 might be related to this one. I implemented Issue #5's solution in this code too, but I still need to check whether it solved the problem.

It's possible the root of the problem is that, as part of saving the args, I do self.self = self. I suspect it simply because it seems like it could be bad code.
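
One way `self.self = self` could matter: it makes the object refer to itself, which creates a reference cycle. In CPython, an object in a cycle is not freed when the last outside reference goes away. It waits for the cyclic garbage collector, so any large arrays it holds can stay in memory longer than expected. Below is a small sketch using hypothetical classes (`StoresSelf`, `StoresArgsOnly`) and a made-up `galaxy_cut` value; it is not the project's actual code, and it does not show that this is the cause of the memory usage here.

```python
import gc
import weakref

class StoresSelf(object):
    def __init__(self, **kwargs):
        self.kwargs = kwargs
        self.self = self  # self-reference: creates a reference cycle

class StoresArgsOnly(object):
    def __init__(self, **kwargs):
        self.kwargs = kwargs

def freed_without_gc(cls):
    """Return True if an instance is freed as soon as its last name is deleted."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the cyclic collector from running during the check
    try:
        obj = cls(galaxy_cut=0.1)  # hypothetical argument value
        ref = weakref.ref(obj)
        del obj
        return ref() is None
    finally:
        if was_enabled:
            gc.enable()
        gc.collect()  # clean up any cycle left behind
```

In CPython, `freed_without_gc(StoresSelf)` returns `False` and `freed_without_gc(StoresArgsOnly)` returns `True`, because only the second class relies on plain reference counting.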

zhafen commented 7 years ago

Original comment by Zachary Hafen (Bitbucket: zhafen, GitHub: zhafen)


Issue #3 was marked as a duplicate of this issue.

zhafen commented 7 years ago

Original comment by Zachary Hafen (Bitbucket: zhafen, GitHub: zhafen)


I reduced memory use and CPU time by having the galaxy finding search for fewer types of IDs.

In addition, we only find the distance to halos that have a certain minimum amount of stars. However, this doesn't seem to change much.
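
The halo cut described above can be sketched as follows. This is an illustration under assumptions, not the project's implementation: "amount of stars" is taken to mean a star-particle count, and `nearest_valid_halo`, `halo_n_star`, and `min_n_star` are hypothetical names. The cut removes halo columns from the particle-by-halo distance array, so how much it saves depends on how many halos fail the cut.

```python
import numpy as np

def nearest_valid_halo(particle_pos, halo_pos, halo_n_star, min_n_star):
    """Index (into the full halo list) of the closest halo passing the cut.

    Returns -1 for every particle if no halo passes the cut.
    """
    valid = np.flatnonzero(halo_n_star >= min_n_star)
    if valid.size == 0:
        return np.full(particle_pos.shape[0], -1, dtype=np.intp)
    # Distance array has shape (n_particles, n_valid_halos).
    diff = particle_pos[:, np.newaxis, :] - halo_pos[valid][np.newaxis, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # Map positions in the filtered list back to indices in the full list.
    return valid[dist.argmin(axis=1)]
```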

Putting this on hold for now because the pipeline's still functional without it.

zhafen commented 6 years ago

This is no longer a priority, especially because we should probably be using a KD-tree when doing galaxy_finding.
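
For reference, here is a minimal sketch of what a KD-tree lookup could look like, assuming SciPy's `scipy.spatial.cKDTree` (the thread does not name a library). A tree query returns the nearest halo for each particle without building the full particle-by-halo distance array. `nearest_halo_kdtree` is a hypothetical helper, and it ignores the per-halo radial cuts that the real galaxy finding applies.

```python
import numpy as np
from scipy.spatial import cKDTree

def nearest_halo_kdtree(particle_pos, halo_pos):
    """Distance to, and index of, the nearest halo for each particle."""
    tree = cKDTree(halo_pos)  # build the tree once from halo positions
    dist, idx = tree.query(particle_pos, k=1)  # one nearest neighbor each
    return dist, idx
```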