Closed — wdwvt1 closed this issue 8 years ago
I think the issue has something to do with mapping different sink samples to different processors via ipyparallel. Below is a graph of a simple experiment.

The obscured black line is runtime vs. memory consumption for a single sink, no cluster, 10 iterations. The blue line is runtime vs. memory consumption for 2 sinks with a cluster of 2 nodes, 10 iterations. The red line is runtime vs. memory consumption for 100 sinks with a cluster of 2 nodes, 10 iterations.

![figure_1](https://cloud.githubusercontent.com/assets/1048569/17718736/668b6394-63ca-11e6-97bd-afc4208c92e4.png)

Obviously not conclusive, but you can see the memory jump in steps for the red line. I see something very similar on longer 100+ sink runs, but significantly more pronounced in terms of memory usage.
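For anyone wanting to reproduce this kind of measurement, here is a minimal sketch of tracking Python-level memory per iteration with the standard library's `tracemalloc`. Note that `run_iteration` is a placeholder stand-in for one unit of work, not a SourceTracker2 function:

```python
# Sketch: record traced Python memory after each iteration with tracemalloc.
# `run_iteration` is a hypothetical stand-in for one Gibbs iteration.
import tracemalloc

def run_iteration(i):
    # stand-in workload: allocate and discard a list
    return [0] * 10_000

tracemalloc.start()
samples = []
for i in range(10):
    run_iteration(i)
    current, peak = tracemalloc.get_traced_memory()
    samples.append(current)  # bytes currently allocated by Python
tracemalloc.stop()

# A leak would show up as `current` climbing monotonically across iterations.
print(samples)
```

Plotting `samples` against iteration number gives the same kind of runtime-vs-memory curve as the figure above, without needing an external profiler.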
Very insightful. You are a supercluster of biome code.
Best, Ajay
My hunches were all wrong. After using valgrind, memory_profiler, and the Python standard library's tracemalloc, I figured it out by guess and check: removing the `Sampler.seq_assignments_to_contingency_table` call solved the problem. I'll update the function in a future commit.
There is a memory leak that has become apparent in long-running simulations. The memory usage of Python when running `_gibbs` steadily increases despite there being no clear reason for doing so (the memory is preallocated for results storage in `gibbs_sampler`). I believe the error has to do with one of the following:

1. numpy array copies that occur and are not garbage collected (link1, link2).
2. pandas dataframes not being correctly garbage collected (might be the same bug as 1) (link1, link2, link3).
3. Something about how `ipyparallel` is running (link1).

Based on the threads I have read (those linked above), I am guessing that either a bunch of array copies are occurring that are not getting collected, or there is some interaction between the cluster and multiple sinks.
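To illustrate hypothesis 1: numpy slicing returns views that share the parent's buffer, while fancy indexing (and many pandas operations) silently allocates a fresh copy. If a reference to each copy survives, e.g. appended to a results list, the allocations accumulate and the garbage collector cannot reclaim them. A small sketch of the distinction:

```python
# Sketch: views vs. implicit copies in numpy, one hypothesized leak source.
import numpy as np

a = np.zeros((1000, 1000))

view = a[:100]        # basic slicing returns a view: no new buffer
copy = a[[0, 1, 2]]   # fancy indexing returns a fresh copy: new allocation

print(view.base is a)  # True: shares a's memory
print(copy.base is a)  # False: independent allocation
```

If something like `copy` were created once per sink per iteration and retained, memory would climb in exactly the stepwise pattern seen in the red line above.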