maxhollmann / slab-profiler

Records and analyzes growth of specific slabs and correlates it with running processes

Questions on `analyze.py` #1

Open tilusnet opened 4 years ago

tilusnet commented 4 years ago

Hello! Thanks for your post at https://unix.stackexchange.com/questions/415814/memory-runs-full-over-time-high-buffer-cache-usage-low-available-memory/456688#456688?newreg=86be4a98bf95414cae594bd460a38068. It led me to this repo of yours.

I have been suspicious of the same behaviour on my machine, just as you were, and, surprise, I was also using xflux! I have now uninstalled it.

However, I am still seeing a slow and steady increase in SUnreclaim in /proc/meminfo, so I decided to use your scripts to investigate further. In order to do so efficiently, I have a few questions about your code in `analyze.py`:

Thanks in advance for your work. I hope to get answers to these questions.

maxhollmann commented 4 years ago

The idea is to start `record.py` while the process you suspect isn't running and let it record for a while. Then start the process and keep recording for a similar amount of time. `analyze.py` tries to correlate the growth of the slab with whether the process is active. It needs enough samples both with the process active and with it inactive, so it skips any process for which that is not the case, using the check `if not 0.2 < np.mean(running) < 0.8`.
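To make that check concrete, here is a minimal sketch of the filter on its own. It assumes `running` is a 0/1 array with one entry per recorded sample, so that `np.mean(running)` is the fraction of samples in which the process was active. The helper name `has_enough_contrast` and the sample data are made up for illustration and are not taken from `analyze.py`.

```python
import numpy as np


def has_enough_contrast(running, low=0.2, high=0.8):
    """Return True if the process was active in more than `low` and
    less than `high` of the samples (hypothetical helper, not from analyze.py)."""
    running = np.asarray(running, dtype=float)
    if running.size == 0:
        # No samples at all: nothing to correlate against.
        return False
    # The mean of a 0/1 array is the fraction of samples where the process ran.
    return bool(low < np.mean(running) < high)


# Made-up recordings: one 0/1 flag per sample for each process.
samples = {
    "always_on": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "half_on": [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
    "rarely_on": [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
}

# Keep only processes with enough active and enough inactive samples.
kept = [name for name, flags in samples.items() if has_enough_contrast(flags)]
print(kept)
```

A process that was running for the whole recording ends up with a mean of 1.0 and is skipped, which is why the recording should begin before the suspected process is started.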

As for the `coef < np.max(diff) / 3` check, I'm not 100% sure, as it's been a while since I wrote this, but it looks to me like it removes processes that have only a very small correlation with slab growth.
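One possible reading of that threshold, as a rough sketch only: assume `diff` is the per-sample growth of the slab and `coef` is how much faster the slab grows while the process is active than while it is not. How `analyze.py` actually computes `coef` is not stated here, so `growth_coefficient` and `is_candidate` below are hypothetical stand-ins, and the data is made up.

```python
import numpy as np


def growth_coefficient(slab_size, running):
    """Hypothetical stand-in for `coef`: mean slab growth while the process
    is active minus mean growth while it is inactive."""
    diff = np.diff(np.asarray(slab_size, dtype=float))
    # Each diff describes the step that ends at sample i + 1, so align the flags with it.
    active = np.asarray(running[1:], dtype=bool)
    # Assumes both groups are non-empty, which the 0.2-0.8 check is meant to ensure.
    coef = diff[active].mean() - diff[~active].mean()
    return coef, diff


def is_candidate(slab_size, running):
    """Keep a process only if its coefficient is at least a third of the
    largest single-step growth (mirrors `coef < np.max(diff) / 3`)."""
    coef, diff = growth_coefficient(slab_size, running)
    return bool(coef >= np.max(diff) / 3)


running = [0, 0, 0, 0, 1, 1, 1]

# Slab only grows once the process starts: strong association.
leaky = [100, 100, 100, 100, 110, 120, 130]
# Slab grows at the same rate regardless of the process: no association.
unrelated = [100, 105, 110, 115, 120, 125, 130]

print(is_candidate(leaky, running), is_candidate(unrelated, running))
```

Under these assumptions, scaling the threshold by `np.max(diff)` makes the cutoff relative to how fast the slab grows at all, so a process is kept only when its effect is large compared with the biggest observed jump.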

Hope that helps!

tilusnet commented 4 years ago

Many thanks, Max. I only started your `record.py` long after my suspicious process had been running. Next I'll try it as you suggest.