djberg96 / sys-proctable

A cross-platform Ruby interface for gathering process information on your operating system
Apache License 2.0
150 stars 33 forks source link

[linux] Reduce object allocations for smaps parse #KindaHacky #68

Closed NickLaMuro closed 6 years ago

NickLaMuro commented 6 years ago

:warning: This change definitely is risky, and warrants caution before merging. :warning:

This was the best change that I could come up with for reducing the number of memory allocations on linux, without doing something like making smaps optional or lazy loaded. More details below.

Description

This change takes advantage of a few things in ruby and how the smaps file is structured to reduce the object allocations for a single process* on linux. These are:

Gains of course are also applicable for .ps(nil) as well

The current implementation of this will always have to generate intermediate strings for the regexp matches, which seems to be 2 per match (haven't counted though).

That said, while this method has less allocations, it doesn't seem to perform noticeably faster, and it technically taking advantage of a ruby quirk, which adds risk. That alone might be grounds for not wanting to merge this change, which I can understand.

Benchmarks

Object Allocations

Benchmarks taken by doing the following:

$ irb -r memory_profiler
irb> pid = Process.pid
irb> MemoryProfiler.report { Sys::ProcTable.ps(pid) }.pretty_print

Results:

Total Objects allocated
pre-patch 3920
post-patch 2473

Note: By removing calling Smaps.new all together dropped the allocations down to 1122.

IPS

Munged results from all four invocations of the bench/bench_ips_ps script. There is no change, but put the results here for full disclosure:

Warming up --------------------------------------
Block form - pre patch
                         2.000  i/100ms
Non-block form - pre patch
                         2.000  i/100ms
Block form - post patch
                         2.000  i/100ms
Non-block form - post patch
                         2.000  i/100ms
Calculating -------------------------------------
Block form - pre patch
                         21.382  (± 9.4%) i/s -    106.000  in   5.012562s
Non-block form - pre patch
                         21.518  (± 9.3%) i/s -    108.000  in   5.072667s
Block form - post patch
                         21.302  (± 9.4%) i/s -    106.000  in   5.014568s
Non-block form - post patch
                         22.130  (±13.6%) i/s -    102.000  in   5.055741s

Comparison:
Non-block form - post patch:       22.1 i/s
Non-block form - pre patch:       21.5 i/s - same-ish: difference falls within error
Block form - pre patch:       21.4 i/s - same-ish: difference falls within error
Block form - post patch:       21.3 i/s - same-ish: difference falls within error

Regarding the lack of a speed increase, I am willing to bet that the time to iterate through of the line items in the smaps file is more of a bottleneck than the difference between line.starts_with? and /some regexp/ === line.

djberg96 commented 6 years ago

@NickLaMuro Let's modify the method to either ps(pid, options = {}) or just ps(options = {}). Then we can better control everything. I'll make it a major release.

With that change then users can just do ps(:smaps => false)

Of course, that change and your suggestion are not mutually exclusive.

djberg96 commented 6 years ago

@NickLaMuro With the ability to just disable smaps completely, do you still want to pursue this?

NickLaMuro commented 6 years ago

@djberg96 Will have to review everything that you did in the 1.2.0 changes, but this might be the case that we no longer need/want this, yes.

djberg96 commented 6 years ago

@NickLaMuro Since users can now explicitly disable smaps and other data collection, I think I'll go ahead and close this.