ManageIQ / manageiq

ManageIQ Open-Source Management Platform
https://manageiq.org
Apache License 2.0

Research jemalloc for appliances #12819

Closed jrafanie closed 4 years ago

jrafanie commented 8 years ago

tl;dr Seems to save around 10% PSS

[root@localhost vmdb]# cat /etc/centos-release
CentOS Linux release 7.2.1511 (Core)

[root@localhost vmdb]# yum install jemalloc
[root@localhost vmdb]# yum install jemalloc-devel

[root@localhost vmdb]# rpm -qa |grep jemalloc
jemalloc-3.6.0-1.el7.x86_64
jemalloc-devel-3.6.0-1.el7.x86_64

[root@localhost vmdb]# ruby-install ruby-2.3.2 -- --with-jemalloc
...
checking time.h usability... yes
checking time.h presence... yes
checking for time.h... yes
checking ucontext.h usability... yes
checking ucontext.h presence... yes
checking for ucontext.h... yes
checking utime.h usability... yes
checking utime.h presence... yes
checking for utime.h... yes
checking gmp.h usability... no
checking gmp.h presence... no
checking for gmp.h... no
checking for malloc_conf in -ljemalloc... yes
checking jemalloc/jemalloc.h usability... yes
checking jemalloc/jemalloc.h presence... yes
checking for jemalloc/jemalloc.h... yes
...

In /etc/default/evm, change:

export PATH=$PATH:/opt/rubies/ruby-2.3.1/bin

to:

export PATH=$PATH:/opt/rubies/ruby-2.3.2/bin
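
A quick sanity check that the rebuilt Ruby actually linked jemalloc (a minimal sketch; the path follows the ruby-install layout above):

/opt/rubies/ruby-2.3.2/bin/ruby -r rbconfig -e 'puts RbConfig::CONFIG["LIBS"]'   # should include -ljemalloc
ldd /opt/rubies/ruby-2.3.2/bin/ruby | grep jemalloc                              # should list libjemalloc.so if it was linked dynamically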

Very rough memory comparison for some of the workers.

Before, using malloc

[root@localhost vmdb]# smem -krs pss -P "MIQ|puma"
  PID User     Command                         Swap      USS      PSS      RSS
 2927 root     puma 3.3.0 (tcp://127.0.0.1        0   334.0M   339.0M   372.4M
 2942 root     puma 3.3.0 (tcp://127.0.0.1        0   219.4M   230.6M   298.4M
 2903 root     MIQ: MiqPriorityWorker id:         0   195.4M   208.0M   280.9M
 2710 root     MIQ Server                         0   184.7M   201.3M   286.3M
 2894 root     MIQ: MiqGenericWorker id: 1        0   179.8M   194.3M   271.2M
 2912 root     MIQ: MiqScheduleWorker id:         0   175.1M   190.3M   268.5M
 3289 root     MIQ: MiqPriorityWorker id:         0   148.5M   182.2M   281.6M
 3280 root     MIQ: MiqGenericWorker id: 2        0   148.7M   182.1M   281.2M

After, using jemalloc

[root@localhost vmdb]# smem -krs pss -P "MIQ|puma"
  PID User     Command                         Swap      USS      PSS      RSS
 5862 root     puma 3.3.0 (tcp://127.0.0.1        0   260.0M   293.1M   399.7M
 5808 root     MIQ: MiqGenericWorker id: 2        0   210.8M   221.8M   268.8M
 5877 root     puma 3.3.0 (tcp://127.0.0.1        0   164.0M   200.9M   318.3M
 5825 root     MIQ: MiqPriorityWorker id:         0   183.9M   198.9M   281.8M
 5834 root     MIQ: MiqPriorityWorker id:         0   180.5M   195.7M   280.2M
 5658 root     MIQ Server                         0   154.0M   182.8M   296.3M
 5817 root     MIQ: MiqGenericWorker id: 2        0   155.4M   179.6M   265.7M
 5844 root     MIQ: MiqScheduleWorker id:         0   145.9M   172.3M   275.8M
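
For a rougher but quicker before/after comparison, the per-process PSS values can be summed; a minimal sketch, assuming smem's default KiB output (drop -k so the numbers are plain kilobytes):

smem -rs pss -P "MIQ|puma" | awk 'NR > 1 { sum += $(NF-1) } END { printf "total PSS: %.1f MiB\n", sum / 1024 }'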

Per https://bugs.ruby-lang.org/issues/9113, Ruby 2.2.0 was the first release to ship the --with-jemalloc configure option: https://www.ruby-lang.org/en/news/2014/12/25/ruby-2-2-0-released/
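
For reference, the same option when building from a source tarball rather than via ruby-install (a minimal sketch; it assumes jemalloc-devel is already installed):

./configure --with-jemalloc --disable-install-doc
make -j"$(nproc)"
make install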

miq-bot commented 7 years ago

This issue has been automatically marked as stale because it has not been updated for at least 6 months.

If you can still reproduce this issue on the current release or on master, please reply with all of the information you have about it in order to keep the issue open.

Thank you for all your contributions!

NickLaMuro commented 5 years ago

@jrafanie and while I am bothering you, do you still see this as a worthwhile endeavor to pursue?

jrafanie commented 5 years ago

Good question @NickLaMuro. jemalloc is supposed to be so much better, and maybe I was benchmarking it incorrectly, but it didn't seem to help us.

https://medium.com/rubyinside/how-we-halved-our-memory-consumption-in-rails-with-jemalloc-86afa4e54aa3

https://brandonhilkert.com/blog/reducing-sidekiq-memory-usage-with-jemalloc/

https://www.mikeperham.com/2018/04/25/taming-rails-memory-bloat/

JPrause commented 5 years ago

@jrafanie does your response mean you want this issue to remain open? If yes, can you remove the stale label?

If there's no update by next week, I'll be closing this issue.

jrafanie commented 5 years ago

Yes, I think this is worth researching, since my testing wasn't very scientific and there could be some memory savings from using jemalloc on appliances. Removed the stale label.

JPrause commented 5 years ago

Thanks @jrafanie and also for removing the stale label.

djberg96 commented 5 years ago

@jrafanie I thought you saw a 10% reduction. That alone seems worthwhile.

Is there any chance you're already using a version of Ruby with jemalloc enabled?
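
One way to check without rebuilding anything is to look at the memory maps of a running worker; a sketch, using pgrep against one of the worker names from the smem output above:

pid=$(pgrep -f MiqGenericWorker | head -1)
grep -c jemalloc /proc/$pid/maps   # a non-zero count means libjemalloc is already mapped into that worker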

jrafanie commented 5 years ago

@djberg96 I wasn't seeing the 50%, or even 20%, improvement that other places on the web were reporting. I'm not saying it's not an improvement, but maybe my testing methodology was flawed.

Are you seeing improvements when you're using jemalloc?

djberg96 commented 5 years ago

I was seeing around a 10% improvement for most things, but the UI seemed to get 30-40%. I think @himdel was seeing around the same thing.

Anyway, I think it would be worth making it the default if possible, though I don't know exactly how we would deploy jemalloc to production.
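
One low-touch way to try it on an appliance without rebuilding Ruby would be preloading the library for the evm processes; a sketch only, assuming the distro jemalloc package installs the library under /usr/lib64:

# sketch: add to /etc/default/evm (the file mentioned earlier in this issue);
# the libjemalloc.so.2 path is an assumption about the packaged library location
export LD_PRELOAD=/usr/lib64/libjemalloc.so.2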

himdel commented 5 years ago

So, it really depends on what we're measuring.

Running rails s and going by what htop reports, after having logged in and loaded the dashboards:

$ LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 bin/rails s
virt 1224M res 699M shr 19220

$ bin/rails s
virt 1688M res 514M shr 18416

Measuring using smem -krs pss -P puma\ \\d:

with libjemalloc2 5.1.0-2:

  after booting       9153 himdel   puma 3.7.1 (tcp://localhost        0   275.2M   276.0M   284.0M
  on login screen     9153 himdel   puma 3.7.1 (tcp://localhost        0   690.1M   690.9M   699.3M
  dashboard loaded    9153 himdel   puma 3.7.1 (tcp://localhost        0   797.3M   798.4M   808.6M

with regular malloc:

  after booting      10085 himdel   puma 3.7.1 (tcp://localhost        0   165.2M   166.0M   173.3M
  on login screen    10085 himdel   puma 3.7.1 (tcp://localhost        0   418.9M   419.6M   427.0M
  dashboard loaded   10085 himdel   puma 3.7.1 (tcp://localhost        0   527.4M   528.4M   538.5M

All of that is ruby 2.4.3p205.

jrafanie commented 5 years ago

@himdel wow, jemalloc looks really bad

himdel commented 5 years ago

I wonder if some of that difference could be LD_PRELOAD vs building ruby with jemalloc, but, apart from virtual, the numbers do seem consistently higher, yeah.

(Or, there's also libjemalloc1 3.6.0-11 in debian, I haven't tried that one.)

jrafanie commented 5 years ago

Yeah, LD_PRELOAD could be at play here. I haven't reviewed the versions of jemalloc to know if one version is better than others.

djberg96 commented 5 years ago

If you're using rbenv, you can do RUBY_CONFIGURE_OPTS="--with-jemalloc" rbenv install 2.4.3 (or whichever version you use) to see if there's any benefit vs using LD_PRELOAD.

You can double check it with ruby -r rbconfig -e "puts RbConfig::CONFIG['LIBS']" and you should see -ljemalloc.

djberg96 commented 5 years ago

My own experiments showed roughly the same thing as @himdel whether Ruby was built with jemalloc or I used LD_PRELOAD.

himdel commented 5 years ago

@djberg96 beat me to it :)

Tried 2.4.4 built with jemalloc, and..

  PID User     Command                         Swap      USS      PSS      RSS
18680 himdel   puma 3.7.1 (tcp://localhost        0   170.4M   171.1M   177.2M   (after booting)
18680 himdel   puma 3.7.1 (tcp://localhost        0   370.9M   371.5M   377.7M   (on login screen)
18680 himdel   puma 3.7.1 (tcp://localhost        0   490.0M   490.8M   498.6M   (dashboard loaded)

So.. I have no idea now :D

himdel commented 5 years ago

Though, at least subjectively, it seems I'm getting a bit more of those

^C
[----] I, [2019-02-01T15:24:24.929746 #18680:3ffa7e2d086c]  INFO -- : Completed   in 320417ms (ActiveRecord: 45.8ms)
[----] I, [2019-02-01T15:24:24.931011 #18680:3ffa7e2d05c4]  INFO -- : Completed   in 321260ms (ActiveRecord: 204.4ms)

hangs with the jemalloc version.

(But, hard to tell, it's not like they're not happening every day anyway :), just something we should test if we ever decide to build those)

miq-bot commented 5 years ago

This issue has been automatically marked as stale because it has not been updated for at least 6 months.

If you can still reproduce this issue on the current release or on master, please reply with all of the information you have about it in order to keep the issue open.

Thank you for all your contributions!

JPrause commented 5 years ago

@miq-bot remove_label stale

JPrause commented 5 years ago

@miq-bot add_label pinned

jrafanie commented 5 years ago

Updating the issue with some attempts on RHEL 8 / CentOS 8:

jemalloc 5.2.1 is available with the OS update, so I'm going to try it again.

Had to hack in a PowerTools repo installation (only for super insecure dev testing) on RHEL 8:

[root@localhost ~]# cat /etc/yum.repos.d/CentOS-PowerTools.repo
[PowerTools]
name=CentOS $releasever - PowerTools
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=PowerTools&infra=$infra
#baseurl=http://mirror.centos.org/$contentdir/$releasever/PowerTools/$basearch/os/
gpgcheck=0
enabled=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-centosofficial

On CentOS 8, either of these should work:

dnf config-manager --enable PowerTools
yum config-manager --set-enabled PowerTools

You can then install libyaml-devel

 yum install libyaml-devel

You can then build/install ruby using ruby-install:

ruby-install --system ruby 2.5.7 -- --disable-install-doc --enable-shared --with-jemalloc
[root@localhost ~]# ruby -r rbconfig -e "puts RbConfig::CONFIG['LIBS']"
-lpthread -ljemalloc -ldl -lcrypt -lm
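
To confirm jemalloc is actually servicing allocations at runtime (not just linked), jemalloc's stats_print option can be used; a minimal sketch:

MALLOC_CONF=stats_print:true ruby -e exit 2>&1 | head -3
# seeing a jemalloc statistics dump printed at exit means jemalloc is handling malloc for this interpreter
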
jrafanie commented 6 months ago

I tried using jemalloc again on a Mac, but I'll need to test it over longer periods on appliances and containers to really see the performance improvements:

ruby-install 3.1 -- --with-openssl-dir=$(brew --prefix openssl@3) --with-jemalloc CPPFLAGS="-I$HOMEBREW_PREFIX/opt/jemalloc/include" LDFLAGS="-L$HOMEBREW_PREFIX/opt/jemalloc/lib"
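
For that longer-running comparison, something as simple as periodic smem snapshots could work; a rough sketch (the interval and log file name are arbitrary):

while true; do
  { date; smem -krs pss -P "MIQ|puma"; echo; } >> smem_pss.log   # append a timestamped snapshot
  sleep 300                                                      # every 5 minutes
done
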
djberg96 commented 6 months ago

At this point I'd be looking at 3.2 or 3.3 with yjit + jemalloc.
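
Roughly, that would look like the following with ruby-install (a sketch; --enable-yjit assumes a Rust toolchain is available at build time):

ruby-install ruby 3.3 -- --enable-yjit --with-jemalloc --disable-install-doc
ruby --yjit -r rbconfig -e 'puts RubyVM::YJIT.enabled?; puts RbConfig::CONFIG["LIBS"]'
# expect "true" and a LIBS string containing -ljemalloc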

agrare commented 6 months ago

Hey @djberg96 !! :wave: