sameersbn / docker-gitlab

Dockerized GitLab
http://www.damagehead.com/docker-gitlab/
MIT License
7.87k stars 2.14k forks

Inconsistent performance due to frequent unicorn worker restarts #1347

Open chaosversum opened 7 years ago

chaosversum commented 7 years ago

I experienced some strange slowdowns while navigating the GitLab web UI. While most requests are served in under 1 second, 20-30% of them take 4-5 seconds.

I dug into it a bit and found that the unicorn workers restart almost every 1-3 requests, with just me navigating the web UI and 2 runners polling for jobs. I could even reproduce the problem on a fresh container with 9.5.3 by watching unicorn.stderr.log while navigating the web UI.

Some research turned up more evidence that the frequent worker restarts might be the problem: https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/2421
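For context, unicorn-worker-killer restarts a worker once its RSS crosses a per-worker threshold drawn at random between the configured min and max (the randomization spreads restarts out so all workers don't die at once). A minimal sketch of that logic, assuming the gem's documented behavior rather than quoting its source:

```ruby
# Sketch of the unicorn-worker-killer threshold logic (an assumption
# based on the gem's documented behavior, not a verbatim excerpt).
min = 300 * 1024 * 1024            # default lower bound, 300 MiB
max = 350 * 1024 * 1024            # default upper bound, 350 MiB

# Each worker draws its own kill threshold between min and max;
# once the worker's RSS exceeds it, the worker is sent SIGQUIT.
limit = min + rand(max - min)
puts "worker restarts once RSS exceeds #{limit} bytes"
```

With defaults this low, a Rails app as large as GitLab can cross the threshold after only a handful of requests, which matches the restart-every-1-3-requests behavior above.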

Unicorn workers while idle:

PID ELAPSED RSS COMMAND
583 02:36:41 327872 unicorn_rails master -c /home/git/gitlab/config/unicorn.rb -E production
671 02:36:19 385752 sidekiq 5.0.4 gitlab [0 of 25 busy]
6534 23:15 324424 unicorn_rails worker[1] -c /home/git/gitlab/config/unicorn.rb -E production
6548 23:03 324300 unicorn_rails worker[2] -c /home/git/gitlab/config/unicorn.rb -E production
6597 22:45 321052 unicorn_rails worker[0] -c /home/git/gitlab/config/unicorn.rb -E production

Unicorn workers after 3 web UI requests (all restarted):

PID ELAPSED RSS COMMAND
583 02:38:46 328028 unicorn_rails master -c /home/git/gitlab/config/unicorn.rb -E production
671 02:38:24 385752 sidekiq 5.0.4 gitlab [0 of 25 busy]
7474 00:13 362032 unicorn_rails worker[1] -c /home/git/gitlab/config/unicorn.rb -E production
7488 00:08 352336 unicorn_rails worker[2] -c /home/git/gitlab/config/unicorn.rb -E production
7502 00:01 313600 unicorn_rails worker[0] -c /home/git/gitlab/config/unicorn.rb -E production

After some more research, I decided to increase the worker memory limit by building my own image from source and adding the following lines to unicorn.rb:

ENV['GITLAB_UNICORN_MEMORY_MIN'] = "629145600"
ENV['GITLAB_UNICORN_MEMORY_MAX'] = "734003200"

That resulted in a much smoother experience navigating the web UI on an almost-fresh 9.5.3 container. The workers no longer restart all the time and the response times are much more stable.

Unicorn workers after launch:

PID ELAPSED RSS COMMAND
601 06:24 320224 unicorn_rails master -c /home/git/gitlab/config/unicorn.rb -E production
691 06:02 402704 sidekiq 5.0.4 gitlab [0 of 25 busy]
695 05:53 404116 unicorn_rails worker[0] -c /home/git/gitlab/config/unicorn.rb -E production
698 05:53 404528 unicorn_rails worker[1] -c /home/git/gitlab/config/unicorn.rb -E production
701 05:53 403504 unicorn_rails worker[2] -c /home/git/gitlab/config/unicorn.rb -E production

Unicorn workers after 50+ web ui requests:

PID ELAPSED RSS COMMAND
601 49:46 323320 unicorn_rails master -c /home/git/gitlab/config/unicorn.rb -E production
691 49:24 436396 sidekiq 5.0.4 gitlab [0 of 25 busy]
695 49:15 418644 unicorn_rails worker[0] -c /home/git/gitlab/config/unicorn.rb -E production
698 49:15 422732 unicorn_rails worker[1] -c /home/git/gitlab/config/unicorn.rb -E production
701 49:15 428004 unicorn_rails worker[2] -c /home/git/gitlab/config/unicorn.rb -E production

Later, I realized that I could change the worker memory limit by setting the GITLAB_UNICORN_MEMORY_MAX environment variable instead of rebuilding the image.

Finally, after many hours of research, I fixed the problem in our production environment by raising the worker memory limit to 500 MB with the following environment variable in my docker-compose file:

GITLAB_UNICORN_MEMORY_MAX=524288000
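In a docker-compose file that could look like the following; the service name and image tag here are illustrative assumptions, only the environment variable comes from this thread:

```yaml
# Hypothetical docker-compose excerpt; the "gitlab" service name and
# image tag are assumptions, the variable is the fix described above.
services:
  gitlab:
    image: sameersbn/gitlab:9.5.3
    environment:
      - GITLAB_UNICORN_MEMORY_MAX=524288000  # 500 MiB per-worker limit
```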

It would be nice to have some additional documentation about these variables, or better yet, saner defaults, since the problem occurs on a fresh container as well.

phenomax commented 7 years ago

I agree with you. After applying your recommended changes, my GitLab seems to run much smoother than before.

As you can see here, the default values for min/max memory are 300/350 MB. All in all, we should definitely consider creating a docker environment variable.

jeinwag commented 6 years ago

Thanks a bunch for figuring this out! GitLab performance had been giving us severe headaches since the update to 9.5.x; after increasing the unicorn worker memory limit, performance is fine again.

lekoder commented 6 years ago

@chaosversum Great job! Shouldn't this be a default with 9.5.x?

kiview commented 6 years ago

It seems this setting solved our performance problems on 9.5.4 as well, thanks for the suggestion!

chihkaiyu commented 6 years ago

Hi all, after a few hours of experimenting, I've found that it doesn't work at all. I set the environment variable GITLAB_UNICORN_MEMORY_MAX=524288000, but I still got error messages like:

W, [2017-11-22T11:46:54.396973 #833]  WARN -- : #<Unicorn::HttpServer:0x00000000012dde70>: worker (pid: 833) exceeds memory limit (458501632.0 bytes > 334369369 bytes)
W, [2017-11-22T11:46:54.397062 #833]  WARN -- : Unicorn::WorkerKiller send SIGQUIT (pid: 833) alive: 65 sec (trial 1)
I, [2017-11-22T11:46:54.641357 #703]  INFO -- : reaped #<Process::Status: pid 833 exit 0> worker=1
I, [2017-11-22T11:46:54.695003 #990]  INFO -- : worker=1 ready
W, [2017-11-22T11:47:11.245254 #839]  WARN -- : #<Unicorn::HttpServer:0x00000000012dde70>: worker (pid: 839) exceeds memory limit (464723456.0 bytes > 330745035 bytes)
W, [2017-11-22T11:47:11.245334 #839]  WARN -- : Unicorn::WorkerKiller send SIGQUIT (pid: 839) alive: 93 sec (trial 1)
I, [2017-11-22T11:47:11.501420 #703]  INFO -- : reaped #<Process::Status: pid 839 exit 0> worker=3
I, [2017-11-22T11:47:11.553377 #1010]  INFO -- : worker=3 ready
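Note that the thresholds in that log (~334 MB and ~331 MB) fall inside the default 300-350 MiB window, which suggests the override never reached the worker killer at all. A quick sanity check:

```ruby
# The defaults from GitLab's config.ru: `*` binds tighter than `<<`
# in Ruby, so `300 * 1 << 20` is 300 shifted left by 20 bits, i.e. 300 MiB.
min = 300 * 1 << 20   # 314572800 bytes
max = 350 * 1 << 20   # 367001600 bytes

# Thresholds reported in the log above fall inside the default window.
[334369369, 330745035].each do |threshold|
  puts "#{threshold} in default window: #{(min..max).cover?(threshold)}"
end
# prints true for both
```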

So I did a little experiment and found that I can't read environment variables in config.ru. That is, the following two lines in config.ru can't access ENV['GITLAB_UNICORN_MEMORY_MIN']:

    min = (ENV['GITLAB_UNICORN_MEMORY_MIN'] || 300 * 1 << 20).to_i
    max = (ENV['GITLAB_UNICORN_MEMORY_MAX'] || 350 * 1 << 20).to_i

I don't think this is fixed even if you set the environment variable.

My workaround was to edit config.ru directly and restart unicorn via supervisord.
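A minimal sketch of that workaround, assuming the stock unicorn-worker-killer middleware is what config.ru wires up; the exact limits here are illustrative, not the values used by chihkaiyu:

```ruby
# Hypothetical patched excerpt of config.ru: hard-code the limits
# instead of reading the (apparently invisible) environment variables.
require 'unicorn/worker_killer'

min = 500 * 1 << 20  # 500 MiB, instead of ENV['GITLAB_UNICORN_MEMORY_MIN']
max = 600 * 1 << 20  # 600 MiB, instead of ENV['GITLAB_UNICORN_MEMORY_MAX']

use Unicorn::WorkerKiller::Oom, min, max
```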

baracoder commented 6 years ago

Strange, it seemed the issue was gone after setting the docker option -e GITLAB_UNICORN_MEMORY_MAX=524288000. Now on 10.1.1 I see the log messages again and the load on the machine is high.

root@git:/home/git/gitlab# env|grep GITLAB_UNICORN_MEMORY_MAX
GITLAB_UNICORN_MEMORY_MAX=524288000

root@git:/home/git/gitlab# tail -f /var/log/gitlab/gitlab/unicorn.stderr.log
I, [2017-11-22T11:11:25.584536 #673]  INFO -- : reaped #<Process::Status: pid 6702 exit 0> worker=1
I, [2017-11-22T11:11:27.773492 #6740]  INFO -- : worker=1 ready
W, [2017-11-22T11:11:50.831465 #6740]  WARN -- : #<Unicorn::HttpServer:0x000000000140a320>: worker (pid: 6740) exceeds memory limit (508262912.0 bytes > 422973424 bytes)
W, [2017-11-22T11:11:50.831803 #6740]  WARN -- : Unicorn::WorkerKiller send SIGQUIT (pid: 6740) alive: 15 sec (trial 1)
I, [2017-11-22T11:11:51.700052 #673]  INFO -- : reaped #<Process::Status: pid 6740 exit 0> worker=1
I, [2017-11-22T11:11:53.717004 #6757]  INFO -- : worker=1 ready
W, [2017-11-22T11:12:13.749960 #6757]  WARN -- : #<Unicorn::HttpServer:0x000000000140a320>: worker (pid: 6757) exceeds memory limit (494220800.0 bytes > 363311097 bytes)
W, [2017-11-22T11:12:13.750252 #6757]  WARN -- : Unicorn::WorkerKiller send SIGQUIT (pid: 6757) alive: 14 sec (trial 1)
I, [2017-11-22T11:12:14.839219 #673]  INFO -- : reaped #<Process::Status: pid 6757 exit 0> worker=1
I, [2017-11-22T11:12:16.958801 #6774]  INFO -- : worker=1 ready

config.ru still contains references to the environment variables, but it looks like the variable is ignored. Did something change in how the process is started or the environment is passed?
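One generic way to check on Linux (not something from this thread) is to read the unicorn master's startup environment from /proc, since a shell's `env` only shows the shell's own environment, not what the daemon was actually started with:

```shell
# Inspect a process's startup environment (Linux-only).
# $$ is the current shell's PID here; replace it with the unicorn master
# PID (583 or 601 in the listings above) to see unicorn's environment.
tr '\0' '\n' < "/proc/$$/environ" | grep GITLAB_UNICORN_MEMORY \
  || echo "GITLAB_UNICORN_MEMORY_* not set for this process"
```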

mikehaertl commented 6 years ago

Same for me, the fix only helped for some time. But now it's slow again :cry:.

Any ideas how to reliably fix this in newer releases?

nixel2007 commented 6 years ago

@mikehaertl in our tests GitLab 10.1+ consumes more memory (~4GB) than previous versions (~2GB), so we had to increase the RAM on the server.

mikehaertl commented 6 years ago

Seriously? That's hard to believe, as our GitLab installation is really rather small (a few hundred issues with a couple of users and groups).

Is there no way to limit this again? Everything worked well a few releases ago, and I can't see any new feature that would justify such an increase in required RAM.

chaosversum commented 6 years ago

I'm currently running GitLab 10.1.4 with GITLAB_UNICORN_MEMORY_MAX=629145600 just fine.

Maybe GitLab 10.1 uses even more initial memory per worker than before, so you have to increase the setting or the workers keep restarting all the time.

It seems to be a memory-leak issue related to GitLab/Ruby.

mikehaertl commented 6 years ago

@chaosversum Oh, right, thanks. I didn't consider increasing the max setting even further. It seems to improve the situation.

lekoder commented 6 years ago

Perhaps it would be prudent to increase default value again.

ghost commented 6 years ago

How would this be set in gitlab.rb for non-docker installs?

mikehaertl commented 6 years ago

I'm on 10.3.0 now and things are incredibly slow again :(. This time I can't see any entries in unicorn.stderr.log. Any further ideas?

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had any activity for the last 60 days. It will be closed if no further activity occurs during the next 7 days. Thank you for your contributions.