gearman / gearmand

http://gearman.org/

memory leak #284

Open soul-rise opened 4 years ago

soul-rise commented 4 years ago

Hi. If you queue a large number of large tasks and then run a worker that processes all of them, gearmand does not free the memory. Gearman version 1.1.19.1.

  1. docker-compose up -d
  2. docker stats gearman
  3. MEM USAGE is 1.441MiB

In another window:

  1. docker-compose exec hc hs
  2. php s.php // start sending tasks, wait 10s, then stop it

I can see MEM USAGE rise to 1.596GiB.

  1. php w.php
  2. wait until the queue is empty

I can see MEM USAGE is 705MiB.

Attachment: gearman_leak.zip

p-alik commented 4 years ago

I guess your worker doesn't consume anything. You provide a function to addFunction, but according to the documentation:

Registers a function name with the job server and specifies a callback corresponding to that function

soul-rise commented 4 years ago

If I check gearadmin --status, I can see 0 tasks in the queue after starting the worker.
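For readers following along, here is a minimal sketch of reading the queue depth from that status output. It assumes the usual admin-protocol column order (function name, total jobs, running jobs, available workers); the helper name queued_jobs is made up for illustration and is not part of gearman.

```sh
#!/bin/sh
# Hypothetical helper: print the "total jobs" column for one function,
# reading `gearadmin --status` output on stdin.
# Assumed columns: name, total, running, available workers.
queued_jobs() {
    awk -v fn="$1" '$1 == fn { print $2 }'
}

# Sample status line in the same shape gearadmin prints it.
printf 'test_queue\t3300\t0\t0\n.\n' | queued_jobs test_queue   # prints 3300

# Against a live server (needs a running gearmand, so left commented out):
# gearadmin --status | queued_jobs test_queue
```

A value of 0 here only says the queue is drained; it says nothing about how much memory the daemon still holds.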

p-alik commented 4 years ago

I can't run your PHP code, and I couldn't reproduce the issue with a similar Perl implementation. Is there any evidence that gearmand itself consumes the memory?
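One way to gather that evidence is to look at the daemon process directly rather than at the container total. The sketch below assumes a Linux host or container with /proc mounted; the helper name rss_kib is hypothetical.

```sh
#!/bin/sh
# Hypothetical helper: print the resident set size (VmRSS, in kB) of one
# process by reading /proc/<pid>/status. Linux only.
rss_kib() {
    awk '/^VmRSS:/ { print $2 }' "/proc/$1/status"
}

rss_kib "$$"    # demo: RSS of the current shell

# For the daemon (assumes pidof is available where gearmand runs):
# rss_kib "$(pidof gearmand)"
```

Sampling this before queuing, after queuing, and after the worker drains the queue gives three numbers that can be compared directly.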

soul-rise commented 4 years ago

docker-compose.yml

version: '3'
services:

  #Gearman Service
  gearman:
    image: artefactual/gearmand:1.1.19.1-alpine
    container_name: gearman
    restart: always
    ports:
      - '4730:4730'

  1. docker stats gearman: MEM USAGE / LIMIT - 1.223MiB / 1.944GiB
  2. docker-compose exec gearman sh
  3. for i in `seq 1 3300`; do cat /dev/urandom | tr -dc 'a-zA-Z0-9' | head -c500000 | gearman -f test_queue -b; done
  4. gearadmin --status: test_queue 3300 0 0
  5. docker stats gearman: MEM USAGE / LIMIT - 1.553GiB / 1.944GiB
  6. gearman -w -f test_queue >> /dev/null // don't stop it
  7. gearadmin --status: test_queue 0 0 1
  8. docker stats gearman: MEM USAGE / LIMIT - 1.557GiB / 1.944GiB

When I stop the "gearman -w -f test_queue" command, gearmand frees the memory.

Is this how memory is supposed to be freed?

I think the cause may be in the PHP library: it does not close its connections to gearmand.

soul-rise commented 4 years ago

I checked the connections with the PHP scripts: the connection was closed, but the memory was only partially freed.

Can I collect additional information to help clarify the situation? It reproduces in 99% of cases with the PHP scripts.

p-alik commented 4 years ago

Is this how memory is supposed to be freed?

I would say no. @SpamapS, @esabol, it looks like gearmand has a small memory leak:

$ ps -C gearmand
  PID TTY          TIME CMD
 7060 pts/13   00:00:00 gearmand
$ pmap 7060 | tail -n1
 total           429824K
$ for i in `seq 1 3300`; do cat /dev/urandom | tr -dc 'a-zA-Z0-9' | head -c500000 | /opt/devel/Gearman/gearmand/bin/gearman -f test_queue -b; done
$ pmap 7060 | tail -n1
 total          2053952K
$ /opt/devel/Gearman/gearmand/bin/gearadmin --status
test_queue      3300    0       0
.
$ /opt/devel/Gearman/gearmand/bin/gearman -w -f test_queue > /dev/null
^Z
[1]+  Stopped                 /opt/devel/Gearman/gearmand/bin/gearman -w -f test_queue > /dev/null
$ bg
[1]+ /opt/devel/Gearman/gearmand/bin/gearman -w -f test_queue > /dev/null &
$ /opt/devel/Gearman/gearmand/bin/gearadmin --status
test_queue      0       0       1
.
$ pmap 7060 | tail -n1
 total           430352K
$ fg
/opt/devel/Gearman/gearmand/bin/gearman -w -f test_queue > /dev/null
^C
$ pmap 7060 | tail -n1
 total           430352K

Memory usage increased by 528K.
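The figure follows from the pmap totals in the transcript above. As a quick check, here is a small sketch of the subtraction; the helper name mem_delta_kib is made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper: difference between two `pmap <pid> | tail -n1`
# totals, in K. Accepts values with or without the trailing "K".
mem_delta_kib() {
    before=${1%K}
    after=${2%K}
    echo $((after - before))
}

mem_delta_kib 429824K 430352K    # baseline vs. after the queue drained: 528
mem_delta_kib 429824K 2053952K   # baseline vs. queue full: 1624128
```

So nearly all of the roughly 1.6 GB mapped while the queue was full had been returned by the time the worker emptied it, leaving the 528K difference.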

esabol commented 4 years ago

Do you think this could be reproduced with Perl clients/workers, @p-alik ?

Could one of you try running gearmand in valgrind, repeating this test, and posting the valgrind log here? I think there's a valgrind target in the top-level gearmand Makefile.... Maybe that would help in tracking down where in the code the leak is.

esabol commented 4 years ago

Actually, I recommend starting with make valgrind-supressions [sic].

Refer to my post here: https://github.com/gearman/gearmand/issues/277#issuecomment-591704445

p-alik commented 4 years ago

I'm afraid I can't do it on my local environment:

$ make valgrind --debug
GNU Make 4.1
Built for x86_64-pc-linux-gnu
Copyright (C) 1988-2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Reading makefiles...
Updating goal targets....
 File 'valgrind' does not exist.
Must remake target 'valgrind'.
make check LOG_COMPILER="/opt/devel/Gearman/gearmand/libtool --mode=execute valgrind --tool=memcheck --error-exitcode=1 --leak-check=yes --track-fds=yes --malloc-fill=A5 --free-fill=DE --fullpath-after=."
Successfully remade target file 'valgrind'.

The same result for make valgrind-suppressions. In #177 I did it with ./configure --enable-debug, which doesn't help further.

esabol commented 4 years ago

make valgrind, make valgrind-supressions, and make helgrind only echo the commands for running gearmand under valgrind. You need to copy the command that is echoed and then execute it yourself.

Or something like that. 😊

p-alik commented 4 years ago

valgrind --leak-check=full --show-reachable=yes --error-limit=no --gen-suppressions=all --log-file=minimalraw.log /opt/devel/Gearman/gearmand/gearmangearmand

produced minimalraw.log

SpamapS commented 4 years ago

It's very possible that there is a leak, but valgrind will only show it if you actually hit it. So try to follow the instructions to get gearmand running under valgrind, then repeat the client behavior and see if you can find the leak.

On Wed, Jul 15, 2020 at 11:38 AM Алексей Пастухов notifications@github.com wrote:

valgrind --leak-check=full --show-reachable=yes --error-limit=no --gen-suppressions=all --log-file=minimalraw.log /opt/devel/Gearman/gearmand/gearmangearmand produced minimalraw.log https://github.com/gearman/gearmand/files/4927258/minimalraw.log


p-alik commented 4 years ago

@SpamapS, @esabol, do you have any thoughts on valgrind-output minimalraw.log?

esabol commented 4 years ago

I don't have a lot of experience interpreting valgrind logs, but it seems to be indicating there's a 40-byte leak per call to gearmand_create()? But maybe that's to be expected, I don't know. It's actually allocated by libevent?

But that's nowhere near the size of the leak that @soul-rise reported. @soul-rise, can you try to run gearmand under valgrind, reproduce your test case, and upload the valgrind log?

esabol commented 4 years ago

This message thread about a libevent memory leak shows a very similar valgrind log (to minimalraw.log):

https://libevent-users.monkey.narkive.com/Qemtm7oP/libevent-memory-leak

The libevent developer there said that event_global_setup_locks_ is a one-time allocation. Also, if libevent were configured with --disable-debug-mode, it would eliminate at least one of the allocations.

Anyway, it seems these 40-byte allocations are expected of libevent.

esabol commented 4 years ago

@soul-rise, can you try to run gearmand under valgrind, reproduce your test case, and upload the valgrind log?

@soul-rise, are you willing to help track this down?

esabol commented 4 years ago

This looks like it could be a memory leak to me:

https://github.com/gearman/gearmand/blob/8f58a8b439ee206eec732c50de982b05e2d40b27/libgearman-server/job.cc#L249-L260

Why does it set server_job->data to NULL on line 257 before calling gearman_server_job_free(server_job)?

I could see setting server_job->data to NULL before calling gearman_server_job_free(server_job) if some other structure owns the data pointer (packet perhaps? data seems to originate there) and is responsible for freeing it, so as to prevent a double free. But then why doesn't it do that later in the same function (right before line 277)? It seems to me that data should be handled the same way in these two places: either set it to NULL before calling gearman_server_job_free(server_job) in both, or in neither.

https://github.com/gearman/gearmand/blob/8f58a8b439ee206eec732c50de982b05e2d40b27/libgearman-server/job.cc#L276-L278