rsmills36 / pyftpdlib

Automatically exported from code.google.com/p/pyftpdlib
Other
0 stars 0 forks source link

memory leak in Multiprocess mode #263

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. python multi_proc_ftp.py
2. python ftp_test.py (ftp.login, ftp.pwd, ftp.quit)
3. after a few minutes & run `top` cmd

What is the expected output?

What do you see instead?

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                            
 4821 root      20   0  496m 446m 3640 R   46 11.3  15:02.19 python

What version of pyftpdlib are you using? On what operating system? Which Python 
version?

pyftpdlib 1.2.0
CPython 2.7.3
Fedora 18 x86_64

Please provide any additional information below.

Original issue reported on code.google.com by timebug....@gmail.com on 12 Jun 2013 at 1:59

Attachments:

GoogleCodeExporter commented 9 years ago
Before the memory leak I get an IOError: [Errno 24] Too many open files.

After setting the number of open files to unlimited I can reproduce the issue.

I think both things may be related.

Original comment by useboxnet on 13 Jun 2013 at 7:53

GoogleCodeExporter commented 9 years ago
I forgot:

pyftpdlib 1.2.0
Python 2.7.3
Ubuntu 12.04 (Precise) x86_64

I can reproduce the problem in Debian Sqeeze with Python 2.6.6.

Original comment by useboxnet on 13 Jun 2013 at 8:07

GoogleCodeExporter commented 9 years ago
I've found the file descriptor leak, but it didn't fix the memory leak I'm 
afraid.

I'm attaching a patch, I hope it helps. Please read it for details.

Thanks!

Original comment by useboxnet on 13 Jun 2013 at 11:11

Attachments:

GoogleCodeExporter commented 9 years ago
Ah, the patch doesn't work with the thread based server. Besides the descriptor 
leak is not present in the threaded server because there's not fork() (and the 
descriptor is not duplicated).

I've attached a new patch.

You may wan to implement it in a different way but because the task needs to 
start before we can release resources in the main process, I've used the pid 
attribute of Process (not available in Thread) to detect if a process was 
spawned.

Original comment by useboxnet on 13 Jun 2013 at 12:10

Attachments:

GoogleCodeExporter commented 9 years ago
I've found the memory leak!

_SpawnerBase._active_tasks grows forever. The _active_tasks.remove in 
_SpawnerBase._loop won't work with the multiprocessing server because is being 
run in a different process.

The attached patch fixed the issue by removing the finished tasks before adding
a new one *in the main process* (instead of delegating it to one of the
spawned tasks).

Includes the file descriptor leak fix (may not be a good idea but it is a 3 
lines patch).

Regards,

Juan

Original comment by useboxnet on 13 Jun 2013 at 1:00

Attachments:

GoogleCodeExporter commented 9 years ago
Oh, It really works. Thank you very much!

Original comment by timebug....@gmail.com on 13 Jun 2013 at 1:15

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
I patched pyftpdlib in our production systems but I'd be nice to see this fixed 
as it's being used by other projects (such as ftpcloufs) and these are two 
pretty bad leaks :)

EDIT: re-post, sent the previous comment by mistake

Original comment by useboxnet on 16 Jul 2013 at 1:28

GoogleCodeExporter commented 9 years ago
Sorry for replying so late, at the time this was first submitted I was 
traveling and then I forgot about this issue.
Juan I committed your patch as as of r1229.
I hope I will be able to produce a new release soon as this is kind of high 
priority.

Original comment by g.rodola on 16 Jul 2013 at 2:31

GoogleCodeExporter commented 9 years ago
Thanks a lot Giampaolo.

Original comment by useboxnet on 16 Jul 2013 at 2:35

GoogleCodeExporter commented 9 years ago

Original comment by g.rodola on 8 Nov 2013 at 7:52