troglobit / uftpd

FTP/TFTP server for Linux that just works™
https://troglobit.com/projects/uftpd/
ISC License

Huge amount of zombie processes cause system hang #32

Open ArthurDents opened 4 years ago

ArthurDents commented 4 years ago

Hi

I have discovered an issue with uftpd which causes a huge number of zombie processes. Eventually I lose contact with the target and a hard reset (power off/on) is required.

My test setup: I have an FTP client that opens an FTP connection to uftpd, and every 5 seconds the client uploads a number of log files (the log files are constantly being updated on the target). I do not disconnect between transfers, i.e. the FTP connection stays open for the whole session. This can run fine for days or even a week, but sometimes zombie processes build up, and each time my FTP client tries to upload files I get more zombie processes.

E.g.:

# ps aux | grep defunct
root     14011  0.0  0.0      0     0 ?        Z 13:21   0:00 [uftpd] <defunct> 
root     14012  0.0  0.0      0     0 ?        Z 13:21   0:00 [uftpd] <defunct>
root     14013  0.0  0.0      0     0 ?        Z 13:22   0:00 [uftpd] <defunct>
root     14014  0.0  0.0      0     0 ?        Z 13:22   0:00 [uftpd] <defunct>
root     14024  0.0  0.0      0     0 ?        Z 13:22   0:00 [uftpd] <defunct>
root     14025  0.0  0.0      0     0 ?        Z 13:22   0:00 [uftpd] <defunct>
root     14030  0.0  0.0      0     0 ?        Z 13:22   0:00 [uftpd] <defunct>
root     14031  0.0  0.0      0     0 ?        Z 13:22   0:00 [uftpd] <defunct>
...

From console there are ftp error messages:

...
Failed accepting FTP client connection. Error 11: Resource temporarily unavailable
Failed accepting FTP client connection. Error 11: Resource temporarily unavailable
Failed accepting FTP client connection. Error 11: Resource temporarily unavailable
Failed accepting FTP client connection. Error 11: Resource temporarily unavailable 
...

Eventually there are around 11000-12000 zombie processes, and I lose contact with the unit. E.g. trying to run a simple ls command:

# ls
-sh: fork: retry: Resource temporarily unavailable
-sh: fork: retry: Resource temporarily unavailable
-sh: fork: retry: Resource temporarily unavailable

I haven't figured out what triggers this. It may be a network issue, since it happened on two units at the same time, but this has not been verified.

Any idea what is causing this?

I am using version 2.13.

Configuration: uftpd -n -l err -o ftp=9013,tftp=0 /mnt/ramdisk

BR AD

troglobit commented 4 years ago

Yeah, that doesn't look good :-/

I'm running uftpd as an inetd service myself, so I haven't seen this unfortunately. I'll try to get some time to look at it during the weekend.

troglobit commented 4 years ago

Hi again,

for quite some time now I've tried replicating your bug report. I'm running a server with:

uftpd -n -l debug -o ftp=9013,tftp=0,writable /foo

and a client that repeatedly uploads a file every five seconds, without shutting down the connection:

lftp -p 9013 localhost -e 'repeat -d 5 put README.md'

Unfortunately I've had no luck so far replicating your problem. Is there anything else you could tell me about your case? E.g.:

ArthurDents commented 4 years ago

Hi, thank you for looking into this. This is a tricky one, and I am afraid I do not have much additional information. The client is a custom Windows application which has been used for many years. The files range from a few kB to 200 kB, the units are connected to a 100 Mbit LAN, and the upload time is fast. The target is an ARM processor (cross-compiled). If I remember correctly, the client still receives the files when the zombie processes start to appear. BR AD

troglobit commented 4 years ago

OK, since this seems to be non-trivial, I'll unfortunately have to put it on the back burner. I have a few embedded ARM targets which I can test this on, but I'm not sure when I'll have time to try and reproduce this specific case. Sorry! :-/

First order of business though is to port uftpd to myLinux, which is my testing ground for most of my projects. That'll probably happen during the upcoming weekend.

crisdawn commented 10 months ago

I also see this problem when I compile uftpd for aarch64 and run it on an aarch64 Ubuntu PC. As the client I simply use the tftp command on an x86 Ubuntu PC. [image]

troglobit commented 10 months ago

That's unfortunate. If you have a way of reliably reproducing this, it would be very much appreciated.

ArthurDents commented 5 months ago

A colleague of mine has looked into this and made the following change: in ftpcmd.c, line 1682, return -1; has been changed to exit(-1);

...
fail:
    free(ctrl);
    shutdown(sd, SHUT_RDWR);
    close(sd);

    exit(-1); //return -1;
...

This seems to fix the "zombie-problem". We are using version 2.13.

BR AD

troglobit commented 5 months ago

@ArthurDents wow, that's amazing, nice catch!

I'm left wondering if _exit(1) would be enough to fix the issue ...
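
For context on the exit(-1) versus _exit(1) question, here is a minimal, self-contained sketch of the general fork-per-client pattern. It is not uftpd's code: session_setup(), spawn_session() and reap() are hypothetical names invented for this illustration. The point it shows is that a forked child has to terminate on its failure path; if it returned instead, it would carry on executing the caller's code as a second copy of the server.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a per-client session that fails during setup. */
static int session_setup(void)
{
    return -1;
}

/*
 * Fork a worker for one client. Returns the child's PID to the parent,
 * or -1 if fork() failed. The child never returns from this function.
 */
static pid_t spawn_session(void)
{
    pid_t pid = fork();

    if (pid != 0)
        return pid;     /* parent, or fork() error (-1) */

    /* Child: on failure, terminate here instead of returning. */
    if (session_setup() < 0)
        _exit(1);

    _exit(0);
}

/* Reap one child and return its exit status, or -1 on error. */
static int reap(pid_t pid)
{
    int status;

    if (pid <= 0)
        return -1;      /* nothing to wait for */

    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In the parent, reap(spawn_session()) yields 1 here, the status passed to _exit(). Two general differences between the calls: exit(-1) is reported to the parent as 255, since only the low 8 bits of the status are kept, and exit() also runs atexit() handlers and flushes stdio buffers inherited from the parent, which _exit() skips.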