lavv17 / lftp

sophisticated command line file transfer program (ftp, http, sftp, fish, torrent)
http://lftp.yar.ru
GNU General Public License v3.0

Lftp 4.6.3a, stuck while pushing a file #269

Open homer242 opened 8 years ago

homer242 commented 8 years ago

Hi,

I found a problem with lftp version 4.6.3a. Looking over the commit history, I suspect the problem is still present in more recent versions, unless the fix "ftp: fixed ls freezing with unstable server connection." solved it.

I have been using lftp on embedded devices for 2 years, and this is the first time I have seen this problem. My devices record files and I use lftp to push them to my FTP server. A few days ago, one of these devices failed to send a file.

As far as I understand (I wasn't on the console when the problem appeared), the device tried to push a file to its FTP server while the internet connection was bad, so lftp exited. The device then retried sending the file, and lftp has been stuck in that retry ever since.

I have a cron job to retry failed transfers:

*/10 * * * * nice -n 19 ftp-push flush

In the ftp-push script, a flush uses flock to make sure that only one process at a time retries pushing the failed files. The script also generates an lftp script and uses a FIFO to redirect lftp's output to a syslog server, like this:

    LFTP_SCRIPT=$(mktemp -t "lftp.XXXXXX")
    cat > $LFTP_SCRIPT <<EOF
set net:max-retries 1;
set net:timeout 60;
open '$FTP_URL';
rm -f '$FTP_DIR/$TARGET_DIR/$TARGET_FILE';
mirror --no-perms --reverse -f $LOCAL_FILE -O '$FTP_DIR/$TARGET_DIR';
EOF

    # Execute lftp and paste output to syslog
    LFTP_OUTPUT=$(mktemp -u -t "lpi.XXXXXX")
    mkfifo $LFTP_OUTPUT
    $LOGGER < $LFTP_OUTPUT &
    lftp -f $LFTP_SCRIPT > $LFTP_OUTPUT 2>&1
    LFTP_ERR_CODE=$?
    rm $LFTP_OUTPUT

This is how my system works.
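
One possible script-side mitigation, sketched here under assumptions and not a fix for lftp itself, is to put a hard deadline on the lftp call so that a stuck process cannot hold the lock indefinitely. The sketch assumes a `timeout` utility with GNU coreutils syntax is available (older BusyBox builds use `timeout -t SECS` instead); `run_with_deadline` and the 900 second limit are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: run a command, but kill it if it exceeds a deadline.
# Assumes GNU coreutils syntax: timeout DURATION COMMAND [ARG]...
run_with_deadline() {
    deadline=$1
    shift
    timeout "$deadline" "$@"
}

# Usage inside ftp-push (900 s is an arbitrary illustrative limit):
#   run_with_deadline 900 lftp -f "$LFTP_SCRIPT" > "$LFTP_OUTPUT" 2>&1
#   LFTP_ERR_CODE=$?
```

With this wrapper, a transfer that exceeds the deadline returns a non-zero status, so the next cron run could retry it instead of finding the lock still held.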

For the past 3 days, I have been seeing these lines in ps faux, and the process IDs have not changed:

 1875 root     /bin/sh -c nice -n 19 ftp-push flush
 1878 root     {ftp-push} /bin/sh /usr/bin/ftp-push flush
 1884 root     {ftp-push} /bin/sh /usr/bin/ftp-push flush
 1915 root     /usr/bin/logger -s -t /usr/bin/ftp-push[1878] -p local6 notice
 1916 root     lftp -f /tmp/lftp.bPIy9j

If I attach strace to process 1916, I see this:

# strace -p 1916
Process 1916 attached
select(1, NULL, NULL, NULL, {374, 853261}) = 0 (Timeout)
gettimeofday({1473330059, 945343}, NULL) = 0
select(1, NULL, NULL, NULL, {3600, 0}

Below is the list of file descriptors the process has open:

# ls -l /proc/1916/fd
total 0
lrwx------    1 root     root            64 Sep  8 12:23 0 -> /dev/null
l-wx------    1 root     root            64 Sep  8 12:23 1 -> /tmp/lpi.wnFSuS
l-wx------    1 root     root            64 Sep  8 12:23 14 -> /tmp/lock/ftp-push.lock
l-wx------    1 root     root            64 Sep  8 12:23 2 -> /tmp/lpi.wnFSuS
lr-x------    1 root     root            64 Sep  8 12:23 3 -> /root
lr-x------    1 root     root            64 Sep  8 12:23 4 -> /tmp/lftp.bPIy9j
lr-x------    1 root     root            64 Sep  8 12:23 6 -> /mnt/sdcard/records/16245006.txt

You can see the fifo /tmp/lpi.wnFSuS (1 and 2), the file used with flock /tmp/lock/ftp-push.lock (14), the generated lftp script (4) and the file we try to send to the ftp server /mnt/sdcard/records/16245006.txt (6).
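
Note that none of the descriptors listed is a socket, so the process does not appear to hold a network connection while it sleeps. A small sketch for checking this on a live process follows; `count_socket_fds` is a hypothetical helper name, and it assumes a Linux `/proc` layout in which socket descriptors show up as `socket:[inode]` links.

```shell
#!/bin/sh
# Hypothetical helper: count descriptors in an fd directory that are sockets.
# Assumes the Linux /proc/PID/fd layout, where a socket appears as a
# symlink whose target looks like "socket:[12345]".
count_socket_fds() {
    fd_dir=$1
    count=0
    for fd in "$fd_dir"/*; do
        # readlink prints the link target; entries it cannot read are skipped.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            socket:*) count=$((count + 1)) ;;
        esac
    done
    echo "$count"
}

# Usage: count_socket_fds /proc/1916/fd
```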

What do you think about this problem?

lavv17 commented 8 years ago

The news entry for 4.6.5 may be relevant (eb92e7615a72eb94472f8f231d07b85a618dfa6a). Please try the latest version.

Do you have a debug log from the session?
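
A sketch of one way to capture such a log, assuming lftp's built-in `debug` command and its `-o FILE` option behave in this version as described in the man page. The log path and level 9 are illustrative choices, and `write_debug_preamble` is a hypothetical helper that starts the generated script with the debug directive.

```shell
#!/bin/sh
# Hypothetical helper: start an lftp script with a debug directive so the
# session log is written to a file.
# Assumes the lftp command form: debug -o FILE LEVEL
# Note: the log path must not contain spaces, since it is not quoted.
write_debug_preamble() {
    script=$1
    log=$2
    printf 'debug -o %s 9;\n' "$log" > "$script"
}

# Usage in ftp-push, before the other commands are appended with >>:
#   write_debug_preamble "$LFTP_SCRIPT" /tmp/lftp-debug.log
```

Alternatively, running `lftp -d` switches on debugging mode from the command line, in which case the debug output would follow stderr into the existing FIFO.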