Open ddelabru opened 4 years ago
lftp cannot parse recursive listings yet. To speed things up, you can try these settings:
set ftp:sync-mode off
set ftp:use-stat-for-list on
Thu, Jan 2, 2020, 18:57 Dominic Delabruere notifications@github.com:
I'm trying to use LFTP (v4.8.4, on Fedora 31) in a script where I need to obtain a listing of all the full file paths on a remote FTP server. The obvious choice for this task is the find LFTP command, and the output of this command does have everything I need in an easy-to-use format, but it takes 58 minutes to complete! In the same environment, the ls -R LFTP command completes in ~37.5 seconds, but it seems the FTP server cannot be coerced into displaying a "short" file listing or otherwise formatting the output in a way that is easier to use. The cls LFTP command does not appear to have a recursive mode.
I'm not sure whether it's relevant, but the FTP server I'm crawling does not seem to support the MLSD command.
Is there a way I can obtain output in the format of find at the kind of speed provided by ls, without parsing the long-form directory listings myself?
Thanks! I ended up parsing the recursive listing myself after all, since someone shared a helpful regex with me...
After looking at the code for the find command I understand now why this is tricky: if the FTP server doesn't support recursive listings you have to send repeated CWD commands, and that takes a lot of time. On top of that, the recursive listings are not guaranteed to follow the same format across different FTP servers. I'd be willing to try to contribute code, but I don't think I know a better approach to implement. The best I can think of is trying the kind of parsing approach I'm using, then falling back to the current behavior if the format is not as expected or if the server does not support recursive listings, but that might be more complexity than is desirable.
Have you tried these settings?
set ftp:sync-mode off
set ftp:use-stat-for-list on
I just tried this list of commands in an lftp session:
set ftp:sync-mode off
set ftp:use-stat-for-list on
find
(Actually, the find command is still running in the background.) I can tell it's still quite a bit slower than ls -R, but I've only let it run for a few minutes so far, so I can't tell you exactly how long it takes to complete.
Alright, after timing a full run of that set of commands (with time lftp -e "set ftp:sync-mode off ; set ftp:use-stat-for-list ; find" ftp.redhat.com) I have a figure of ~50 minutes, for the same server described previously.
You probably missed the value for use-stat-for-list.
Ah, yes, I included it on my first try, when I was using lftp in interactive mode, but forgot it when I ran time. I am running time lftp -e "set ftp:sync-mode off ; set ftp:use-stat-for-list on ; find" ftp.redhat.com now and will post the results when it completes.
That did cut the time down significantly from the default behavior of find:
real 19m25.566s
user 2m34.084s
sys 0m6.495s
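Since the missing value for use-stat-for-list cost a 50-minute run earlier in this thread, a script that drives lftp could check the settings before launching. The sketch below is one hypothetical way to do that: build_lftp_find_command is an invented helper name, and it only assembles the same lftp -e "..." host invocation used above without running it.

```python
import shlex


def build_lftp_find_command(host, settings):
    """Return an argv list for: lftp -e "set ... ; find" host.

    Raises ValueError if any setting lacks a value, which is the mistake
    made in the ~50 minute run earlier in this thread.
    """
    parts = []
    for name, value in settings.items():
        if value is None or str(value).strip() == "":
            raise ValueError(f"setting {name!r} has no value")
        parts.append(f"set {name} {value}")
    parts.append("find")
    return ["lftp", "-e", " ; ".join(parts), host]


argv = build_lftp_find_command(
    "ftp.redhat.com",
    {"ftp:sync-mode": "off", "ftp:use-stat-for-list": "on"},
)
# Show the command in shell-quoted form; pass argv to subprocess.run to execute it.
print(shlex.join(argv))
```

Returning an argv list rather than a single string avoids a second layer of shell quoting when the command is handed to subprocess.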