Open Andrei-Pozolotin opened 4 years ago
First of all, this server does not support MLSx commands. You can verify this by adding logging.basicConfig(level=logging.DEBUG) before your code.
DEBUG:asyncio:Using selector: EpollSelector
INFO:aioftp.client:220
INFO:aioftp.client:USER anonymous
INFO:aioftp.client:331 Anonymous access allowed, send identity (e-mail name) as password.
INFO:aioftp.client:PASS anonymous@anonymous.host
INFO:aioftp.client:230 User logged in.
INFO:aioftp.client:TYPE I
INFO:aioftp.client:200 Type set to I.
INFO:aioftp.client:EPSV
INFO:aioftp.client:229 Entering Extended Passive Mode (|||37882|)
INFO:aioftp.client:MLSD /
INFO:aioftp.client:500 'MLSD /': command not understood.
INFO:aioftp.client:TYPE I
INFO:aioftp.client:200 Type set to I.
INFO:aioftp.client:EPSV
INFO:aioftp.client:229 Entering Extended Passive Mode (|||37883|)
INFO:aioftp.client:LIST /
INFO:aioftp.client:125 Data connection already open; Transfer starting.
...
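For anyone who wants to reproduce a log like the one above, here is a minimal sketch using only the standard logging module. The logger name `aioftp.client` is taken from the log lines themselves; the helper name `enable_ftp_debug_logging` is made up for illustration.

```python
import logging


def enable_ftp_debug_logging() -> logging.Logger:
    """Turn on verbose output so each FTP command and reply is logged."""
    # Root configuration; this is a no-op if the root logger already has handlers.
    logging.basicConfig(level=logging.DEBUG)
    # Set the level on the client logger explicitly so it is verbose either way.
    logger = logging.getLogger("aioftp.client")
    logger.setLevel(logging.DEBUG)
    return logger


# Call this once, before creating the FTP client.
enable_ftp_debug_logging()
```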
Then you can see (via extra logging or Wireshark) that the server does send actual data with the file entries:
INFO:aioftp.client:125 Data connection already open; Transfer starting.
b'09-12-12 12:06PM <DIR> aspnet_client\r\n'
b'09-12-12 08:52AM <DIR> atsactivity\r\n'
b'01-29-08 08:08AM <DIR> ClosingCross\r\n'
b'01-29-08 08:08AM <DIR> Downloads\r\n'
b'01-29-08 08:08AM <DIR> ETFData\r\n'
b'01-29-08 08:08AM <DIR> MonthlyShareVolume\r\n'
b'01-29-08 08:08AM <DIR> OpeningCross\r\n'
b'06-30-10 01:29PM <DIR> OrderExecutionQuality\r\n'
b'06-30-10 01:29PM <DIR> OrderExecutionQualityBX\r\n'
b'11-30-10 02:44PM <DIR> OrderExecutionQualityPSX\r\n'
b'09-23-08 07:34PM <DIR> phlx\r\n'
b'09-12-12 09:22AM <DIR> SymbolDirectory\r\n'
b''
INFO:aioftp.client:226 Transfer complete.
But the problem is in the parsing part. I'm not a fan of the LIST command, since it has no strict format; it is meant for humans. This has been discussed a lot in the issues, and each time it blows up I get more confirmation that this command should not be used at all. If, and only if, @jw4js has the time and energy to invest in updating the LIST parsing routine, then this will be fixed. Historically, the idea behind aioftp was not to use LIST at all. Sorry for that, but legacy bites.
I've just released version 0.14.0, so you now have the option to force your own parsing routine: https://aioftp.readthedocs.io/client_api.html#aioftp.Client
@pohmelie Nikita:
thank you so much for the fix, it works (see below)
may I suggest a few other corrections to the project:
there is a typo: https://aioftp.readthedocs.io/client_api.html#aioftp.Client
“modify”, “type”, “type”, “size”
should read
“modify”, “type”, “size”
please use the standard time.time UTC float timestamp representation for info['modify']: https://docs.python.org/3.8/library/time.html#time.time
please synchronize file modification time stamp upon transfer, so the following works:
await client.download(source=file_src, destination=file_dst, write_into=True)
assert info['modify'] == os.path.getmtime(file_dst) # TODO
please rename pohmelie -> Nikita_Melentev
as they say: showing more respect for higher authority brings better karma to one's children :-)
import re
import os
import time
import aioftp
import asyncio
import pathlib
from urllib.parse import urlparse
from typing import Tuple, Mapping
from datetime import datetime
this_dir = os.path.dirname(__file__)
temp_dir = f"{this_dir}/tempdir"

def ftp_std_stamp(stamp: str) -> str:
    "convert remote stamp into aioftp format"
    return datetime.strptime(stamp, "%m-%d-%y%I:%M%p").strftime("%Y%m%d%H%M%S")

def ftp_line_parser(list_line: bytes) -> Tuple[pathlib.Path, Mapping]:
    """
    parse ftp list lines such as:
    b'12-30-19 03:00AM

async def ftp_win_nt_list(remote_url: str, remote_path: str) -> None:
    "verify parse_list_line_custom"
    remote_bag = urlparse(remote_url)
    ftp_host = remote_bag.hostname
    ftp_port = remote_bag.port or aioftp.DEFAULT_PORT
    ftp_user = remote_bag.username or "anonymous"
    ftp_pass = remote_bag.password or "anonymous@anonymous.host"
    session = aioftp.ClientSession(
        host=ftp_host,
        port=ftp_port,
        user=ftp_user,
        password=ftp_pass,
        parse_list_line_custom=ftp_line_parser,
    )
    async with session as client:
        entry_list = await client.list(path=remote_path)
        assert len(entry_list) > 0
        for path, info in entry_list:
            print(path, info)
            await ftp_file_download(client, path, info)

async def ftp_file_download(client, path, info) -> None:
    if info['type'] == "file" and info['size'] <= 1024:
        print(f"ftp_file_download: {path}")
        file_src = path
        file_dst = f"{temp_dir}/{path}-{time.time()}"
        assert not os.path.exists(file_dst)
        await client.download(source=file_src, destination=file_dst, write_into=True)
        assert os.path.exists(file_dst)
        assert info['size'] == os.path.getsize(file_dst)

remote_url = "ftp://ftp.nasdaqtrader.com"
remote_path = "/SymbolDirectory"
asyncio.run(ftp_win_nt_list(remote_url, remote_path))
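A parser for DOS-style LIST lines like the ones in the log might look roughly like the sketch below. This is not the ftp_line_parser from the post, whose body is cut off above. The name `parse_dos_list_line`, the regex, the integer size, and reporting size 0 for directories are all assumptions made for illustration.

```python
import pathlib
import re
from datetime import datetime
from typing import Mapping, Tuple

# Assumed line shape: date, time, then either <DIR> or a byte count, then the name.
DOS_LINE = re.compile(
    r"^(?P<date>\d{2}-\d{2}-\d{2})\s+(?P<time>\d{2}:\d{2}[AP]M)\s+"
    r"(?P<size><DIR>|\d+)\s+(?P<name>.+)$"
)


def parse_dos_list_line(list_line: bytes) -> Tuple[pathlib.PurePosixPath, Mapping]:
    """Parse one DOS-style LIST line into (path, info)."""
    text = list_line.decode("utf-8").rstrip("\r\n")
    match = DOS_LINE.match(text)
    if match is None:
        raise ValueError(f"unrecognized LIST line: {text!r}")
    # Same stamp conversion as ftp_std_stamp: "%m-%d-%y%I:%M%p" -> "%Y%m%d%H%M%S".
    stamp = datetime.strptime(match["date"] + match["time"], "%m-%d-%y%I:%M%p")
    is_dir = match["size"] == "<DIR>"
    info = {
        "type": "dir" if is_dir else "file",
        "size": 0 if is_dir else int(match["size"]),  # assumption: 0 for directories
        "modify": stamp.strftime("%Y%m%d%H%M%S"),
    }
    return pathlib.PurePosixPath(match["name"]), info
```

Such a function could then be passed as parse_list_line_custom=parse_dos_list_line, in the same way ftp_line_parser is passed to aioftp.ClientSession in the snippet above.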
may I suggest few other corrections to the project
Feel free to make a pull request. I fixed the typo with the doubled "type".
please use time.time standard utc float timestamp representation for info['modify']
Not sure if I got you right, but all MLSx facts are strings. Moreover, there is a pretty strict specification for the modify field, and it is not a UTC float timestamp: https://tools.ietf.org/html/rfc3659#section-2.3
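If a float timestamp is needed on the caller's side, the modify string can be converted after the fact. A minimal sketch, assuming the value follows the RFC 3659 time-val form (YYYYMMDDHHMMSS with optional fractional seconds, in UTC); the function name `modify_fact_to_timestamp` is hypothetical and not part of aioftp.

```python
from datetime import datetime, timezone


def modify_fact_to_timestamp(modify: str) -> float:
    """Convert an RFC 3659 style time-val string into a POSIX timestamp."""
    # Split off optional fractional seconds, e.g. "20120912120600.250".
    base, _, fraction = modify.partition(".")
    moment = datetime.strptime(base, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    seconds = moment.timestamp()
    if fraction:
        seconds += float("0." + fraction)
    return seconds
```

Note that a value produced by a custom LIST parser only matches this if the server's listing times really are UTC, which a DOS-style listing does not guarantee.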
please synchronize file modification time stamp upon transfer, so the following works
This is a good point. Not sure if it is a major issue (since no one uses file modification/creation times at all), but I agree with you. Feel free to make a PR.
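Until something like that lands in the library, the modification time can be applied by hand after a download. A rough sketch using os.utime from the standard library; the helper name `apply_remote_mtime` is made up, and it assumes info['modify'] is a YYYYMMDDHHMMSS string in UTC.

```python
import os
from datetime import datetime, timezone


def apply_remote_mtime(local_path: str, modify: str) -> float:
    """Set the local file's mtime (and atime) from a modify string."""
    # Assumption: modify is "YYYYMMDDHHMMSS" in UTC; fractional seconds are ignored.
    moment = datetime.strptime(modify[:14], "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    stamp = moment.timestamp()
    os.utime(local_path, (stamp, stamp))
    return stamp
```

Calling apply_remote_mtime(file_dst, info['modify']) right after client.download(...) should make os.path.getmtime(file_dst) return the same value the helper returns.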
please rename pohmelie -> Nikita_Melentev
This is irrelevant to aioftp.
For example, with the following snippet, client.list() returns an empty list for a Windows NT FTP server:
async def ftp_list(remote_url: str):
    remote_bag = urlparse(remote_url)
    ftp_host = remote_bag.hostname
    ftp_port = remote_bag.port or aioftp.DEFAULT_PORT
    ftp_user = remote_bag.username or "anonymous"
    ftp_pass = remote_bag.password or "anonymous@anonymous.host"
    session = aioftp.ClientSession(
        host=ftp_host,
        port=ftp_port,
        user=ftp_user,
        password=ftp_pass,
    )
    async with session as client:
        entry_list = await client.list(path="/")
        print(entry_list)
        for path, info in entry_list:
            print(path, info)

remote_url = "ftp://ftp.nasdaqtrader.com"
asyncio.run(ftp_list(remote_url))
ftp> help
Commands may be abbreviated. Commands are:

!           dir         macdef      proxy       site
$           disconnect  mdelete     sendport    size
account     epsv4       mdir        put         status
append      form        mget        pwd         struct
ascii       get         mkdir       quit        system
bell        glob        mls         quote       sunique
binary      hash        mode        recv        tenex
bye         help        modtime     reget       trace
case        idle        mput        rstatus     type
cd          image       newer       rhelp       user
cdup        ipany       nmap        rename      umask
chmod       ipv4        nlist       reset       verbose
close       ipv6        ntrans      restart     ?
cr          lcd         open        rmdir
delete      lpwd        passive     runique
debug       ls          prompt      send