genouest / biomaj-download

Download microservice for BioMAJ
GNU Affero General Public License v3.0

Compatibility with non-standard FTP servers #9

Closed duboism closed 4 years ago

duboism commented 5 years ago

FTPDownload.list parses the listing according to the de facto standard for FTP servers. However, some servers present the listing in a completely different format, so they don't work with FTPDownload.

Example:

from biomaj_download.download.ftp import FTPDownload

ftp = FTPDownload("ftp", "test.rebex.net", "/")
ftp.set_credentials("demo:password")

(file_list, dir_list) = ftp.list()

This fails with:

Traceback (most recent call last):
...
    rfile['size'] = int(parts[4])
IndexError: list index out of range

This server uses the MS output style:

$ curl -u demo:password ftp://test.rebex.net/
10-27-15  04:46PM       <DIR>          pub
04-08-14  04:09PM                  403 readme.txt

Such servers are rare, but supporting them should not be too difficult.
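
To give an idea of the effort involved, here is a minimal sketch of parsing the MS-style lines from the curl output with a regular expression. `parse_ms_line` and `MS_LINE` are hypothetical names, not part of biomaj-download, and the pattern only covers the two line shapes shown above.

```python
import re
from datetime import datetime

# Hypothetical pattern for MS-style lines: date, time, "<DIR>" or size, name.
MS_LINE = re.compile(
    r"^(?P<date>\d{2}-\d{2}-\d{2})\s+"
    r"(?P<time>\d{2}:\d{2}[AP]M)\s+"
    r"(?P<size><DIR>|\d+)\s+"
    r"(?P<name>.+)$"
)


def parse_ms_line(line):
    """Return a dict describing one MS-style listing line, or None."""
    match = MS_LINE.match(line.strip())
    if match is None:
        return None
    is_dir = match.group("size") == "<DIR>"
    # MS-style listings use a two-digit year and a 12-hour clock.
    stamp = datetime.strptime(
        match.group("date") + " " + match.group("time"), "%m-%d-%y %I:%M%p"
    )
    return {
        "name": match.group("name"),
        "is_dir": is_dir,
        "size": 0 if is_dir else int(match.group("size")),
        "mtime": stamp,
    }
```

Lines that do not match return None, so a caller could try the UNIX parser first and fall back to this one.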

duboism commented 5 years ago

We propose using the ftputil module, which can parse listings in both UNIX and MS formats.

duboism commented 5 years ago

Another option would be to let the user define parsing rules (similar to what is done with HTTPParse in HTTPDownload), defaulting to UNIX style.

osallou commented 5 years ago

The default should be the basic FTP style, and why not allow optionally specifying parsing rules with regexps?
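
A rough sketch of what such optional rules could look like: a default UNIX-style pattern plus an optional user-supplied one. The names `parse_listing`, `line_regexp` and `UNIX_LINE`, and the named groups, are all hypothetical; none of this is existing biomaj-download API.

```python
import re

# Assumed default: "ls -l" style lines, for example
# -rw-r--r--   1 ftp ftp   403 Apr 08  2014 readme.txt
UNIX_LINE = (
    r"^(?P<perms>[-dl][-rwxsStT]{9})\s+\d+\s+\S+\s+\S+\s+"
    r"(?P<size>\d+)\s+(?P<date>\w{3}\s+\d{1,2}\s+[\d:]+)\s+(?P<name>.+)$"
)


def parse_listing(lines, line_regexp=UNIX_LINE):
    """Split listing lines into (files, dirs) using a regexp with named groups.

    The regexp must define 'name' and 'size'. A directory is detected either
    from a 'perms' group starting with 'd' or from a size of '<DIR>'.
    """
    pattern = re.compile(line_regexp)
    files, dirs = [], []
    for line in lines:
        match = pattern.match(line.strip())
        if match is None:
            continue  # skip lines that do not fit, such as "total 12"
        fields = match.groupdict()
        perms = fields.get("perms") or ""
        is_dir = perms.startswith("d") or fields["size"] == "<DIR>"
        entry = {"name": fields["name"], "size": 0 if is_dir else int(fields["size"])}
        (dirs if is_dir else files).append(entry)
    return files, dirs
```

With this shape, a server using the MS style could be handled by passing a pattern such as `r"^\S+\s+\S+\s+(?P<size><DIR>|\d+)\s+(?P<name>.+)$"` as `line_regexp`, while every other server keeps the default.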

duboism commented 5 years ago

After thinking about it, it's a bit hard to parse FTP listings with regexps because the year is not always displayed when it is the current year. The implementation with ftputil is rather simple.
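
To make the difficulty concrete, here is a small sketch of handling the date field of a UNIX-style line, which holds either a time (year omitted) or a year (time omitted). `parse_unix_date` is a hypothetical helper. The rule "an entry without a year cannot be in the future" is a common heuristic assumed here, and it ignores time zone differences between client and server.

```python
from datetime import datetime


def parse_unix_date(text, now):
    """Parse the date field of a UNIX-style listing line.

    Recent entries look like 'Oct 27 16:46' (no year), older ones like
    'Oct 27  2015' (no time). 'now' is passed in so results are reproducible.
    """
    month, day, last = text.split()
    if ":" not in last:
        # The year is given explicitly; the time of day is unknown.
        return datetime.strptime(f"{last} {month} {day}", "%Y %b %d")
    # No year given: try the current year, then the previous one.
    for year in (now.year, now.year - 1):
        try:
            stamp = datetime.strptime(
                f"{year} {month} {day} {last}", "%Y %b %d %H:%M"
            )
        except ValueError:
            continue  # for example Feb 29 in a non-leap year
        if stamp <= now:
            return stamp
    raise ValueError(f"cannot infer the year for {text!r}")
```

A single regexp can match both shapes, but the year inference still needs logic like the above.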

See #10.

duboism commented 4 years ago

I think we can close this bug since PR #10 solves this.