genouest / biomaj-download

Download microservice for BioMAJ
GNU Affero General Public License v3.0

CurlDownload crashes if cURL doesn't support SFTP #28

Closed duboism closed 3 years ago

duboism commented 4 years ago

Some distributions (for instance Ubuntu <= 18.04) provide a version of cURL compiled without SFTP support. As a result, trying to set some SFTP-specific options (like SSH_KNOWNHOSTS) fails with error CURLE_UNKNOWN_OPTION before the download starts, even if the protocol is not SFTP.

braffes commented 3 years ago

Hi,

I think this issue is linked to this one: https://github.com/curl/curl/issues/3493

We got the same issue with CentOS 8. With the original curl package, I get this version of the library: libcurl/7.61.1 libssh/0.9.0/openssl/zlib

The temporary solution was to compile another curl with libssh2: libcurl/7.61.1 libssh2/1.9.0

It would be nice to have an option in the configuration file to deactivate the SSH_KNOWNHOSTS setting, so that the original version of curl would probably work.

Thanks for your attention.

Brice.

osallou commented 3 years ago

Will check for options, but if I am correct, this was needed for sftp. @duboism, do you confirm?

If sftp is not used, this should not be an issue. Do you confirm?

osallou commented 3 years ago

The option should indeed be set only for the sftp (and others?) protocols.

braffes commented 3 years ago

There is an issue even if sftp is not used, for example with a bank using the directhttps protocol.

2020-12-22 18:42:07,631 INFO  [root][MainThread] Workflow:wf_download:DownloadSession:bc2e5518-68ae-45ad-b296-45f70f3aef5e                                                                                 
2020-12-22 18:42:07,635 INFO  [root][MainThread] Workflow:DownloadService:CleanSession
2020-12-22 18:42:07,635 ERROR [root][MainThread] Workflow:download:Exception:(48, '')
Traceback (most recent call last):
  File "/opt/biomaj/lib64/python3.6/site-packages/biomaj/workflow.py", line 131, in start
    self.session._session['status'][flow['name']] = getattr(self, 'wf_' + flow['name'])()
  File "/opt/biomaj/lib64/python3.6/site-packages/biomaj/workflow.py", line 1314, in wf_download
    (file_list, dir_list) = downloader.list()
  File "/opt/biomaj/lib64/python3.6/site-packages/biomaj_download/download/direct.py", line 156, in list                                                                                                   
    self._network_configuration()
  File "/opt/biomaj/lib64/python3.6/site-packages/biomaj_download/download/curl.py", line 182, in _network_configuration                                                                                   
    self.crl.setopt(pycurl.SSH_KNOWNHOSTS, self.ssh_hosts_file)
pycurl.error: (48, '')
2020-12-22 18:42:07,636 ERROR [root][MainThread] Error during task download

osallou commented 3 years ago

OK, so this is a bug: we should catch the error and ignore it, or set the option only when necessary.
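
The two ideas above (set the option only when necessary, and ignore the error when the option is unknown) could be combined roughly as follows. This is only a sketch, not the actual biomaj-download fix: the helper name `set_ssh_known_hosts`, the `SSH_PROTOCOLS` set and the `curl_module` parameter are assumptions made for illustration. The value 48 is the error code shown in the traceback above, which libcurl calls CURLE_UNKNOWN_OPTION.

```python
# Hypothetical helper, not part of biomaj-download.
CURLE_UNKNOWN_OPTION = 48  # error code seen in the traceback above

# Assumption: only these protocols need a known-hosts file.
SSH_PROTOCOLS = frozenset({"sftp", "scp"})


def set_ssh_known_hosts(crl, protocol, hosts_file, curl_module):
    """Set SSH_KNOWNHOSTS on the handle `crl` only when needed and supported.

    `curl_module` is expected to be the `pycurl` module; it is passed in
    so that the sketch can also be exercised with a stand-in object.
    Returns True if the option was set, False if it was skipped.
    """
    if protocol.lower() not in SSH_PROTOCOLS or not hosts_file:
        # Not an SSH-based download: never touch the SSH option.
        return False
    try:
        crl.setopt(curl_module.SSH_KNOWNHOSTS, hosts_file)
    except curl_module.error as exc:
        # Ignore only "unknown option"; re-raise anything else.
        if exc.args and exc.args[0] == CURLE_UNKNOWN_OPTION:
            return False
        raise
    return True
```

In `_network_configuration`, the call would then look something like `set_ssh_known_hosts(self.crl, protocol, self.ssh_hosts_file, pycurl)`, where `protocol` stands for however the downloader knows its protocol. With this approach a directhttps bank never reaches `setopt` for SSH_KNOWNHOSTS, so the crash reported above should not occur.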

osallou commented 3 years ago

@duboism do you wish to handle this?

osallou commented 3 years ago

Will be fixed in biomaj-download 3.2.4.

duboism commented 3 years ago

Hello,

Sorry, I'm off for the holidays. I started to work on this last summer but I couldn't find enough time to finish. My idea was to inspect which protocols (py)curl supports at run time (because other protocols may be unsupported and then trigger errors), but this was complicated. The solution you proposed seems OK.
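
For reference, the run-time inspection mentioned above might look like the sketch below. It assumes that `pycurl.version_info()` returns a tuple whose item at index 8 is the tuple of protocol names supported by the underlying libcurl; the function names `curl_protocols` and `unsupported_protocols` are hypothetical.

```python
def curl_protocols():
    """Return the protocols supported by the installed libcurl (lower case)."""
    # Deferred import so the pure helper below works without pycurl.
    import pycurl

    # Assumption: index 8 of version_info() holds the protocol names.
    return frozenset(p.lower() for p in pycurl.version_info()[8])


def unsupported_protocols(requested, available):
    """Return the requested protocols missing from `available`, sorted."""
    wanted = {p.lower() for p in requested}
    return sorted(wanted - {p.lower() for p in available})
```

For example, `unsupported_protocols(["sftp", "https"], curl_protocols())` would list `sftp` on a cURL built without SFTP support. A check like this does not replace catching the error, since an option may still be rejected for other reasons, as the CentOS 8 report above suggests.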