ncbi / sra-tools

support proxy authentication #28

Open tolot27 opened 8 years ago

tolot27 commented 8 years ago

Currently, only proxy host and port settings can be configured but no authentication.

At least BASIC authentication should be implemented.
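
For reference, Basic proxy authentication amounts to one extra request header. The sketch below is not sra-tools code; it only illustrates the header a client would send, following RFC 7617. The function name `proxy_basic_auth_header` and the sample credentials are hypothetical.

```python
import base64


def proxy_basic_auth_header(user: str, password: str) -> tuple[str, str]:
    """Return the (name, value) pair of a Basic Proxy-Authorization header."""
    # RFC 7617: the credentials are base64("user:password"),
    # and the user name itself must not contain a colon.
    if ":" in user:
        raise ValueError("user name must not contain ':'")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return "Proxy-Authorization", f"Basic {token}"


# Hypothetical credentials, for illustration only.
name, value = proxy_basic_auth_header("Aladdin", "open sesame")
print(f"{name}: {value}")
```

Note that Basic credentials are only encoded, not encrypted, so they are readable by anyone who can observe the connection to the proxy.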

tolot27 commented 8 years ago

It would be much easier if the toolkit used the system settings from the environment (the http_proxy variable).

kwrodarmer commented 8 years ago

It's on our plate.

A word of caution about proxies - we realize that they're required in many environments, but they sometimes interfere with communications. We've seen some cases where some attention was required to configure them. In particular, they should respect range requests and should NOT try to cache SRA files.

tolot27 commented 8 years ago

Any schedule for the implementation?

BTW: Caching can be controlled with HTTP request headers (Cache-Control: no-cache and Pragma: no-cache). Most proxy servers respect these headers.
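
As a rough illustration of the two headers mentioned above, here is a minimal sketch using Python's standard `urllib.request`. It only builds the request object and does not contact the network. The helper name `no_cache_request` is hypothetical, and this is not how the toolkit itself issues requests.

```python
import urllib.request


def no_cache_request(url: str) -> urllib.request.Request:
    """Build a GET request that asks intermediaries not to serve a cached copy."""
    req = urllib.request.Request(url)
    # HTTP/1.1 directive understood by caches and proxies.
    req.add_header("Cache-Control", "no-cache")
    # HTTP/1.0 equivalent, kept for older proxies.
    req.add_header("Pragma", "no-cache")
    return req


req = no_cache_request("http://sra-download.ncbi.nlm.nih.gov/srapub/SRR390728")
# urllib stores header names capitalized ("Cache-control");
# HTTP header names are case-insensitive, so this is harmless.
print(req.header_items())
```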

kwrodarmer commented 8 years ago

We have a few projects in the middle of a release cycle right now. I don't have a date projected, but expect it in 2.6.1 or 2.6.2.

We intend to give much more attention to proxy support during the next development period of ncbi-vdb. The types of modifications you mention are trivial enough that they may make it out the door in the next release (2.6.1). If not, expect them in 2.6.2.

fnollet commented 8 years ago

We also have issues with a regular Squid proxy when using version 2.6.2.

Is there any news on this?

kwrodarmer commented 8 years ago

Yes. We were unable to schedule it for 2.6.2, largely because 2.6.2 was a release focused on sra-tools features, rather than VDB features. It is scheduled for 2.6.3, which will again introduce new features into VDB.

That said, authentication introduces more UI-related conditions, and these are not as simple for us to handle, given that many of our tools are used in automated pipelines. Sorry for the wait.

tolot27 commented 8 years ago

Any progress with this issue?

kwrodarmer commented 8 years ago

Yes, it is being worked on now. The first piece of support involves detection of proxy via environment variables - simple enough on the surface, but needs to be integrated into our configuration mechanism.

kwrodarmer commented 8 years ago

So we have integrated automatic proxy detection, and it will go out in our next release. Authentication is a relatively simple add-on after that, and will be available soon.

That said, the White House directive that all Federal websites and services will switch to HTTPS causes a wrinkle that is bound to make a lot of proxy configurations unhappy. We have been spending some time recently preparing for the switch to HTTPS, and the biggest concern is the wide variety of proxies and configurations that are out in the wild.

fnollet commented 7 years ago

Dear

I have now tested 2.8.0 and still we see major/blocking issues in downloading the data via the sratoolkit. Via wget the download is working nicely, but the tool fails.

I have some output:

[bbnof@gquest] bin $ rm -rf /data/prod/Tools/sratoolkit
[bbnof@gquest] bin $ ./test-sra SRR390728

NCBI SRA Toolkit release version: 2.8.0.
Latest available NCBI SRA Toolkit release version: 2.7.0.
Your version of SRA Toolkit is more recent than the latest available.
Linux gquest 2.6.32-642.4.2.el6.x86_64 #1 SMP Mon Aug 15 02:06:41 EDT 2016 x86_64
ascp_locate = RC(rcNS,rcFile,rcCopying,rcFile,rcNotFound)
1000m false linux64 sra-toolkit test-sra.2.8
KNSManagerInitDNSEndpoint(www.ncbi.nlm.nih.gov, 80)=RC(rcNS,rcNoTarg,rcValidating,rcConnection,rcNotFound)
SRR390728 NotFound - NotFound
Local: not found
Remote HttpFasp: http://sra-download.ncbi.nlm.nih.gov/srapub/SRR390728 193,615,465 Unknown
Cache HttpFasp: /data/prod/Tools/sratoolkit/ncbi/sra/SRR390728.sra NotFound - NotFound
Cache.cache HttpFasp: /data/prod/Tools/sratoolkit/ncbi/sra/SRR390728.sra.cache NotFound - NotFound
Remote FaspHttp: fasp://dbtest@sra-download.ncbi.nlm.nih.gov:data/sracloud/srapub/SRR390728
Cache FaspHttp: /data/prod/Tools/sratoolkit/ncbi/sra/SRR390728.sra NotFound - NotFound
Cache.cache FaspHttp: /data/prod/Tools/sratoolkit/ncbi/sra/SRR390728.sra.cache NotFound - NotFound
```
VResolverQuery(SRR390728, (null), local, remote, cache)= RC(rcNoErr)
Remote HttpFasp: http://sra-download.ncbi.nlm.nih.gov/srapub/SRR390728
Cache HttpFasp: /data/prod/Tools/sratoolkit/ncbi/sra/SRR390728.sra NotFound - NotFound
Cache.cache HttpFasp: /data/prod/Tools/sratoolkit/ncbi/sra/SRR390728.sra.cache NotFound - NotFound
VResolverQuery(SRR390728, (null), local, remote=NULL, cache=NULL)= RC(rcVFS,rcTree,rcResolving,rcPath,rcNotFound)
VResolverQuery(SRR390728, (null), local=NULL, remote, cache)= RC(rcNoErr)
Remote HttpFasp: http://sra-download.ncbi.nlm.nih.gov/srapub/SRR390728
Cache HttpFasp: /data/prod/Tools/sratoolkit/ncbi/sra/SRR390728.sra NotFound - NotFound
Cache.cache HttpFasp: /data/prod/Tools/sratoolkit/ncbi/sra/SRR390728.sra.cache NotFound - NotFound
VResolverQuery(SRR390728, FaspHttpHttps, local, remote, cache)= RC(rcNoErr)
Remote FaspHttp: fasp://dbtest@sra-download.ncbi.nlm.nih.gov:data/sracloud/srapub/SRR390728
Cache FaspHttp: /data/prod/Tools/sratoolkit/ncbi/sra/SRR390728.sra NotFound - NotFound
Cache.cache FaspHttp: /data/prod/Tools/sratoolkit/ncbi/sra/SRR390728.sra.cache NotFound - NotFound
VResolverQuery(SRR390728, FaspHttpHttps, local, remote=NULL, cache=NULL)= RC(rcVFS,rcTree,rcResolving,rcPath,rcNotFound)
VResolverQuery(SRR390728, FaspHttpHttps, local=NULL, remote, cache)= RC(rcNoErr)
Remote FaspHttp: fasp://dbtest@sra-download.ncbi.nlm.nih.gov:data/sracloud/srapub/SRR390728
Cache FaspHttp: /data/prod/Tools/sratoolkit/ncbi/sra/SRR390728.sra NotFound - NotFound
Cache.cache FaspHttp: /data/prod/Tools/sratoolkit/ncbi/sra/SRR390728.sra.cache NotFound - NotFound
```
#1.1 SRR390728||SRR390728|193615465|2012-02-15T15:41:50|87f651f8de64b3cc1f53690c329589c1||http://sra-download.ncbi.nlm.nih.gov/srapub/SRR390728|200|ok
VDBManagerOpenDBRead(SRR390728) = RC(rcNS,rcFile,rcReading,rcData,rcUnexpected)
VDBManagerOpenDBRead(SRR390728) = RC(rcNS,rcFile,rcReading,rcData,rcUnexpected)
/var/tmp/bbnof/bgr1016/sratoolkit.2.8.0-centos_linux64/bin /vdb-passwd "test-sra" "/var/tmp/bbnof/bgr1016/sratoolkit.2.8.0-centos_linux64/bin" "RELEASE" "/home/bbnof" "/home/bbnof/.ncbi" "/home/bbnof/.ncbi/user-settings.mkfg" "linux" "/var/tmp/bbnof/bgr1016/sratoolkit.2.8.0-centos_linux64/bin" "bbnof" "false" "true" "abezwiv0080.emea.agrogroup.net:8080" "64" "gquest" "/home/bbnof/.ncbi" "user-settings.mkfg" "/vdb-passwd"
"https://www.ncbi.nlm.nih.gov/Traces/names/names.cgi"
"https://www.ncbi.nlm.nih.gov/Traces/names/names.cgi"
"/home/bbnof/ncbi"
"files" "nannot" "nannot" "refseq" "sra" "wgs" "/data/prod/Tools/sratoolkit/ncbi"
"-----BEGIN CERTIFICATE-----\x0D\x0AMIIDrzCCApegAwIBAgIQCDvgVpBCRrGhdWrJWZHHSjANBgkqhkiG9w0BAQUFADBh\x0D\x0AMQswCQYDVQQGEwJVUzEVMBMGA1UEChMMRGlnaUNlcnQgSW5jMRkwFwYDVQQLExB3\x0D\x0Ad3cuZGlnaWNlcnQuY29tMSAwHgYDVQQDExdEaWdpQ2VydCBHbG9iYWwgUm9vdCBD\x0D\x0AQTAeFw0wNjExMTAwMDAwMDBaFw0zMTExMTAwMDAwMDBaMGExCzAJBgNVBAYTAlVT\x0D\x0AMRUwEwYDVQQKEwxEaWdpQ2VydCBJbmMxGTAXBgNVBAsTEHd3dy5kaWdpY2VydC5j\x0D\x0Ab20xIDAeBgNVBAMTF0RpZ2lDZXJ0IEdsb2JhbCBSb290IENBMIIBIjANBgkqhkiG\x0D\x0A9w0BAQEFAAOCAQ8AMIIBCgKCAQEA4jvhEXLeqKTTo1eqUKKPC3eQyaKl7hLOllsB\x0D\x0ACSDMAZOnTjC3U/dDxGkAV53ijSLdhwZAAIEJzs4bg7/fzTtxRuLWZscFs3YnFo97\x0D\x0Anh6Vfe63SKMI2tavegw5BmV/Sl0fvBf4q77uKNd0f3p4mVmFaG5cIzJLv07A6Fpt\x0D\x0A43C/dxC//AH2hdmoRBBYMql1GNXRor5H4idq9Joz+EkIYIvUX7Q6hL+hqkpMfT7P\x0D\x0AT19sdl6gSzeRntwi5m3OFBqOasv+zbMUZBfHWymeMr/y7vrTC0LUq7dBMtoM1O/4\x0D\x0AgdW7jVg/tRvoSSiicNoxBN33shbyTApOB6jtSj1etX+jkMOvJwIDAQABo2MwYTAO\x0D\x0ABgNVHQ8BAf8EBAMCAYYwDwYDVR0TAQH/BAUwAwEB/zAdBgNVHQ4EFgQUA95QNVbR\x0D\x0ATLtm8KPiGxvDl7I90VUwHwYDVR0jBBgwFoAUA95QNVbRTLtm8KPiGxvDl7I90VUw\x0D\x0ADQYJKoZIhvcNAQEFBQADggEBAMucN6pIExIK+t1EnE9SsPTfrgT1eXkIoyQY/Esr\x0D\x0AhMAtudXH/vTBH1jLuG2cenTnmCmrEbXjcKChzUyImZOMkXDiqw8cvpOp/2PV5Adg\x0D\x0A06O/nVsJ8dWO41P0jmP6P6fbtGbfYmbW0W5BjfIttep3Sp+dWOIrWcBAI+0tKIJF\x0D\x0APnlUkiaY4IBIqDfv8NZ5YBberOgOzW6sRBc4L0na4UU+Krk2U886UAb3LujEV0ls\x0D\x0AYSEY1QSteDwsOoBrp+uvFRTp2InBuThs4pFsiv9kuXclVzDAGySj4dzp30d8tbQk\x0D\x0ACAUw7C29C79Fv1C5qfPrmAESrciIxpg0X40KPMbp1ZWVbd4wOTAeBggrBgEFBQcD\x0D\x0ABAYIKwYBBQUHAwEGCCsGAQUFBwMDDBdEaWdpQ2VydCBHbG9iYWwgUm9vdCBDQQ==\x0D\x0A-----END CERTIFICATE-----\x0D\x0A" "-----BEGIN 
CERTIFICATE-----\x0D\x0AMIIDxTCCAq2gAwIBAgIQAqxcJmoLQJuPC3nyrkYldzANBgkqhkiG9w0BAQUFADBs\x0D\x0AMQswCQYDVQQGEwJVUzEVMBMGA1UEChMMRGlnaUNlcnQgSW5jMRkwFwYDVQQLExB3\x0D\x0Ad3cuZGlnaWNlcnQuY29tMSswKQYDVQQDEyJEaWdpQ2VydCBIaWdoIEFzc3VyYW5j\x0D\x0AZSBFViBSb290IENBMB4XDTA2MTExMDAwMDAwMFoXDTMxMTExMDAwMDAwMFowbDEL\x0D\x0AMAkGA1UEBhMCVVMxFTATBgNVBAoTDERpZ2lDZXJ0IEluYzEZMBcGA1UECxMQd3d3\x0D\x0ALmRpZ2ljZXJ0LmNvbTErMCkGA1UEAxMiRGlnaUNlcnQgSGlnaCBBc3N1cmFuY2Ug\x0D\x0ARVYgUm9vdCBDQTCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAMbM5XPm\x0D\x0A+9S75S0tMqbf5YE/yc0lSbZxKsPVlDRnogocsF9ppkCxxLeyj9CYpKlBWTrT3JTW\x0D\x0APNt0OKRKzE0lgvdKpVMSOO7zSW1xkX5jtqumX8OkhPhPYlG++MXs2ziS4wblCJEM\x0D\x0AxChBVfvLWokVfnHoNb9Ncgk9vjo4UFt3MRuNs8ckRZqnrG0AFFoEt7oT61EKmEFB\x0D\x0AIk5lYYeBQVCmeVyJ3hlKV9Uu5l0cUyx+mM0aBhakaHPQNAQTXKFx01p8VdteZOE3\x0D\x0AhzBWBOURtCmAEvF5OYiiAhF8J2a3iLd48soKqDirCmTCv2ZdlYTBoSUeh10aUAsg\x0D\x0AEsxBu24LUTi4S8sCAwEAAaNjMGEwDgYDVR0PAQH/BAQDAgGGMA8GA1UdEwEB/wQF\x0D\x0AMAMBAf8wHQYDVR0OBBYEFLE+w2kD+L9HAdSYJhoIAu9jZCvDMB8GA1UdIwQYMBaA\x0D\x0AFLE+w2kD+L9HAdSYJhoIAu9jZCvDMA0GCSqGSIb3DQEBBQUAA4IBAQAcGgaX3Nec\x0D\x0AnzyIZgYIVyHbIUf4KmeqvxgydkAQV8GK83rZEWWONfqe/EW1ntlMMUu4kehDLI6z\x0D\x0AeM7b41N5cdblIZQB2lWHmiRk9opmzN6cN82oNLFpmyPInngiK3BD41VHMWEZ71jF\x0D\x0AhS9OMPagMRYjyOfiZRYzy78aG6A9+MpeizGLYAiJLQwGXFK3xPkKmNEVX58Svnw2\x0D\x0AYzi9RKR/5CYrCsSXaQ3pjOLAEFe4yHYSkVXySGnYvCoCWw9E1CAx2/S6cCZdkGCe\x0D\x0AvEsXCS+0yx5DaMkHJ8HSXPfqIbloEpw8nL+e/IBcm2PN7EeqJSdnoDfzAIJ9VNep\x0D\x0A+OkuE6N36B9KMEQwHgYIKwYBBQUHAwQGCCsGAQUFBwMBBggrBgEFBQcDAwwiRGln\x0D\x0AaUNlcnQgSGlnaCBBc3N1cmFuY2UgRVYgUm9vdCBDQQ==\x0D\x0A-----END CERTIFICATE-----\x0D\x0A" "1000m" "/var/tmp/bbnof/bgr1016/sratoolkit.2.8.0-centos_linux64/bin"

A wget works:

[bbnof@gquest] sratoolkit.2.8.0-centos_linux64 $ wget http://sra-download.ncbi.nlm.nih.gov/srapub/SRR390728
--2016-10-10 13:38:22-- http://sra-download.ncbi.nlm.nih.gov/srapub/SRR390728
Resolving squidproxcl1.be.bayercropscience... 10.3.186.91
Connecting to squidproxcl1.be.bayercropscience|10.3.186.91|:8080... connected.
Proxy request sent, awaiting response... 200 OK
Length: 193615465 (185M) [application/octet-stream]
Saving to: “SRR390728”

100%[=============================================================================================================================>] 193,615,465 713K/s in 4m 26s

2016-10-10 13:42:48 (712 KB/s) - “SRR390728” saved [193615465/193615465]

[bbnof@gquest] sratoolkit.2.8.0-centos_linux64 $

Hope this output helps you in detecting what could be going wrong at our end.

Regards, Filip Nollet

kwrodarmer commented 7 years ago

We are off today, but will try to analyze the output you've given to see if anything pops out. We still have to put together a description of how proxies are to be supported on the wiki.

The problem with proxies is that they can be configured in a number of ways, and there are a number of proxy vendors and versions with different levels of support for http and now https. We are unlikely to be able to promise support for every version of every proxy in every configuration, so we're going to try instead to describe what we know we support.

We didn't yet add basic authentication, but we have added detection of environment variables. One thing you could try to see if it changes behavior is to go to vdb-config and indicate that you want to prefer settings in our configuration over the environment variables. The proxy settings should NOT contain "http://" but should be <hostname>:<port>. Give that a try and see if it changes anything. I see that you put "http://" into the environment variables, and that might throw things off.
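
To make the `<hostname>:<port>` suggestion concrete, here is a small sketch that strips a scheme such as "http://" from a proxy specification before it is entered in vdb-config. It is a hypothetical helper, not part of sra-tools, and the example host name is made up.

```python
from urllib.parse import urlsplit


def to_host_port(spec: str) -> str:
    """Reduce 'http://host:port/' or 'host:port' to plain 'host:port'."""
    # urlsplit only recognizes the host when a scheme or a leading '//' is present.
    parts = urlsplit(spec if "://" in spec else "//" + spec)
    if not parts.hostname or parts.port is None:
        raise ValueError(f"expected <hostname>:<port>, got {spec!r}")
    # Any user:password@ prefix and any path are dropped on purpose.
    return f"{parts.hostname}:{parts.port}"


# Hypothetical proxy address, for illustration only.
print(to_host_port("http://proxy.example.org:8080"))
print(to_host_port("proxy.example.org:8080"))
```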

psddia commented 6 years ago

Any guidance on how to set up a proxy on a cluster server? Should the proxy setting still be hostname:port, or does it now allow http or https?

kwrodarmer commented 6 years ago

We accept proxy specifications with scheme, if that is what you are referring to.

We read the environment variables https_proxy, HTTPS_PROXY, all_proxy, ALL_PROXY, http_proxy, and HTTP_PROXY.

We also read the configuration nodes /http/proxy/path and /http/proxy/enabled.

The configuration node /http/proxy/use accepts five values that govern how these inputs are interpreted:

  1. "env" - prefer environment variable over configuration
  2. "kfg" - prefer configuration variable over environment
  3. "none" - disable proxy even when environment variable is set
  4. "env,kfg" - try to use both, giving preference to environment
  5. "kfg,env" - try to use both, giving preference to configuration
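
One way to read the five values is as an ordered list of sources to consult. The sketch below models that reading in Python. It is not the ncbi-vdb implementation. It assumes that "env" and "kfg" alone consult only the named source, and that the environment variables are checked in the order listed in the comment above; neither point is confirmed in this thread. The function `proxy_candidates` and its parameters are hypothetical.

```python
import os

# Variable names as listed in the comment; the lookup order is an assumption.
ENV_VARS = ("https_proxy", "HTTPS_PROXY", "all_proxy", "ALL_PROXY",
            "http_proxy", "HTTP_PROXY")
ALLOWED = {"env", "kfg", "none", "env,kfg", "kfg,env"}


def proxy_candidates(use, kfg_path=None, kfg_enabled=True, environ=None):
    """Return proxy specifications to try, most preferred first."""
    if use not in ALLOWED:
        raise ValueError(f"unsupported /http/proxy/use value: {use!r}")
    if use == "none":
        return []  # proxy disabled even if an environment variable is set
    environ = os.environ if environ is None else environ
    # First non-empty environment variable, if any.
    env_value = next((environ[v] for v in ENV_VARS if environ.get(v)), None)
    # Stand-ins for /http/proxy/path and /http/proxy/enabled.
    kfg_value = kfg_path if (kfg_enabled and kfg_path) else None
    sources = {"env": env_value, "kfg": kfg_value}
    return [sources[name] for name in use.split(",") if sources[name]]


env = {"http_proxy": "envhost:3128"}  # hypothetical environment
print(proxy_candidates("env,kfg", "kfghost:8080", environ=env))
print(proxy_candidates("kfg,env", "kfghost:8080", environ=env))
```

Passing `environ` explicitly keeps the function easy to test without touching the real process environment.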

psddia commented 6 years ago

I'm on the cluster and I can get the example to download, but I can't get my restricted access file to download. See below.

bash-4.1$ fastq-dump -X 5 -Z SRR390728
Read 5 spots for SRR390728
Written 5 spots for SRR390728
@SRR390728.1 1 length=72
CATTCTTCACGTAGTTCTCGAGCCTTGGTTTTCAGCGATGGAGAATGACTTTGACAAGCTGAGAGAAGNTNC
+SRR390728.1 1 length=72
;;;;;;;;;;;;;;;;;;;;;;;;;;;9;;665142;;;;;;;;;;;;;;;;;;;;;;;;;;;;;96&&&&(
@SRR390728.2 2 length=72
AAGTAGGTCTCGTCTGTGTTTTCTACGAGCTTGTGTTCCAGCTGACCCACTCCCTGGGTGGGGGGACTGGGT
+SRR390728.2 2 length=72
;;;;;;;;;;;;;;;;;4;;;;3;393.1+4&&5&&;;;;;;;;;;;;;;;;;;;;;<9;<;;;;;464262
@SRR390728.3 3 length=72
CCAGCCTGGCCAACAGAGTGTTACCCCGTTTTTACTTATTTATTATTATTATTTTGAGACAGAGCATTGGTC
+SRR390728.3 3 length=72
-;;;8;;;;;;;,;;';-4,44;,:&,1,4'./&19;;;;;;669;;99;;;;;-;3;2;0;+;7442&2/
@SRR390728.4 4 length=72
ATAAAATCAGGGGTGTTGGAGATGGGATGCCTATTTCTGCACACCTTGGCCTCCCAAATTGCTGGGATTACA
+SRR390728.4 4 length=72
1;;;;;;,;;4;3;38;8%&,,;);1;;,)/%4+,;1;;);;;;;;;4;(;1;;;;24;;;;41-444//0
@SRR390728.5 5 length=72
TTAAGAAATTTTTGCTCAAACCATGCCCTAAAGGGTTCTGTAATAAATAGGGCTGGGAAAACTGGCAAGCCA
+SRR390728.5 5 length=72
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;9445552;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;446662
bash-4.1$ fastq-dump --outdir fasta --gzip --skip-technical --readids --read-filter pass --dumpbase --split-3 --clip SRR4351657
2017-11-15T19:09:30 fastq-dump.2.8.2 err: query unauthorized while resolving tree within virtual file system module - failed to resolve accession 'SRR4351657' - Access denied - please request permission to access phs000473/GRU in dbGaP ( 403 )
2017-11-15T19:09:30 fastq-dump.2.8.2 err: item not found while constructing within virtual database module - the path 'SRR4351657' cannot be opened as database or table

psddia commented 6 years ago

I've already loaded my key file, and I even tried to run the fastq-dump command from my dbGaP directory as follows.

cd /users/zadafn8g/ncbi/dbGaP-15766/2.8.2/bin

But no luck getting my file to download. Any ideas?

kwrodarmer commented 6 years ago

The question is a bit too open-ended to answer. Please read the documentation on accessing dbGaP data at https://github.com/ncbi/sra-tools/wiki/First-help-on-decryption-dbGaP-data so that you understand how the workspace concept and configuration work. This setup must exist on your cluster: you should be running under your own account, with access to your home directory so that the configuration information can be found, and the tools must be run from a directory inside the workspace you configured.

psddia commented 6 years ago

I've read through it all and have successfully downloaded files on my local computer. On the cluster, I can't get it to work. I even cd to my dbGaP directory first, as follows:

bash-4.1$ cd /users/zadafn8g/ncbi/dbGaP-15766
bash-4.1$ fastq-dump -X 5 -Z SRR722309
2017-11-15T19:37:06 fastq-dump.2.8.2 err: query unauthorized while resolving tree within virtual file system module - failed to resolve accession 'SRR722309' - Access denied - please request permission to access phs000473/GRU in dbGaP ( 403 )
2017-11-15T19:37:06 fastq-dump.2.8.2 err: item not found while constructing within virtual database module - the path 'SRR722309' cannot be opened as database or table

Am I doing something wrong?

tolot27 commented 6 years ago

Access denied - please request permission to access phs000473/GRU in dbGaP ( 403 )

Just in case: the 403 is an HTTP status code, and it is not related to proxy authentication. Code 407 would indicate proxy authentication requirements or errors. Hence, I recommend opening a new issue for your problem.
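
The distinction between the two codes can be checked against Python's standard `http.HTTPStatus` table. The helper below is a hypothetical illustration, not toolkit code, and its messages are only a rough guide to where to look first.

```python
from http import HTTPStatus


def describe_denial(status: int) -> str:
    """Give a rough hint about which layer refused the request."""
    if status == HTTPStatus.PROXY_AUTHENTICATION_REQUIRED:  # 407
        return "proxy wants credentials (Proxy-Authorization header)"
    if status == HTTPStatus.FORBIDDEN:  # 403
        return "request refused for lack of permission; not a proxy credentials challenge"
    return f"other status: {HTTPStatus(status).phrase}"


print(describe_denial(403))
print(describe_denial(407))
```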

kwrodarmer commented 6 years ago

@tolot27 - this is a good recommendation. @psddia - please open a new issue or write to us at sra-tools@ncbi.nlm.nih.gov and we can take this offline.

tolot27 commented 4 years ago

When do you plan to support basic authentication?

Yogesh1-11 commented 1 year ago

fastq-dump -X 5 -Z SRR390728
2023-08-12T06:08:51 fastq-dump.2.8.0 sys: connection failed while opening file within cryptographic module - mbedtls_ssl_handshake returned -9984 ( X509 - Certificate verification failed, e.g. CRL, CA or signature check failed )
2023-08-12T06:08:51 fastq-dump.2.8.0 sys: mbedtls_ssl_get_verify_result returned 0x8 ( !! The certificate is not correctly signed by the trusted CA )
2023-08-12T06:08:51 fastq-dump.2.8.0 err: item not found while constructing within virtual database module - the path 'SRR390728' cannot be opened as database or table