Open tolot27 opened 8 years ago
It would be much easier if the toolkit used the system proxy settings from the environment (the http_proxy variable).
It's on our plate.
A word of caution about proxies - we realize that they're required in many environments, but they sometimes interfere with communications. We've seen some cases where some attention was required to configure them. In particular, they should respect range requests and should NOT try to cache SRA files.
Any schedule for the implementation?
BTW: Caching can be controlled with HTTP request settings (Cache-Control: no-cache and Pragma: no-cache). These settings will be respected by most proxy servers.
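For illustration, here is a minimal Python sketch of a request that carries those two headers together with a range request of the kind mentioned above. It only builds the request and does not send it. The URL and byte range are hypothetical, and this is not how the toolkit itself issues requests.

```python
import urllib.request


def build_uncached_range_request(url, first_byte, last_byte):
    """Build (but do not send) a GET request for one byte range.

    Cache-Control and Pragma ask intermediaries such as proxies not to
    answer from a cached copy; Range asks for only part of the file.
    """
    headers = {
        "Cache-Control": "no-cache",
        "Pragma": "no-cache",
        "Range": f"bytes={first_byte}-{last_byte}",
    }
    return urllib.request.Request(url, headers=headers)


# Hypothetical URL and range, used only to show the resulting headers.
request = build_uncached_range_request("http://example.org/data.sra", 0, 1023)
for name, value in request.header_items():
    print(f"{name}: {value}")
```

A proxy that honors range requests should answer such a request with a partial response rather than the whole file.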
We have a few projects in the middle of a release cycle right now. I don't have a date projected, but expect it in 2.6.1 or 2.6.2.
We intend to give much more attention to proxy support during the next development period of ncbi-vdb. The types of modifications you mention are trivial enough that they may make it out the door in the next release (2.6.1). If not, expect them in 2.6.2.
We also have issues with a regular Squid proxy when using version 2.6.2.
Is there any news on this?
Yes. We were unable to schedule it for 2.6.2, largely because 2.6.2 was a release focused on sra-tools features, rather than VDB features. It is scheduled for 2.6.3, which will again introduce new features into VDB.
That said, authentication introduces more UI-related conditions, and these are not as simple for us to handle, given that many of our tools are used in automated pipelines. Sorry for the wait.
Any progress with this issue?
Yes, it is being worked on now. The first piece of support involves detection of proxy via environment variables - simple enough on the surface, but needs to be integrated into our configuration mechanism.
So we have integrated automatic proxy detection, and it will go out in our next release. Authentication is a relatively simple add-on after that, and will be available soon.
That said, the White House directive that all Federal websites and services will switch to HTTPS causes a wrinkle that is bound to make a lot of proxy configurations unhappy. We have been spending some time recently preparing for the switch to HTTPS, and the biggest concern is the wide variety of proxies and configurations that are out in the wild.
Dear
I have now tested 2.8.0, and we still see major, blocking issues when downloading data with the sratoolkit. The download works fine via wget, but the tool fails.
I have some output:
[bbnof@gquest] bin $ rm -rf /data/prod/Tools/sratoolkit
[bbnof@gquest] bin $ ./test-sra SRR390728
A wget works:
[bbnof@gquest] sratoolkit.2.8.0-centos_linux64 $ wget http://sra-download.ncbi.nlm.nih.gov/srapub/SRR390728
--2016-10-10 13:38:22-- http://sra-download.ncbi.nlm.nih.gov/srapub/SRR390728
Resolving squidproxcl1.be.bayercropscience... 10.3.186.91
Connecting to squidproxcl1.be.bayercropscience|10.3.186.91|:8080... connected.
Proxy request sent, awaiting response... 200 OK
Length: 193615465 (185M) [application/octet-stream]
Saving to: “SRR390728”
100%[=============================================================================================================================>] 193,615,465 713K/s in 4m 26s
2016-10-10 13:42:48 (712 KB/s) - “SRR390728” saved [193615465/193615465]
[bbnof@gquest] sratoolkit.2.8.0-centos_linux64 $
Hope this output helps you in detecting what could be going wrong at our end.
Regards, Filip Nollet
We are off today, but will try to analyze the output you've given to see if anything pops out. We still have to put together a description of how proxies are to be supported on the wiki.
The problem with proxies is that they can be configured in a number of ways, and there are a number of proxy vendors and versions with different levels of support for http and now https. We are unlikely to be able to promise support for every version of every proxy in every configuration, so we're going to try instead to describe what we know we support.
We didn't yet add basic authentication, but we have added detection of environment variables. One thing you could try to see if it changes behavior is to go to vdb-config and indicate that you want to prefer settings in our configuration over the environment variables. The proxy settings should NOT contain "http://" but should be <hostname>:<port>. Give that a try and see if it changes anything. I see that you put "http://" into the environment variables, and that might throw things off.
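As a rough illustration of the <hostname>:<port> form described above, the following Python sketch strips a leading scheme from a proxy specification. The function name and the example host are hypothetical; this is not part of the toolkit or of vdb-config.

```python
from urllib.parse import urlsplit


def to_host_port(proxy_spec):
    """Reduce 'http://host:port/' or 'host:port' to plain 'host:port'."""
    spec = proxy_spec.strip()
    # Without a scheme, urlsplit would not treat the text as a network
    # location, so prefix '//' to make it parse the host and port.
    if "://" not in spec:
        spec = "//" + spec
    parts = urlsplit(spec)
    if parts.hostname is None or parts.port is None:
        raise ValueError(f"expected <hostname>:<port>, got {proxy_spec!r}")
    return f"{parts.hostname}:{parts.port}"


# Hypothetical proxy host, used only for demonstration.
print(to_host_port("http://proxy.example.org:8080"))
print(to_host_port("proxy.example.org:8080"))
```

Both calls yield the same scheme-free value, which is the form suggested for the configuration setting.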
Any guidance on how to set up a proxy on a cluster server? Should the proxy setting still be hostname:port, or does it now allow http or https?
We accept proxy specifications with scheme, if that is what you are referring to.
We read the environment variables
https_proxy
HTTPS_PROXY
all_proxy
ALL_PROXY
http_proxy
HTTP_PROXY
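To make the lookup concrete, here is a small Python sketch that checks those variable names and returns the first one that is set. Treating the order of the list as a precedence order is an assumption made for this sketch; the list above does not say which variable wins, and the function name is hypothetical.

```python
import os

# Names as listed above. Using the listing order as precedence is an
# assumption of this sketch, not documented toolkit behavior.
PROXY_ENV_VARS = [
    "https_proxy",
    "HTTPS_PROXY",
    "all_proxy",
    "ALL_PROXY",
    "http_proxy",
    "HTTP_PROXY",
]


def find_proxy_from_env(environ=None):
    """Return (variable_name, value) for the first non-empty proxy variable."""
    env = os.environ if environ is None else environ
    for name in PROXY_ENV_VARS:
        value = env.get(name, "").strip()
        if value:
            return name, value
    return None


print(find_proxy_from_env())
```

Passing a dictionary instead of the real environment makes it easy to check what a given cluster setup would resolve to.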
We also read the configuration nodes
/http/proxy/path
/http/proxy/enabled
The configuration node:
/http/proxy/use
allows 5 combinations of values that govern interpretation of inputs:
I'm on the cluster and I can get the example to download, but I can't get my restricted access file to download. See below.
bash-4.1$ fastq-dump -X 5 -Z SRR390728
Read 5 spots for SRR390728
Written 5 spots for SRR390728
@SRR390728.1 1 length=72
CATTCTTCACGTAGTTCTCGAGCCTTGGTTTTCAGCGATGGAGAATGACTTTGACAAGCTGAGAGAAGNTNC
+SRR390728.1 1 length=72
;;;;;;;;;;;;;;;;;;;;;;;;;;;9;;665142;;;;;;;;;;;;;;;;;;;;;;;;;;;;;96&&&&(
@SRR390728.2 2 length=72
AAGTAGGTCTCGTCTGTGTTTTCTACGAGCTTGTGTTCCAGCTGACCCACTCCCTGGGTGGGGGGACTGGGT
+SRR390728.2 2 length=72
;;;;;;;;;;;;;;;;;4;;;;3;393.1+4&&5&&;;;;;;;;;;;;;;;;;;;;;<9;<;;;;;464262
@SRR390728.3 3 length=72
CCAGCCTGGCCAACAGAGTGTTACCCCGTTTTTACTTATTTATTATTATTATTTTGAGACAGAGCATTGGTC
+SRR390728.3 3 length=72
-;;;8;;;;;;;,;;';-4,44;,:&,1,4'./&19;;;;;;669;;99;;;;;-;3;2;0;+;7442&2/
@SRR390728.4 4 length=72
ATAAAATCAGGGGTGTTGGAGATGGGATGCCTATTTCTGCACACCTTGGCCTCCCAAATTGCTGGGATTACA
+SRR390728.4 4 length=72
1;;;;;;,;;4;3;38;8%&,,;);1;;,)/%4+,;1;;);;;;;;;4;(;1;;;;24;;;;41-444//0
@SRR390728.5 5 length=72
TTAAGAAATTTTTGCTCAAACCATGCCCTAAAGGGTTCTGTAATAAATAGGGCTGGGAAAACTGGCAAGCCA
+SRR390728.5 5 length=72
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;9445552;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;446662
bash-4.1$ fastq-dump --outdir fasta --gzip --skip-technical --readids --read-filter pass --dumpbase --split-3 --clip SRR4351657
2017-11-15T19:09:30 fastq-dump.2.8.2 err: query unauthorized while resolving tree within virtual file system module - failed to resolve accession 'SRR4351657' - Access denied - please request permission to access phs000473/GRU in dbGaP ( 403 )
2017-11-15T19:09:30 fastq-dump.2.8.2 err: item not found while constructing within virtual database module - the path 'SRR4351657' cannot be opened as database or table
I've already loaded my key file, and I even tried to run the fastq-dump command from my dbGaP directory as follows.
cd /users/zadafn8g/ncbi/dbGaP-15766/2.8.2/bin
But no luck getting my file to download. Any ideas?
The question is a bit too open-ended to answer. Please read the documentation on accessing dbGaP data at https://github.com/ncbi/sra-tools/wiki/First-help-on-decryption-dbGaP-data and make sure you understand how the workspace concept and configuration work. This setup must exist on your cluster, which is to say you should be running under your own account with access to your home directory in order to obtain the configuration information, and the tools must be run within a directory that is within the workspace you configured.
I've read through it all, and have successfully downloaded files on my local computer. On the cluster, I can't get it to work. I even tell it to cd to my dbGaP directory as follows:
bash-4.1$ cd /users/zadafn8g/ncbi/dbGaP-15766
bash-4.1$ fastq-dump -X 5 -Z SRR722309
2017-11-15T19:37:06 fastq-dump.2.8.2 err: query unauthorized while resolving tree within virtual file system module - failed to resolve accession 'SRR722309' - Access denied - please request permission to access phs000473/GRU in dbGaP ( 403 )
2017-11-15T19:37:06 fastq-dump.2.8.2 err: item not found while constructing within virtual database module - the path 'SRR722309' cannot be opened as database or table
Am I doing something wrong?
Access denied - please request permission to access phs000473/GRU in dbGaP ( 403 )
Just in case: the 403 is an HTTP status code and is not related to proxy authentication. Code 407 would indicate proxy authentication requirements or errors. Hence, I recommend opening a new issue for your problem.
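The distinction can be sketched in a few lines of Python using the standard library's status constants. The function name and the wording of the messages are made up for illustration and do not correspond to anything in the toolkit.

```python
from http import HTTPStatus


def describe_status(code):
    """Tell a proxy authentication challenge apart from an access denial."""
    if code == HTTPStatus.PROXY_AUTHENTICATION_REQUIRED:  # 407
        return "proxy authentication required"
    if code == HTTPStatus.FORBIDDEN:  # 403
        return "access forbidden; not a proxy authentication challenge"
    return "other status"


for code in (403, 407):
    print(code, describe_status(code))
```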
@tolot27 - this is a good recommendation. @psddia - please open a new issue or write to us at sra-tools@ncbi.nlm.nih.gov and we can take this offline.
When do you plan to support basic authentication?
fastq-dump -X 5 -Z SRR390728
2023-08-12T06:08:51 fastq-dump.2.8.0 sys: connection failed while opening file within cryptographic module - mbedtls_ssl_handshake returned -9984 ( X509 - Certificate verification failed, e.g. CRL, CA or signature check failed )
2023-08-12T06:08:51 fastq-dump.2.8.0 sys: mbedtls_ssl_get_verify_result returned 0x8 ( !! The certificate is not correctly signed by the trusted CA )
2023-08-12T06:08:51 fastq-dump.2.8.0 err: item not found while constructing within virtual database module - the path 'SRR390728' cannot be opened as database or table
Currently, only proxy host and port settings can be configured but no authentication.
At least BASIC authentication should be implemented.
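For reference, this is roughly what Basic proxy authentication amounts to on the wire: a Proxy-Authorization header whose value is the word Basic followed by the base64 encoding of user:password. The Python sketch below uses placeholder credentials and a hypothetical helper name; it shows the generic HTTP mechanism, not an existing toolkit feature.

```python
import base64


def basic_proxy_authorization(username, password):
    """Build the value of a Proxy-Authorization header for Basic auth."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Basic {token}"


# Placeholder credentials, for demonstration only.
headers = {"Proxy-Authorization": basic_proxy_authorization("user", "pass")}
print(headers)
```

Because base64 is an encoding and not encryption, credentials sent this way are only protected if the connection to the proxy is itself protected.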