microsoft / AzureStorageExplorer

Easily manage the contents of your storage account with Azure Storage Explorer. Upload, download, and manage blobs, files, queues, tables, and Cosmos DB entities. Gain easy access to manage your virtual machine disks. Work with either Azure Resource Manager or classic storage accounts, plus manage and configure cross-origin resource sharing (CORS) rules.
Creative Commons Attribution 4.0 International

Unable to download any file from a Data Lake Storage Gen1 #4430

Closed JeanBaptisteScellier closed 3 years ago

JeanBaptisteScellier commented 3 years ago

Preflight Checklist

Storage Explorer Version

1.19.1

Platform

Windows

OS Version

Windows 10 Enterprise

Architecture

i86

Build Number

20210425.1

Bug Description

I'm unable to download any file from a Data Lake Storage Gen1.

Steps to Reproduce

  1. Launch Azure Storage Explorer
  2. Use proxy settings from environment variables with http_proxy set to http://username@hostname:port
  3. Open a data lake storage gen1 in the explorer
  4. Select a file
  5. Click on download

Actual Experience

I tried to download a file from an ADLS Gen1 account, even one as small as 6 kB. When I do, the software retries the download several times and the activity bar then shows "Downloading 'filelocation/filename': Retry Attempt 5 out of 5" indefinitely.

Expected Experience

I expect the file to be downloaded.

Additional Context

I can download files from Storage accounts without any problem.

I am using a proxy. It is defined in my environment variables: http_proxy and https_proxy are set correctly. I'm sure of it because I use pip and conda with this configuration. I added the SSL certificates to a file that contains all the certificates I usually need.
However, when I launch the application, I get this message: "Unable to establish a connection using the current proxy settings. Check the current settings by opening the proxy dialog in the toolbar."

The logs I get are: 2021-05-07_183328_ActivityProvider_7716.log:
[2021-05-07T16:34:47.368Z] (ActivityProvider_7716) <INFO> Activity added {
  title: "Queued Download: 'CIDI/REFINERY4.0/STAGING/PI/RC/DGS/SSP/2020/06/01/PI-D1'",
  message: '',
  speed: undefined,
  progress: 0,
  status: 0,
  actions: []
}

and: 
[2021-05-07T16:33:31.695Z] (renderer) <INFO> Proxy configuration source set to: environmentVariable
[2021-05-07T16:33:31.697Z] (renderer) <INFO> Proxy configuration set to: {
  protocol: 'http:',
  hostname: 'XX.XX.XXX.XX',
  port: YYYY,
  useCredentials: false,
  credential: '***'
}

The last log is the most interesting: the parameter useCredentials should be set to true, because my http_proxy variable is 'http://username@XX.XX.XXX.XX:YYYY', with no password. So I tried to set the proxy manually instead of using environment variables, but Azure Storage Explorer does not accept a username without a password in its proxy settings.

MRayermannMSFT commented 3 years ago

@JeanBaptisteScellier you said this was a regression from "1". Can you clarify what you mean by that? When something is a regression, that means it worked in a previous version. So, if you are able to go download a previous version and what you were trying to do works, then this would be a regression?

Can you also clarify what you mean by "I can download files from Storage accounts without any problem."?

JeanBaptisteScellier commented 3 years ago

@MRayermannMSFT Sorry I removed the regression parameter. It looked mandatory at first to me. I don't know how to clarify that: I'm able to download files located on my Storage accounts. What do you want to know?

MRayermannMSFT commented 3 years ago

@JeanBaptisteScellier it looks like we require that you also have a password for your proxy credentials.

We'll need to fix this assuming you actually don't have a password for your proxy.
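For illustration only, here is a minimal Python sketch of how the "username but no password" case could be read from the environment variable value. It is not Storage Explorer's actual code; the function name `parse_proxy_url` and the host `proxy.example.com` are made up for the example, and the field names simply mirror the log output quoted earlier in this thread.

```python
from urllib.parse import urlsplit


def parse_proxy_url(proxy_url):
    """Split a proxy URL into fields similar to those in the log (hypothetical helper)."""
    parts = urlsplit(proxy_url)
    return {
        "protocol": parts.scheme + ":",
        "hostname": parts.hostname,
        "port": parts.port,
        # Assumption: a username alone is enough to mean "use credentials".
        "useCredentials": parts.username is not None,
        "username": parts.username,
        # None when the URL has no ":password" part.
        "password": parts.password,
    }


# Hypothetical value in the same shape as the http_proxy variable in this issue.
settings = parse_proxy_url("http://username@proxy.example.com:8080")
```

With this approach a missing password shows up as `None` rather than being treated as missing credentials, so the caller can still decide to send the username.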

JeanBaptisteScellier commented 3 years ago

Thank you for your answer @MRayermannMSFT. I don't have a password; the proxy is configured without one. I tried your workaround but it fails.

MRayermannMSFT commented 3 years ago

Ok, we'll see if we can fit in some work to unblock you in our next release. 🤞

MRayermannMSFT commented 3 years ago

@craxal one of the two of us should try to deal with this!

MRayermannMSFT commented 3 years ago

@JeanBaptisteScellier I have a private build for you to try here: https://storageexplorerpublish.blob.core.windows.net/privatebuilds/4430/StorageExplorer-ia32.exe

That build should hopefully work with your "no password" version of the environment variable.

Given this is a private/pre-release build, please only use it for testing the fix for this bug. That is, see if:

  1. there is no longer an error on startup regarding your proxy
  2. you can now download files from your data lake

Assuming it works, then I recommend you go back to version 1.19 of Storage Explorer. I think I can help you come up with a patch for it as well if you are willing to edit a file or two. :)

Thanks!

JeanBaptisteScellier commented 3 years ago

Thank you for the private build @MRayermannMSFT. It is not working though: the download retries endlessly. There is no longer a proxy message at startup, and there are some changes in the backend. But it is still impossible to set a manual proxy configuration in Azure Storage Explorer with a username but no password (the "OK" button is not enabled when the password is missing). So I tried with environment variables, with http_proxy set to http://ZZZZZZZZ@XX.XX.XXX.XX:YYYY. Storage Explorer correctly detected that there is no password, because in one of the logs I see:

ProxySettings: {"host":"http://XX.XX.XXX.XX","port":YYYY,"username":"ZZZZZZZZ","password":""}

Here are some interesting things from main.log. First, about the proxy configuration:

[2021-06-10T11:52:10.762Z] ( main) <INFO> Proxy configuration set to: {  
  protocol: 'http:',
  hostname: 'XX.XX.XXX.XX',
  port: YYYY,
  useCredentials: true,
  credential: '***'
}

But the log shows an error:

[2021-06-10T11:52:15.385Z] ( main) <INFO> Extensions loaded
[2021-06-10T11:52:16.519Z] ( main) <INFO> Initializing user accounts manager AAD providers...
[2021-06-10T11:52:17.058Z] ( main) <ERRO> UNHANDLED PROMISE REJECTION in main process:
TypeError: Cannot read property 'ErrorProxyTestFailed' of undefined

MRayermannMSFT commented 3 years ago

@JeanBaptisteScellier

But the log shows an error:

Oops, I thought this was fixed by someone else's change but I just realized they haven't finished that work yet. I'll go ahead and work that in to this private build.

But it is still impossible to set a manual proxy configuration in Azure Storage Explorer with a username but no password

Oh I didn't realize you wanted this. I was focusing on the environment variable path since that is what you have been using.

Ok here is a build which:

When evaluating the build, please try to test each of the following options for sourcing your proxy settings:

(1) This option is not supported by all features. At this time, it is mainly supported by blob and queue features. I don't have a full list to share with you, but if it seems like your work is limited to supported features then maybe this is a good option for you.
(2) Depending on your OS settings, you may still want to "Enable Credentials" and specify your username in the proxy dialog.

Ok, I'm hoping for good things. Here's the link to the build: https://storageexplorerpublish.blob.core.windows.net/privatebuilds/4430-2/StorageExplorer-ia32.exe

JeanBaptisteScellier commented 3 years ago

Hello @MRayermannMSFT. You fixed the error that appeared in main.log. However, I again get the message "Unable to establish a connection using the current proxy settings. Check the current settings by opening the proxy dialog in the toolbar." Thank you for allowing me to use App settings.

I tried the three options:

To help, I would like to ask how you handle the absence of a password. Do you just set it to an empty string? I think that would not be enough. A typical proxy address with credentials is http://username:password@proxy.thing.com:8080, but without a password there is no colon: http://username@proxy.thing.com:8080. Maybe you already handle this, but I want to check.

MRayermannMSFT commented 3 years ago

To help, I would like to ask how you handle the absence of a password. Do you just set it to an empty string? I think that would not be enough. A typical proxy address with credentials is http://username:password@proxy.thing.com:8080, but without a password there is no colon: http://username@proxy.thing.com:8080. Maybe you already handle this, but I want to check.

We're still including the colon. Getting rid of the colon is a much bigger change than what I've done. I was hoping this would work without having to do that/I was hoping that proxy servers would be smarter...

So I guess we'll have to modify all the areas we construct the proxy URL to get rid of the colon. I'll have a new build for you later today.
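To make the difference concrete, here is a small Python sketch of the two header variants being discussed. It is only an illustration, not the code Storage Explorer uses; `proxy_authorization` is a hypothetical helper. The Basic scheme normally joins username and password with a colon, so the colon-less form is the non-standard variant that this particular proxy appears to expect.

```python
import base64


def proxy_authorization(username, password=None):
    """Build a Basic Proxy-Authorization value (hypothetical helper).

    password=None drops the colon entirely; password="" keeps it.
    """
    if password is None:
        userinfo = username
    else:
        userinfo = f"{username}:{password}"
    token = base64.b64encode(userinfo.encode("utf-8")).decode("ascii")
    return f"Basic {token}"


# Empty password: the encoded value is "username:" (colon kept).
with_colon = proxy_authorization("username", "")
# No password at all: the encoded value is just "username".
without_colon = proxy_authorization("username")
```

The two values differ after Base64 encoding, so a proxy that compares the decoded string against a bare username would reject the first and accept the second.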

MRayermannMSFT commented 3 years ago

Hey @JeanBaptisteScellier, first, thanks for sticking with me while we work through this together!

Ok, I have a build which removes the : from the proxy auth header in as many places as we can control...

The good news is that it looks like every code path for Gen1 is good to go. I checked with Fiddler, and I don't see any requests that include the :.

The bad news is that the code paths for a lot of non-Gen1 stuff are not good to go. This mainly means blob, queue, and AzCopy features (thankfully AzCopy is not used for Gen1 uploads and downloads). These "not good to go" areas either:

🙁

I've opened some GitHub issues against the things that are at fault for this:

I don't think either of those are going to be resolved in time for our 1.20 release though. So for you, if you're just using Gen1 stuff I think you'll be good to go. But if you're using a mix of Gen1 and Storage Accounts....then a bunch of features are going to continue not working for you.

Ok, so here's a new private build for you to try: https://storageexplorerpublish.blob.core.windows.net/privatebuilds/4430-3/StorageExplorer-ia32.exe

Finally, with regards to the proxy you are using, is it not configured in your Windows settings? I was hoping the system proxy option would take care of this issue for you, but maybe your situation is special somehow. I'd love to learn more about your proxy, why you use it, how it is configured, what type of proxy it is, etc. If you have time to share that info with me after trying the private build, please do so. Thanks!

JeanBaptisteScellier commented 3 years ago

Hello @MRayermannMSFT , thank you again for your time and your build !

About my proxy:

It is a corporate proxy. The address is the same for everybody at my location; only the username changes. It prevents ordinary employees from using HTTP outside corporate applications and the web browser. Other applications need a proxy configuration, so I have to configure the proxy settings in every piece of software I use. I can set it manually in each application, but most applications read environment variables, so I just set HTTP_PROXY and HTTPS_PROXY in my Windows user environment variables. I also set REQUESTS_CA_BUNDLE for some applications; it points to a file with certificates for SSL verification.

I can only connect to Azure from my corporate PC, because authentication is not possible from PCs other than those provided by my company. The system proxy option cannot be accessed; I think the cybersecurity teams don't want us to play with the proxy. For example, in Python code, instead of writing:

import http.client
import ssl
c = http.client.HTTPSConnection('www.google.com', context=ssl._create_unverified_context(), timeout=1)
c.request("GET", '/')

I write:

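import base64
import http.client
import ssl
# Connect to the proxy itself (address and port are redacted placeholders).
c = http.client.HTTPSConnection('XX.XX.XXX.XX', port='YYYY', context=ssl._create_unverified_context(), timeout=1)
# Username only, with no ":password" part, Base64-encoded for Basic auth.
auth_hash = base64.b64encode(bytes('ZZZZZZZZ', encoding="utf-8")).decode("utf-8")
headers = {
    'Proxy-Authorization': f"Basic {auth_hash}"
}
# Ask the proxy to open a tunnel (CONNECT) to the real destination.
c.set_tunnel("www.google.com", headers=headers)
c.request("GET", '/', headers=headers)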

Test of last build

I've tried your new build. Behaviour stays the same:

The logs look the same. Only one thing to mention: in the log of the managed disks extension (which I do not use), there is ProxySettings: {"host":"http://XX.XX.XXX.XX","port":YYYY,"password":"******"}, whereas in the blob extension log there is ProxySettings: {"host":"http://XX.XX.XXX.XX","port":YYYY,"username":"ZZZZZZZZ","password":"******"}. I tried both app config and environment variables, with the same behaviour.

Conclusion

It would help to have more logs, to understand in detail what causes the connection to fail: whether it is an HTTP request error, an SSL verification failure, or a timeout.
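As a rough idea of the kind of diagnosis meant here, the following Python sketch separates those failure types outside of Storage Explorer. It says nothing about how Storage Explorer tests the proxy internally; `classify_failure` and `probe` are hypothetical names, and the category labels are my own.

```python
import http.client
import socket
import ssl


def classify_failure(exc):
    """Map an exception from a proxied request to a coarse category."""
    # Order matters: the SSL and timeout exceptions are also OSError subclasses.
    if isinstance(exc, ssl.SSLCertVerificationError):
        return "SSL verification"
    if isinstance(exc, ssl.SSLError):
        return "SSL error"
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timeout"
    if isinstance(exc, http.client.HTTPException):
        return "HTTP error"
    if isinstance(exc, OSError):
        # http.client raises OSError when the proxy refuses the CONNECT tunnel.
        return "connection or proxy error"
    return "unknown"


def probe(proxy_host, proxy_port, target_host, headers=None, timeout=5):
    """Try one GET through the proxy and report what happened."""
    conn = http.client.HTTPSConnection(proxy_host, proxy_port, timeout=timeout)
    conn.set_tunnel(target_host, headers=headers or {})
    try:
        conn.request("GET", "/")
        return f"HTTP {conn.getresponse().status}"
    except Exception as exc:
        return f"{classify_failure(exc)}: {exc}"
    finally:
        conn.close()
```

Calling `probe` with the proxy address, port, and the same `Proxy-Authorization` header as in the snippet above would return either the HTTP status or one of the categories with the underlying error text.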

MRayermannMSFT commented 3 years ago

And also REQUESTS_CA_BUNDLE for some applications and points to a file with some certificates for ssl verification.

Oh oh oh oh. This is potentially the thing we are missing then. Storage Explorer definitely has some quirks with SSL certificates. We don't support that environment variable right now, but I can guide you on how to get Storage Explorer to trust the certs.

Basically you need to take your certificates and convert them to Base-64 encoded X.509. In Windows that's really easy. Just double click on any .cer, go to Details, and Copy to File..., then just follow the wizard:

image

After you have the new .cer file, in Storage Explorer you just need to go to Edit > SSL Certificates > Import Certificates, and then use the file picker to find, select, and open the .cer file.
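If you would rather script the conversion than click through the wizard, a Python sketch along these lines should produce the same Base-64 encoded X.509 text. It assumes the input .cer file is binary DER-encoded (if it already starts with "-----BEGIN CERTIFICATE-----", no conversion is needed); `der_to_pem` and `convert_file` are hypothetical helper names.

```python
import ssl
from pathlib import Path


def der_to_pem(der_bytes):
    """Wrap DER certificate bytes as Base-64 encoded X.509 (PEM) text."""
    return ssl.DER_cert_to_PEM_cert(der_bytes)


def convert_file(der_path, pem_path):
    """Read a binary DER .cer file and write the Base-64 encoded version."""
    pem_text = der_to_pem(Path(der_path).read_bytes())
    Path(pem_path).write_text(pem_text, encoding="ascii")
```

For example, `convert_file("corp-root.cer", "corp-root-base64.cer")` (file names made up) would write a file that can then be selected in the Import Certificates file picker.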

It would help to have more logs, to understand in detail what causes the connection to fail: whether it is an HTTP request error, an SSL verification failure, or a timeout.

For sure, I'll add some logging to the proxy testing today so we can see what exactly the error is.

By chance, does your company use Teams? It might be beneficial for you and I to have a call if we can't get this figured out soon. Maybe it'll be faster for us to get to the bottom of things that way.

MRayermannMSFT commented 3 years ago

Ok @JeanBaptisteScellier, new build with logging: https://storageexplorerpublish.blob.core.windows.net/privatebuilds/4430-4/StorageExplorer-ia32.exe

You can find the error in the log file for "main". It'll look something like this after the error info bar pops up: image

JasonYeMSFT commented 3 years ago

We have shipped 1.20.0. We are going to track this issue in 1.21.0.

MRayermannMSFT commented 3 years ago

Been about a month and a half now since we last heard from Jean. Going to go ahead and close this now. @JeanBaptisteScellier if you're still having problems after updating to 1.20.0 then please open a new issue.