steveloughran / cloudstore

Hadoop utility jar for troubleshooting integration with cloud object stores
Apache License 2.0

Connecting ADLS through auth enabled proxy #1

Closed bfejervari closed 4 years ago

bfejervari commented 4 years ago

Hello, we want to connect a Cloudera cluster to ADLS in a corporate environment. In this environment, every direct connection to ports 443 and 80 is blocked by a central firewall, so all connections must go through a proxy, which requires username and password authentication.

I used cloudstore-1.0.jar to test the connection, but did not succeed.

I attach a simple script and the outputs of the run: cloudstore_ADLS_proxy_test.zip

The question: how can I make cloudstore storediag (and any other Hadoop tools) connect to https://login.microsoftonline.com through an auth-enabled proxy?

Thanks in advance, best regards, Bence FEJÉRVÁRI

steveloughran commented 4 years ago

adls:// uses the normal JVM proxy settings, including those for proxy auth. Start by getting cloudstore to work with them, then go on to add them as params for distcp, which is troublesome but doable.

https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.1.5/bk_cloud-data-access/content/ch04s05.html
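
For readers unfamiliar with the "normal JVM proxy settings", here is a minimal sketch, not taken from cloudstore or the ADLS SDK. It assumes a hypothetical proxy at `proxy.example.com:8080` and uses the standard `http.proxyHost`/`https.proxyHost` system properties plus `java.net.Authenticator` for proxy credentials. Whether the ADLS client picks up a default `Authenticator` in a given setup is an assumption to verify. On recent JDKs, Basic authentication for HTTPS tunnelling through a proxy is disabled by default through the `jdk.http.auth.tunneling.disabledSchemes` property, which may also need attention.

```java
import java.net.Authenticator;
import java.net.PasswordAuthentication;

class ProxySettingsSketch {

    /** Build the -D options for the standard JVM proxy properties. */
    static String jvmProxyOptions(String host, int port) {
        return String.join(" ",
            "-Dhttp.proxyHost=" + host,
            "-Dhttp.proxyPort=" + port,
            "-Dhttps.proxyHost=" + host,
            "-Dhttps.proxyPort=" + port);
    }

    /** Register credentials that are only handed out to proxy challenges. */
    static void installProxyAuthenticator(String user, char[] password) {
        Authenticator.setDefault(new Authenticator() {
            @Override
            protected PasswordAuthentication getPasswordAuthentication() {
                // Never send the proxy password to an origin server.
                if (getRequestorType() == RequestorType.PROXY) {
                    return new PasswordAuthentication(user, password);
                }
                return null;
            }
        });
    }

    public static void main(String[] args) {
        // Hypothetical host and port; substitute the real proxy.
        System.out.println(jvmProxyOptions("proxy.example.com", 8080));
    }
}
```

The printed options are what would go on the JVM command line. For Hadoop command-line tools that usually means an environment variable such as `HADOOP_OPTS`, though the exact mechanism depends on the distribution.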

FWIW the storediag code was written to deal with adls proxy issues: the MSFT SDK doesn't check the return content type on OAuth authentication, and was trying to parse the corporate error HTML ("use the proxy") as JSON and failing.
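
To illustrate the failure mode described above, here is a minimal sketch of the kind of check that avoids it: look at the `Content-Type` header before handing the body to a JSON parser. This is not the storediag or SDK code. The class and method names are hypothetical, and the status, header and body are assumed to have been read from the HTTP response already.

```java
import java.util.Locale;

class OAuthResponseCheck {

    /** True only if the Content-Type header declares a JSON body. */
    static boolean isJson(String contentType) {
        if (contentType == null) {
            return false;
        }
        // Drop parameters such as "; charset=utf-8" before comparing.
        String mediaType = contentType.split(";", 2)[0]
            .trim().toLowerCase(Locale.ROOT);
        return mediaType.equals("application/json")
            || mediaType.endsWith("+json");
    }

    /** Fail with a readable message instead of a JSON parse error. */
    static void requireJson(int status, String contentType, String body) {
        if (!isJson(contentType)) {
            // Include a short preview so a proxy error page is recognisable.
            String preview = body.length() > 200 ? body.substring(0, 200) : body;
            throw new IllegalStateException(
                "Expected JSON from the OAuth endpoint but got status " + status
                + " with Content-Type " + contentType + ": " + preview);
        }
    }
}
```

With a check like this, an HTML error page from a corporate proxy surfaces as a clear message naming the status and content type, rather than as an opaque JSON parsing failure.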