groupon / sparklint

A tool for monitoring and tuning Spark jobs for efficiency.
Apache License 2.0

Unable to connect to history server hosted on Microsoft Azure #65

Open NileshGule opened 7 years ago

NileshGule commented 7 years ago

I am unable to add a link to a history server hosted on Microsoft Azure HDInsight. I get the error:

unexpected HTTP status: 401 Credentials missing or Auth is not Basic

Is there any way to provide the credentials?

roboxue commented 7 years ago

This is not application-specific behavior. (No code in this repo returns a 401.) It is likely caused by your Azure settings. I know some cloud services block certain ports by default; for example, AWS blocks almost every port on EMR nodes.

You can pass the -p option to use an open port if you know one; otherwise, contact your system admin about this.

NileshGule commented 7 years ago

I have an Azure HDInsight cluster with public key access. I can connect to Ambari and other management UIs, as well as the Spark history server, using a browser and supplying the authentication parameters. How can I do the same when adding the history server link through the SparkLint UI?

roboxue commented 7 years ago

@NileshGule try modifying these lines to add the appropriate headers and see if that solves your problem. (I assume Azure authenticates via request headers. I can't find any docs online yet, and I'm not an Azure user, so I can't check myself. You can probably inspect the request with your browser's developer tools to see how authentication is done when accessing the history server UI.)

https://github.com/groupon/sparklint/blob/ce3e57005e1d48e1ef3dfff4819d664e179291e8/src/main/scala/com/groupon/sparklint/events/HistoryServerApi.scala#L84-L86
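Before changing the Scala code, it may be worth checking outside SparkLint whether a Basic `Authorization` header is what the server wants. The 401 message ("Auth is not Basic") hints at that, but nothing in this thread confirms it. Below is a minimal Python sketch using only the standard library. The helper names `basic_auth_header` and `build_history_request` are made up for illustration and are not part of SparkLint or any Azure SDK.

```python
import base64
import urllib.request


def basic_auth_header(username: str, password: str) -> str:
    """Return the value of an HTTP Basic Authorization header (RFC 7617)."""
    # Basic auth is "username:password", base64-encoded, prefixed with "Basic ".
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def build_history_request(url: str, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a GET request that carries Basic credentials."""
    request = urllib.request.Request(url)
    request.add_header("Authorization", basic_auth_header(username, password))
    return request
```

To try it, pass the history server's REST URL and the cluster login to `build_history_request`, then open the result with `urllib.request.urlopen`. The URL shape is an assumption here, for example something like `https://CLUSTERNAME.azurehdinsight.net/sparkhistory/api/v1/applications`. If that returns 200 instead of 401, adding the same `Authorization` header at the lines linked above is probably the right change.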