Azure / azure-kusto-spark

Apache Spark Connector for Azure Kusto
Apache License 2.0

DeviceAuthentication does not exist in the JVM on Databricks Runtime 14.3 LTS #379

Closed. mfabina-cedes closed this issue 2 months ago

mfabina-cedes commented 3 months ago

I'm trying to recreate the final snippet from the sample in this repository that uses DeviceAuthentication: https://github.com/Azure/azure-kusto-spark/blob/master/samples/src/main/python/pyKusto.py#L156

I'm executing it in an Azure Databricks notebook with compute running the 14.3 LTS Databricks Runtime Version (includes Apache Spark 3.5.0, Scala 2.12).

I have installed com.microsoft.azure.kusto:kusto-spark_3.0_2.12:5.0.7, as per the README's Databricks instructions.

However, whenever I try to reference the DeviceAuthentication class, I get the following error:

ERROR:root:Exception while sending command.
Traceback (most recent call last):
  File "/databricks/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/clientserver.py", line 541, in send_command
    raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/databricks/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py", line 1038, in send_command
    response = connection.send_command(command)
  File "/databricks/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/clientserver.py", line 564, in send_command
    raise Py4JNetworkError(
py4j.protocol.Py4JNetworkError: Error while sending or receiving
Py4JError: com.microsoft.kusto.spark.authentication.DeviceAuthentication does not exist in the JVM

In fact, a bare reference to the class is enough to reproduce this (copying the sample code in full reproduces it as well):

sc._jvm.com.microsoft.kusto.spark.authentication.DeviceAuthentication

I know that DeviceAuthentication exists in the kusto-spark_3.0_2.12:5.0.7 JAR, because unzipping the archive shows the DeviceAuthentication.class file there.
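For anyone who wants to script the same check instead of unzipping by hand, here is a minimal sketch using only the Python standard library. The helper name `jar_contains_class` and the example JAR path are hypothetical; the class name is the one from the error message above. Note that this only confirms the class file is inside the archive. It says nothing about whether the JAR is actually on the classpath of the JVM that py4j talks to.

```python
import zipfile


def jar_contains_class(jar_path: str, class_name: str) -> bool:
    """Return True if the JAR at jar_path has a .class entry for class_name.

    A JAR is a zip archive, and a class a.b.C is stored as a/b/C.class.
    """
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


# Example call (the path is a placeholder, adjust it to wherever the JAR was downloaded):
# jar_contains_class(
#     "/tmp/kusto-spark_3.0_2.12-5.0.7.jar",
#     "com.microsoft.kusto.spark.authentication.DeviceAuthentication",
# )
```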

What's the issue?

Thank you in advance!

mfabina-cedes commented 3 months ago

For anyone else having trouble with this, my workaround was to use the azure-kusto-python library and manually convert the query responses to Spark DataFrames. Since I don't need scalability or performance for my current use case, this is sufficient.
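The comment above does not include code, so the following is only a rough sketch of what such a workaround could look like, not the exact code used. It assumes the `azure-kusto-data` package is installed, that `client` is an `azure.kusto.data.KustoClient`, and that `spark` is the notebook's SparkSession. The function name `kusto_query_to_spark` is made up for illustration. Everything is collected on the driver, so this only suits small results.

```python
def kusto_query_to_spark(spark, client, database, query):
    """Run a Kusto query through the Python client and build a Spark DataFrame.

    `client` is expected to behave like azure.kusto.data.KustoClient and
    `spark` like a SparkSession (both are assumptions of this sketch).
    """
    # The first primary result table holds the rows returned by the query.
    table = client.execute(database, query).primary_results[0]
    # Each result row can be turned into a {column_name: value} dict.
    records = [row.to_dict() for row in table]
    # Spark infers the schema from the dicts; an empty result would need an
    # explicit schema instead, which this sketch does not handle.
    return spark.createDataFrame(records)


# Usage in a notebook (cluster URI, database, and query are placeholders):
# from azure.kusto.data import KustoClient, KustoConnectionStringBuilder
# kcsb = KustoConnectionStringBuilder.with_aad_device_authentication(
#     "https://<cluster>.kusto.windows.net")
# client = KustoClient(kcsb)
# df = kusto_query_to_spark(spark, client, "<database>", "<table> | take 10")
```

Passing the client in as a parameter keeps the authentication choice separate from the conversion step, so a different `KustoConnectionStringBuilder` method can be swapped in if device authentication is not suitable.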

ag-ramachandran commented 2 months ago

For DeviceAuth to work, it probably has to be a trusted device. I'm not sure whether Azure Databricks (ADB) is a working scenario for DeviceAuth. It may work with a local notebook, or with Synapse using first-party (1P) auth. Worth a check.