Closed divyavanmahajan closed 1 year ago
Hello @divyavanmahajan
You will need to use version 4.0.x of the Spark connector. You can refer to our comments on this GH issue.
A related issue was also opened, and the user there marked it as fixed after testing with that version: "Unable to use connector if uri ends with kusto.fabric.microsoft.com" (Issue #323, Azure/azure-kusto-spark).
Why: The connector keeps a set of "WellKnown" Kusto endpoints. *.fabric. was not one of them in the 3.x series of the connector; it was added in version 4 and up.
The current problem with the 4.x connector is that it has to run on JDK 11. We are working on making it JDK 8 compatible as well, since many customers told us they are not ready to migrate. That is one week out at most.
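To illustrate why a 3.x connector rejects a Fabric hostname, here is a minimal sketch of a suffix-based allow-list check. The two suffix tuples and the `is_trusted` helper are hypothetical and exist only for this illustration; the connector's real endpoint list is larger and lives in the Kusto client library.

```python
from urllib.parse import urlparse

# Hypothetical allow-lists for illustration only (not the connector's real data).
TRUSTED_SUFFIXES_3X = (".kusto.windows.net",)
TRUSTED_SUFFIXES_4X = TRUSTED_SUFFIXES_3X + (".kusto.fabric.microsoft.com",)


def is_trusted(cluster_uri: str, trusted_suffixes) -> bool:
    """Return True if the URI's hostname ends with one of the trusted suffixes."""
    host = urlparse(cluster_uri).hostname or ""
    return host.lower().endswith(tuple(trusted_suffixes))


fabric_uri = "https://trd3bep6cbtfa821kx6hfa.z5.kusto.fabric.microsoft.com/"
print(is_trusted(fabric_uri, TRUSTED_SUFFIXES_3X))
print(is_trusted(fabric_uri, TRUSTED_SUFFIXES_4X))
```

Under these assumptions the same URI fails the first check and passes the second, which matches the "hostname is currently not trusted" error seen on older connector versions.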
Note:
You have to set Databricks to use JDK 11. To do that, set this environment variable on the cluster:
JNAME=zulu11-ca-amd64
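After changing the environment variable, it can help to confirm which JDK the cluster actually runs. The sketch below parses a `java.version` string into a major version number; `java_major_version` is a hypothetical helper written for this illustration, not part of the connector or of Databricks.

```python
def java_major_version(version: str) -> int:
    """Extract the major Java version from a java.version string."""
    parts = version.split(".")
    # Legacy scheme: "1.8.0_302" means Java 8.
    if parts[0] == "1" and len(parts) > 1:
        return int(parts[1])
    # Modern scheme: "11.0.19" means Java 11; drop suffixes such as "-ea" or "+35".
    return int(parts[0].split("-")[0].split("+")[0])


print(java_major_version("1.8.0_302"))
print(java_major_version("11.0.19"))
```

In a notebook, the input string could come from something like `spark.sparkContext._jvm.java.lang.System.getProperty("java.version")`. That relies on a private PySpark attribute and may not be available on every cluster access mode, so treat it as an assumption.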
Hi @ag-ramachandran, I am still seeing this issue on all-purpose compute, even after making the changes above.
With jobs compute clusters, however, it works as expected.
Config:
Library:
Tried with Maven coordinates and by uploading JAR
Cluster:
Hello @ravikiransharvirala, a full stack trace would be great if you have one. Please note that we are also planning to release a Java 8 compatible version sometime in the next week, which will make this simpler.
Other questions: was the compute restarted after installing the JARs? That is just to eliminate another possibility. Do you have any outbound VNET/firewall rules?
@ag-ramachandran please find the stack trace
Failed to execute query. Error : Can't communicate with '<cluster-name>.z2.kusto.fabric.microsoft.com' as this hostname is currently not trusted; please see https://aka.ms/kustotrustedendpoints
Thank you. Yes, I am waiting for that update.
I tried a restart, a new cluster signup, and running after the JAR install. None of it worked; I got the same error message.
Re: firewall, none that I know of. I don't think so, because it works fine with Databricks jobs.
Hi @ravikiransharvirala and @divyavanmahajan
https://mvnrepository.com/artifact/com.microsoft.azure.kusto/kusto-spark_3.0_2.12/5.0.0 is a newly released version that is JDK 8 compatible and works with Fabric as well. Please give it a try and let us know.
cc: @asaharn
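For anyone testing the new version, a minimal PySpark read might look like the sketch below. The helper names `build_kusto_read_options` and `read_kusto` are hypothetical. The option keys and the `com.microsoft.kusto.spark.datasource` format name follow the connector's PySpark samples as best I recall, so please verify them against the documentation for the release you install. The example assumes AAD application (service principal) authentication.

```python
def build_kusto_read_options(cluster_uri, database, query,
                             app_id, app_secret, tenant_id):
    """Collect connector read options; key names are assumed from the connector docs."""
    return {
        "kustoCluster": cluster_uri,       # full Fabric URI, not the short cluster name
        "kustoDatabase": database,
        "kustoQuery": query,
        "kustoAadAppId": app_id,
        "kustoAadAppSecret": app_secret,
        "kustoAadAuthorityID": tenant_id,
    }


def read_kusto(spark, options):
    """Run the query through the Kusto Spark connector and return a DataFrame."""
    return (
        spark.read
        .format("com.microsoft.kusto.spark.datasource")
        .options(**options)
        .load()
    )
```

On a cluster with the connector installed, calling `read_kusto(spark, build_kusto_read_options(...))` with the full `https://<cluster-name>.z2.kusto.fabric.microsoft.com` URI would exercise the same path that produced the errors in this thread.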
Hi @ag-ramachandran,
I appreciate you following up on this.
I updated my cluster to the latest version, but now I see a different error when querying the ADX Fabric cluster.
java.lang.NoClassDefFoundError: com/microsoft/aad/msal4j/ClientCredentialFactory
---------------------------------------------------------------------------
Py4JJavaError Traceback (most recent call last)
File
File /databricks/spark/python/pyspark/instrumentation_utils.py:48, in _wrap_function.
File /databricks/spark/python/pyspark/sql/readwriter.py:309, in DataFrameReader.load(self, path, format, schema, **options)
    307     return self._df(self._jreader.load(self._spark._sc._jvm.PythonUtils.toSeq(path)))
    308 else:
--> 309     return self._df(self._jreader.load())
File /databricks/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py:1321, in JavaMember.__call__(self, *args)
   1315 command = proto.CALL_COMMAND_NAME +\
   1316     self.command_header +\
   1317     args_command +\
   1318     proto.END_COMMAND_PART
   1320 answer = self.gateway_client.send_command(command)
-> 1321 return_value = get_return_value(
   1322     answer, self.gateway_client, self.target_id, self.name)
   1324 for temp_arg in temp_args:
   1325     temp_arg._detach()
File /databricks/spark/python/pyspark/errors/exceptions.py:228, in capture_sql_exception.
File /databricks/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/protocol.py:326, in get_return_value(answer, gateway_client, target_id, name)
    324 value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
    325 if answer[1] == REFERENCE_TYPE:
--> 326     raise Py4JJavaError(
    327         "An error occurred while calling {0}{1}{2}.\n".
    328         format(target_id, ".", name), value)
    329 else:
    330     raise Py4JError(
    331         "An error occurred while calling {0}{1}{2}. Trace:\n{3}\n".
    332         format(target_id, ".", name, value))
Py4JJavaError: An error occurred while calling o423.load.
: java.lang.NoClassDefFoundError: com/microsoft/aad/msal4j/ClientCredentialFactory
at com.microsoft.azure.kusto.data.auth.TokenProviderFactory.createTokenProvider(TokenProviderFactory.java:28)
at com.microsoft.azure.kusto.data.ClientImpl.
Hi @ravikiransharvirala,
This is strange: somehow it cannot find a class used by one of the Kusto Spark connector's dependencies.
To avoid this, you can try one of the two options below:
Please let us know if this works for you.
@asaharn Sorry for the delay here. It didn't work. I uploaded the latest Jar to the cluster and tested it.
Failed to execute query. Error : Can't communicate with 'z2.kusto.fabric.microsoft.com' as this hostname is currently not trusted; please see https://aka.ms/kustotrustedendpoints
@ravikiransharvirala please set up a working session by sending an email to ramacg at ms.
Describe the bug
Fabric Kusto databases have URLs of the form "https://trd3bep6cbtfa821kx6hfa.z5.kusto.fabric.microsoft.com/". When using the Spark Connector for Azure Data Explorer with a Fabric KQL database, the connection fails in one of two ways, depending on how the cluster is specified.
If cluster="trd3bep6cbtfa821kx6hfa.z5", we get the error:
DataServiceException: IOError when trying to retrieve CloudInfo Caused by: UnknownHostException: trd3bep6cbtfa821kx6hfa.z5.kusto.windows.net: Name or service not known
The Spark driver assumes the suffix ".kusto.windows.net" and does not find the cluster.
If cluster="https://trd3bep6cbtfa821kx6hfa.z5.kusto.fabric.microsoft.com/", we get the error:
Can't communicate with 'trd3bep6cbtfa821kx6hfa.z5.kusto.fabric.microsoft.com' as this hostname is currently not trusted; please see https://aka.ms/kustotrustedendpoints
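The two failure modes above are consistent with a resolution step like the following sketch: a bare cluster name gets a default suffix appended, while a full URL is used as given. `resolve_cluster_uri` is a hypothetical reconstruction inferred from the error messages, not the connector's actual code.

```python
def resolve_cluster_uri(cluster: str,
                        default_suffix: str = ".kusto.windows.net") -> str:
    """Turn a cluster setting into a full URI (assumed behavior, for illustration)."""
    if cluster.startswith(("https://", "http://")):
        # A full URL is kept as given, apart from the trailing slash.
        return cluster.rstrip("/")
    # A short name is assumed to live under the default Azure Data Explorer domain.
    return f"https://{cluster}{default_suffix}"


print(resolve_cluster_uri("trd3bep6cbtfa821kx6hfa.z5"))
print(resolve_cluster_uri("https://trd3bep6cbtfa821kx6hfa.z5.kusto.fabric.microsoft.com/"))
```

Under that assumption, the short name resolves to a ".kusto.windows.net" host that does not exist for a Fabric database, and the full URL resolves correctly but then has to pass the trusted-endpoint check.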
To Reproduce
Expected behavior
The query should run and return data.