ChingChuan-Chen closed this issue 4 months ago.
Please refer to https://github.com/Azure/azure-kusto-spark/issues/385 and https://github.com/Azure/azure-kusto-spark/issues/390 and see if these provide pointers.
Thank you. It seems there is nothing I can do if I would like to continue using this library. Because the Kusto cluster is not under our management, the storage account is blocked by network rules. Also, using a SAS token is forbidden by compliance, so the transient storage does not work for us either.
I think I need to re-invent the wheel: partition the data with a hash in the query myself and read it into Spark.
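A minimal sketch of that hash-partitioned read, assuming a key column named Id and the documented connector read options (kustoCluster, kustoDatabase, kustoQuery, readMode); all cluster, database, and table names are placeholders, and authentication options are omitted:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

val spark = SparkSession.builder().getOrCreate()

// Placeholders -- replace with the real cluster, database, and table.
val cluster    = "https://mycluster.region.kusto.windows.net"
val database   = "MyDatabase"
val numBuckets = 16

// Read one hash bucket per query; hash(Id, numBuckets) buckets rows inside Kusto,
// and ForceSingleMode keeps the result on the query path instead of the
// export-to-blob (transient storage) path.
def readBucket(bucket: Int): DataFrame =
  spark.read
    .format("com.microsoft.kusto.spark.datasource")
    .option("kustoCluster", cluster)
    .option("kustoDatabase", database)
    .option("kustoQuery", s"MyTable | where hash(Id, $numBuckets) == $bucket")
    .option("readMode", "ForceSingleMode")
    .load()

// Union the buckets back into a single DataFrame.
val full: DataFrame = (0 until numBuckets).map(readBucket).reduce(_ union _)
```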
Hello @ChingChuan-Chen, thanks for the comment. So, if I understand it right, you want to bring your own storage for ingestion? Is that a fair understanding?
There is a specific set of storage accounts used per Kusto cluster (they do not change over the lifetime of the cluster); I can provide a walkthrough of it if needed.
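If it helps, here is a rough sketch of how one could list those cluster-managed storage endpoints with the .get ingestion resources management command via the kusto-data Java client; the ingest- endpoint and database name are placeholders, and exact package paths and result-column names may vary by SDK version:

```scala
import com.microsoft.azure.kusto.data.ClientFactory
import com.microsoft.azure.kusto.data.auth.ConnectionStringBuilder

// The command must be sent to the Data Management (ingest-) endpoint of the cluster.
val csb    = ConnectionStringBuilder.createWithUserPrompt("https://ingest-mycluster.region.kusto.windows.net")
val client = ClientFactory.createClient(csb)

// Each row's StorageRoot column is a blob/queue URI the cluster hands out for queued
// ingestion; these are the endpoints that network rules would need to allow.
val results = client.execute("MyDatabase", ".get ingestion resources").getPrimaryResults
while (results.next()) {
  println(s"${results.getString("ResourceTypeName")} -> ${results.getString("StorageRoot")}")
}
```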
"Thank you. It seems that I can do nothing if I would like to continue using this library."

Unfortunately, this is the way the connector was designed. Many customers who use the library do not want separate storage (except for reads, where they want to reuse exported data), so ingestion should work as-is. The library just uses queued ingestion, where it uses blob storage to ingest the data.
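For context, a typical write through the connector looks roughly like the sketch below; no storage account is configured explicitly because the queued-ingestion blobs come from the cluster itself (all values are placeholders, authentication options omitted):

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

val spark = SparkSession.builder().getOrCreate()
val df = spark.range(0, 1000).toDF("Id") // sample data

// Queued ingestion via the connector; the transient blobs are managed by the cluster,
// so no storage-account option appears here (AAD app id/secret options omitted).
df.write
  .format("com.microsoft.kusto.spark.datasource")
  .option("kustoCluster", "https://mycluster.region.kusto.windows.net") // placeholder
  .option("kustoDatabase", "MyDatabase")                                // placeholder
  .option("kustoTable", "MyTable")                                      // placeholder
  .mode(SaveMode.Append)
  .save()
```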
If you have a question, happy to take it forward. You can reach me at ramacg at ms dot com.
Describe the bug To my understanding, when the data is large, the Kusto Spark Connector will create temporary storage to write a CSV file. But in my case, it somehow can't reach the created storage account and raises the exception: reactor.core.Exceptions$ReactiveException: java.net.UnknownHostException: *****.blob.core.windows.net: Name or service not known
To Reproduce I am not sure why it happened. The Kusto cluster is in another subscription that only allows connections over VPN, and our Synapse workspace is in a private virtual network.
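For illustration, a plain connector read with no storage options, roughly like the following sketch (all values are placeholders), takes the export-to-blob path once the result set is large, which is where the exception would surface:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()

// No transient-storage options are set, so for large results the connector
// exports to cluster-provided blob storage and reads the blobs back into Spark.
val df = spark.read
  .format("com.microsoft.kusto.spark.datasource")
  .option("kustoCluster", "https://mycluster.region.kusto.windows.net") // placeholder
  .option("kustoDatabase", "MyDatabase")                                // placeholder
  .option("kustoQuery", "MyLargeTable")                                 // placeholder
  .load()

println(df.count())
```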
Expected behavior It should be okay to read the data through the Kusto Spark Connector.
Additional context Spark 3.4 with "com.microsoft.azure.kusto" %% "kusto-spark_3.0" % "5.0.8".