Azure / azure-kusto-spark

Apache Spark Connector for Azure Kusto

Writing data to an ADX with private endpoints #390

Closed (alxy closed this issue 2 months ago)

alxy commented 2 months ago

We are operating in an environment where all Azure resources are deployed with private endpoints enabled and public access disabled. Reading from a table using the connector works fine; however, writing does not.
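
For context, the failing write path looks roughly like this. This is a minimal sketch: the cluster, database, table, and AAD app values are placeholders, and the option keys come from the connector's `KustoSinkOptions`:

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}
import com.microsoft.kusto.spark.datasink.KustoSinkOptions

val spark = SparkSession.builder().appName("kusto-write-test").getOrCreate()
val df = spark.range(10).toDF("value") // small dummy DataFrame for the test

// Placeholder cluster / database / table / AAD app values; replace with your own.
df.write
  .format("com.microsoft.kusto.spark.datasource")
  .option(KustoSinkOptions.KUSTO_CLUSTER, "mycluster.westeurope")
  .option(KustoSinkOptions.KUSTO_DATABASE, "mydatabase")
  .option(KustoSinkOptions.KUSTO_TABLE, "mytable")
  .option(KustoSinkOptions.KUSTO_AAD_APP_ID, "<app-id>")
  .option(KustoSinkOptions.KUSTO_AAD_APP_SECRET, "<app-secret>")
  .option(KustoSinkOptions.KUSTO_AAD_AUTHORITY_ID, "<tenant-id>")
  .mode(SaveMode.Append)
  .save()
```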

I found some existing issues that give quite good background on what is needed: https://github.com/Azure/azure-kusto-spark/issues/343#issuecomment-1785118752

Now my problem is that DNS in corporate settings is a mess and sometimes tricky to set up. Could you state somewhere in the docs which of the eight different DNS endpoints that every ADX cluster requires are needed to successfully write data to it? All the blob endpoints? Or the table and queue endpoints as well?

ag-ramachandran commented 2 months ago

Hello @alxy, if you are talking about private endpoints, you can look at the storage accounts and IP addresses here:

https://learn.microsoft.com/en-us/azure/data-explorer/security-network-private-endpoint#connect-to-a-private-endpoint

alxy commented 2 months ago

Thanks for your answer. We already do this; however, my question is whether I need to set up all FQDNs to make the connector work, or only a subset. We have set up DNS only for the main endpoint and the ingest one. I realized I need to add the blob endpoints as well as the queue endpoints. The question is whether all of them are strictly needed.

ag-ramachandran commented 2 months ago

Hello @alxy, for ingestion to work, all of these storage accounts as well as the ingestion and engine endpoints need to be whitelisted.
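
One quick way to verify the private DNS setup is to resolve each FQDN from a node inside the Spark VNet and confirm it returns a private IP. A small sketch; the hostnames below are placeholders, and the actual staging storage account names should be taken from the cluster's private endpoint connection as described in the linked doc:

```scala
import java.net.{InetAddress, UnknownHostException}

// Placeholder FQDNs; substitute your cluster name, region, and the storage
// accounts shown on the ADX cluster's private endpoint connection.
val endpoints = Seq(
  "mycluster.westeurope.kusto.windows.net",        // engine (query) endpoint
  "ingest-mycluster.westeurope.kusto.windows.net", // ingestion endpoint
  "<staging-account>.blob.core.windows.net",       // staging blob storage
  "<staging-account>.queue.core.windows.net"       // ingestion queues
)

endpoints.foreach { fqdn =>
  try {
    // Should resolve to a private IP from the VNet if DNS is set up correctly.
    println(s"$fqdn -> ${InetAddress.getByName(fqdn).getHostAddress}")
  } catch {
    case _: UnknownHostException => println(s"$fqdn -> NOT RESOLVABLE")
  }
}
```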

alxy commented 2 months ago

Thanks, would you mind also mentioning this explicitly in the docs somewhere? I think it would help, especially as I see more and more enterprises really using the private networking setup :)

ag-ramachandran commented 2 months ago

Hi @alxy, this is purely an internal call, though I understand your point. We frequently work on changing how the ingestion works, optimizing it, securing it, etc. For example, we are working on changing the SDK portions, which will make the current approach obsolete. So for now, we're not sure if we should add it to the documentation.