Azure / azure-kusto-spark

Apache Spark Connector for Azure Kusto
Apache License 2.0

no option to pass in the appId/appKey with the API call for authentication in Synapse #340

Closed: tritm78 closed this issue 1 year ago

tritm78 commented 1 year ago

Describe the bug

Per this doc, I can read data from Kusto with the linkedService option (for the linkedService setting, I used appId/appKey/tenant):

kustoDf = spark.read \
    .format("com.microsoft.kusto.spark.synapse.datasource") \
    .option("spark.synapse.linkedService", "") \
    .option("kustoDatabase", "") \
    .option("kustoQuery", "") \
    .load()

However, one problem is that I need to grant the Synapse workspace appId access to the Kusto cluster. Is there an option to specify the appId and appKey in the API call, so that this appId (instead of the Synapse workspace appId) is used to access data in the Kusto cluster? Something like this:

kustoDf = spark.read \
    .format("com.microsoft.kusto.spark.synapse.datasource") \
    .option("spark.synapse.linkedService", "") \
    .option("aadAppId", "") \
    .option("aadAppSecret", "") \
    .option("aadTenantId", "") \
    .option("kustoDatabase", "") \
    .option("kustoQuery", "") \
    .load()

ag-ramachandran commented 1 year ago

Hi @tritm78

Here is one way we can do it.


pip install msal

from msal import ConfidentialClientApplication
app = ConfidentialClientApplication(
    "app-id", #App Id
    client_credential="app-key", #App key
    authority="https://login.microsoftonline.com/<tenant>") # Tenant
scopes = ["<cluster>/.default"]
result = app.acquire_token_for_client(scopes=scopes)

kustoDfAad = spark.read \
    .format("com.microsoft.kusto.spark.synapse.datasource") \
    .option("kustoDatabase", "spark") \
    .option("accessToken", result["access_token"]) \
    .option("kustoCluster", "<cluster>") \
    .option("kustoQuery", "KustoSparkReadWriteTest_2e8d63e1_3ce2_466f_879a_9dd9465c46d5 | take 100") \
    .load()

display(kustoDfAad)
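
The token step above can be wrapped so that a failed sign-in produces a readable error instead of a `KeyError` on `result["access_token"]`. This is only a minimal sketch: the helper names `build_token_request`, `extract_access_token`, and `acquire_kusto_token` are hypothetical and are not part of MSAL or the connector. It assumes `<cluster>` is the full cluster URL, as in the snippet above, and relies on MSAL for Python returning a dict that carries `access_token` on success and `error` / `error_description` on failure.

```python
from typing import Any, Dict, List, Tuple


def build_token_request(tenant_id: str, cluster_url: str) -> Tuple[str, List[str]]:
    """Return the (authority, scopes) pair in the same shape as the snippet above."""
    authority = f"https://login.microsoftonline.com/{tenant_id}"
    # Scope has the form "<cluster>/.default"; strip a trailing slash to avoid "//.default".
    scopes = [f"{cluster_url.rstrip('/')}/.default"]
    return authority, scopes


def extract_access_token(result: Dict[str, Any]) -> str:
    """Return the token from an MSAL result dict, or raise with MSAL's error details."""
    if "access_token" in result:
        return result["access_token"]
    raise RuntimeError(
        f"Token acquisition failed: {result.get('error')}: {result.get('error_description')}"
    )


def acquire_kusto_token(app_id: str, app_key: str, tenant_id: str, cluster_url: str) -> str:
    """Acquire an app-only token for the cluster (requires `pip install msal` and network access)."""
    # Imported lazily so the two helpers above can be used without msal installed.
    from msal import ConfidentialClientApplication

    authority, scopes = build_token_request(tenant_id, cluster_url)
    app = ConfidentialClientApplication(app_id, client_credential=app_key, authority=authority)
    return extract_access_token(app.acquire_token_for_client(scopes=scopes))
```

The returned string would then be passed as `.option("accessToken", acquire_kusto_token(...))` in the read call shown above. Because access tokens expire, a long-running job may need to acquire a fresh token before a later read.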

Note that direct AAD-based auth (passing the app ID and key as options) seems to have an issue with the MSAL classpath. We will try to address it in a subsequent release.

cc: @asaharn

tritm78 commented 1 year ago

Thanks @ag-ramachandran! That worked.

Please keep me updated on when AAD-based auth works directly with the API call.