Azure / azure-sdk-for-java

This repository is for active development of the Azure SDK for Java. For consumers of the SDK we recommend visiting our public developer docs at https://docs.microsoft.com/java/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-java.
MIT License

[FEATURE REQ] Managed Identity Support for Azure Synapse spark notebooks/jobs #40701

Closed jenilChristo closed 4 months ago

jenilChristo commented 4 months ago

Is your feature request related to a problem? Please describe.

The Azure Cosmos Spark library lacks support for Spark jobs running in Azure Synapse, although Synapse supports linked service creation for Azure Cosmos DB.

Describe the solution you'd like

The connector should be able to use the linked service created for a Cosmos DB account in Synapse to connect to Cosmos DB from Spark notebooks and jobs, for example with options like:

  "spark.cosmos.auth.type" -> "ManagedIdentity"
  "spark.cosmos.auth.linkedService" -> "My linked service for Cosmos DB"

Describe alternatives you've considered

Additional context

1) Granted the managed identity the RBAC permissions it needs and created a linked service successfully.
2) Opened a Synapse notebook with the Azure Cosmos Spark library installed.
3) Set spark.cosmos.auth.type to ManagedIdentity and provided the managed identity's object ID as the client ID.


Information Checklist

Kindly make sure that you have added all the following information above and checked off the required fields; otherwise we will treat the issue as an incomplete report.

github-actions[bot] commented 4 months ago

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @kushagraThapar @pjohari-ms @TheovanKraay.

FabianMeiswinkel commented 4 months ago

This is a limitation in Azure Synapse, not in the Cosmos Spark connector. The Synapse Token Library / LSR does not allow retrieving managed identity tokens for Cosmos DB linked services, nor does Azure Synapse provide the workspace's managed identity to Spark executors (as Azure Databricks does). So, until Azure Synapse addresses this limitation, the only options are to use ServicePrincipal authentication or to switch to Azure Databricks (which supports authentication with Managed Identity).
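
For readers who land here looking for the ServicePrincipal workaround, the following is a minimal sketch of what the connector options might look like. It is not taken from this thread: the option names are the ones the Cosmos Spark connector documents for service principal authentication as far as I know, every value is a placeholder, and the `CosmosServicePrincipalConfig` object and the `COSMOS_SP_CLIENT_SECRET` environment variable are hypothetical names introduced only for illustration. Check the connector's configuration reference for your version before relying on it.

```scala
// Sketch under stated assumptions: all values are placeholders, and the
// object name and environment variable are hypothetical.
object CosmosServicePrincipalConfig {
  // Read the secret from the environment instead of hard-coding it.
  private val clientSecret: String =
    sys.env.getOrElse("COSMOS_SP_CLIENT_SECRET", "<client-secret>")

  val options: Map[String, String] = Map(
    "spark.cosmos.accountEndpoint"           -> "https://<account>.documents.azure.com:443/",
    "spark.cosmos.auth.type"                 -> "ServicePrincipal",
    // Account metadata the connector needs for AAD-based authentication.
    "spark.cosmos.account.subscriptionId"    -> "<subscription-id>",
    "spark.cosmos.account.tenantId"          -> "<tenant-id>",
    "spark.cosmos.account.resourceGroupName" -> "<resource-group>",
    // Credentials of the service principal that holds the Cosmos DB RBAC role.
    "spark.cosmos.auth.aad.clientId"         -> "<service-principal-client-id>",
    "spark.cosmos.auth.aad.clientSecret"     -> clientSecret,
    "spark.cosmos.database"                  -> "<database>",
    "spark.cosmos.container"                 -> "<container>"
  )
}
```

In a notebook with the connector installed, the map would typically be passed to a reader, for example `spark.read.format("cosmos.oltp").options(CosmosServicePrincipalConfig.options).load()`. The service principal still needs the same Cosmos DB data-plane RBAC role that was granted to the managed identity.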