Expected Behavior

I am working with Azure Databricks. I have a cluster with credential passthrough, which allows me to read data stored in ADLS Gen2 using my own identity. I can simply log into the Databricks workspace, attach a notebook to the cluster, and query the Delta tables from ADLS Gen2 without any setup.
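For illustration, this is the kind of read that works in a notebook on that cluster with no extra configuration (the storage account and path below are hypothetical):

```python
# Runs as-is in a notebook attached to the credential-passthrough cluster;
# the ADLS Gen2 token comes from my own identity automatically.
df = spark.read.format("delta").load(
    "abfss://mycontainer@mystorageaccount.dfs.core.windows.net/path/to/table"
)
df.show()
```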
I would expect that when I submit `dbx execute --cluster-id cluster123 --job jobABC` to the same cluster, it should be able to read those datasets from ADLS Gen2 using my identity.

Thanks!
Current Behavior
Currently, the job fails when I `dbx execute` a job to the cluster, with the following error:

```
Py4JJavaError: An error occurred while calling o469.load.
: com.databricks.backend.daemon.data.client.adl.AzureCredentialNotFoundException: Could not find ADLS Gen2 Token
at com.databricks.backend.daemon.data.client.adl.AdlGen2UpgradeCredentialContextTokenProvider.$anonfun$getToken$1(AdlGen2UpgradeCredentialContextTokenProvider.scala:37)
at scala.Option.getOrElse(Option.scala:189)
at com.databricks.backend.daemon.data.client.adl.AdlGen2UpgradeCredentialContextTokenProvider.getToken(AdlGen2UpgradeCredentialContextTokenProvider.scala:31)
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.AbfsClient.getAccessToken(AbfsClient.java:1371)
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.executeHttpOperation(AbfsRestOperation.java:306)
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.completeExecute(AbfsRestOperation.java:238)
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.lambda$execute$0(AbfsRestOperation.java:211)
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDurationOfInvocation(IOStatisticsBinding.java:464)
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.execute(AbfsRestOperation.java:209)
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.AbfsClient.getAclStatus(AbfsClient.java:1213)
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.AbfsClient.getAclStatus(AbfsClient.java:1194)
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.getIsNamespaceEnabled(AzureBlobFileSystemStore.java:437)
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.getFileStatus(AzureBlobFileSystemStore.java:1107)
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.getFileStatus(AzureBlobFileSystem.java:901)
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.getFileStatus(AzureBlobFileSystem.java:891)
```
From my understanding, it is expecting a service principal or storage account keys to be configured.
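For reference, the explicit service-principal (OAuth) setup the ABFS connector appears to be falling back to looks roughly like this; the account name, secret scope, and tenant ID are placeholders, not values from my setup:

```python
# Hypothetical service-principal configuration for ADLS Gen2 access --
# this is what the connector expects when no passthrough token is present.
account = "mystorageaccount.dfs.core.windows.net"
spark.conf.set(f"fs.azure.account.auth.type.{account}", "OAuth")
spark.conf.set(f"fs.azure.account.oauth.provider.type.{account}",
               "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set(f"fs.azure.account.oauth2.client.id.{account}", "<application-id>")
spark.conf.set(f"fs.azure.account.oauth2.client.secret.{account}",
               dbutils.secrets.get(scope="my-scope", key="sp-client-secret"))
spark.conf.set(f"fs.azure.account.oauth2.client.endpoint.{account}",
               "https://login.microsoftonline.com/<tenant-id>/oauth2/token")
```

The whole point of credential passthrough, though, is to avoid exactly this kind of configuration.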
Steps to Reproduce (for bugs)
1. Run `dbx configure --token` to set up the link with the Databricks workspace.
2. Define the job in the `conf/deployment.yml` file (a minimal sketch follows below).
3. Run `dbx execute --cluster-id=cluster-id-with-credential-passthrough --job my-test-job`.
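A `conf/deployment.yml` along these lines is what step 2 refers to; the workflow name matches the command above, but the environment name and entrypoint path are hypothetical:

```yaml
# Minimal dbx deployment file (sketch); adjust the environment name and
# entrypoint to match the actual project layout.
environments:
  default:
    workflows:
      - name: "my-test-job"
        spark_python_task:
          python_file: "file://my_test_job/entrypoint.py"
```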
Context
I specifically want to `dbx execute` against my existing interactive cluster and not create a job cluster.
Your Environment
dbx version: 0.8.18
Databricks Runtime: 14.3 LTS (includes Apache Spark 3.5.0, Scala 2.12)