Open minico-dev opened 12 months ago
It's possible with a PAT, for example.
It's really not a Terraform issue but a product one; we can't document every specific limitation in the Terraform docs.
The referenced solution still relies on generating a PAT. As the Microsoft documentation mentions, this is not possible for Service Principals. The PAT would need to be generated manually on a user account. The job would run as the SP in Databricks, but it would still depend on a user's PAT to check out the repository.
I agree that this is a product issue, but I think it would be a nice addition to the documentation, as it is not mentioned anywhere.
In AWS, the way we got around this is:
Create a databricks_service_principal_secret resource
provider "databricks" {
  alias         = "job_sp"
  host          = var.databricks_workspace_host
  client_id     = databricks_service_principal.job_sp.application_id
  client_secret = databricks_service_principal_secret.sp_secret.secret
  account_id    = var.databricks_account_id
}
resource "databricks_git_credential" "sp_git_credential" {
  provider = databricks.job_sp
  ....
  depends_on = [databricks_service_principal_secret.sp_secret]
}
This effectively gives the SP a git credential with access to the repo. It doesn't feel very nice, though, so ideally we'd have a way of doing this without having to call the API/provider with the SP's credentials.
@benwhelankf this sounds like an interesting workaround. Our git provider is azureDevOpsServices, and the Service Principal we want to use for Jobs is Databricks-managed. I suspect we could not have a user in Azure DevOps representing this SP, so the respective databricks_git_credential resource would not be able to authenticate successfully. What is your git provider, if I may ask, and have you created a user in it to represent the Databricks SP?
Affected Resource(s)
databricks_job databricks_git_credential
Expected Details
I have used Terraform to create a Databricks Job in my workspace. Without explicitly specifying the run_as block in the job specification, the job is run by the Service Principal that was used for creating the job through Terraform. It is also possible to explicitly specify a Service Principal for the run_as parameter. However, there seems to be no way for such an account to obtain an Azure DevOps PAT to use in its azureDevOpsServices git credentials. It can only create an Azure AD token (see Important Factoids below). This token usually has a short lifetime and will not work as a static token in git credentials, because a new token would have to be issued whenever the previous one expires. It is therefore not possible for a Service Principal to run any job that includes code sourced from an Azure DevOps Git repository. The job will fail with an error saying that it does not have permission to check out the Git repository. This limitation is not mentioned anywhere in the documentation of either the databricks_job or the databricks_git_credential resource.
List of things to potentially add/remove:
Important Factoids
This Microsoft article specifies that "Service principals can't create tokens, like personal access tokens (PATs) or SSH Keys. They can generate their own Azure AD tokens and these tokens can be used to call Azure DevOps REST APIs." (located just above the FAQ section). The same article also includes the question "Q: Can I use a service principal to do git operations, like clone a repo?", to which the answer is to generate a (short-lifetime) Azure AD token for git operations.
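For reference, the job setup described under Expected Details might look roughly like the sketch below. It is only an illustration, not the reporter's actual configuration: the job name, repository URL, branch, and notebook path are hypothetical placeholders, and the `databricks_service_principal.job_sp` resource is assumed to be defined elsewhere.

```hcl
# Illustrative sketch only; all names, URLs, and paths are placeholders.
resource "databricks_job" "git_sourced_job" {
  name = "example-git-sourced-job"

  # Run the job as a Service Principal instead of the identity that created it.
  # Assumes a databricks_service_principal.job_sp resource exists elsewhere.
  run_as {
    service_principal_name = databricks_service_principal.job_sp.application_id
  }

  # The repository is checked out at run time. This is the step that fails
  # when the run-as identity has no usable Azure DevOps git credential.
  git_source {
    url      = "https://dev.azure.com/example-org/example-project/_git/example-repo"
    provider = "azureDevOpsServices"
    branch   = "main"
  }

  task {
    task_key = "main"

    notebook_task {
      notebook_path = "notebooks/example"
      source        = "GIT"
    }

    # Compute configuration omitted for brevity.
  }
}
```

With a configuration along these lines, creating the job through Terraform succeeds. The permission error only shows up when a run tries to check out the repository.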