Closed chriscrcodes closed 4 years ago
Hi @christophecremon, the resource azurerm_databricks_workspace
is part of the AzureRM provider and not this one. I don't think there's anything we can do in this provider to fix this issue. I suggest opening an issue on the terraform-provider-azurerm repo. However, I noticed similar behaviour.
I think the issue comes from the Databricks provider: it is unable to find the Databricks workspace (a prerequisite for creating the cluster), and then creates it. A terraform plan command should not create any resource, so I think the problem comes from the Databricks provider. Can you please check the resource dependencies? Thanks!
@christophecremon Agree. I faced the same issue. It only works if the Azure Databricks workspace is provisioned in advance.
This provider doesn't contain any code that can create workspaces. You can see in your logs that terraform-provider-databricks
immediately fails because the expected workspace has not been created. It can't create a cluster because it can't create the token that it needs to authorise that request.
From my understanding, to create a cluster we must first create a Databricks workspace using the azurerm provider, and then invoke the Databricks provider to create the cluster, referencing the workspace created with the azurerm provider? What I don't understand is why the Databricks workspace (premium pricing tier) is actually created when using the terraform plan command, after the 404 error?
If we remove the databricks provider and the databricks_cluster resource from the code, terraform plan does not create the workspace. So I'm guessing the databricks provider causes the issue?
Hey, this is an issue caused by this provider; thank you for pointing it out. I identified the root cause as this API call: https://management.azure.com/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Databricks/workspaces/%s?api-version=2018-04-01. It is part of azure_auth: getWorkspaceID. Converting this to generate the ID locally will resolve the workspace-creation issue.
@christophecremon
However, it will still make your plans fail because the workspace does not exist.
So the proposed solution is:
Previous:
```go
func (a *AzureAuth) getWorkspaceID(config *service.DBApiClientConfig) error {
	log.Println("[DEBUG] Getting Workspace ID via management token.")
	// Escape all the ids
	url := fmt.Sprintf("https://management.azure.com/subscriptions/%s/resourceGroups/%s"+
		"/providers/Microsoft.Databricks/workspaces/%s?api-version=2018-04-01",
		urlParse.PathEscape(a.TokenPayload.SubscriptionID),
		urlParse.PathEscape(a.TokenPayload.ResourceGroup),
		urlParse.PathEscape(a.TokenPayload.WorkspaceName))
	payload := &WorkspaceRequest{
		Properties: &WsProps{ManagedResourceGroupID: "/subscriptions/" + a.TokenPayload.SubscriptionID + "/resourceGroups/" + a.TokenPayload.ManagedResourceGroup},
		Name:       a.TokenPayload.WorkspaceName,
		Location:   a.TokenPayload.AzureRegion,
	}
	headers := map[string]string{
		"Content-Type":  "application/json",
		"cache-control": "no-cache",
		"Authorization": "Bearer " + a.ManagementToken,
	}
	var responseMap map[string]interface{}
	resp, err := service.PerformQuery(config, http.MethodPut, url, "2.0", headers, true, true, payload, nil)
	if err != nil {
		return err
	}
	log.Println("RESPONSE DEBUG WORKSPACE ID")
	log.Println(string(resp))
	err = json.Unmarshal(resp, &responseMap)
	if err != nil {
		return err
	}
	a.AdbWorkspaceResourceID = responseMap["id"].(string)
	return nil
}
```
Edited: proposed change to a GET request rather than PUT:
```go
func (a *AzureAuth) getWorkspaceID(config *service.DBApiClientConfig) error {
	log.Println("[DEBUG] Getting Workspace ID via management token.")
	// Escape all the ids
	url := fmt.Sprintf("https://management.azure.com/subscriptions/%s/resourceGroups/%s"+
		"/providers/Microsoft.Databricks/workspaces/%s",
		urlParse.PathEscape(a.TokenPayload.SubscriptionID),
		urlParse.PathEscape(a.TokenPayload.ResourceGroup),
		urlParse.PathEscape(a.TokenPayload.WorkspaceName))
	headers := map[string]string{
		"Content-Type":  "application/json",
		"cache-control": "no-cache",
		"Authorization": "Bearer " + a.ManagementToken,
	}
	type apiVersion struct {
		ApiVersion string `url:"api-version"`
	}
	uriPayload := apiVersion{
		ApiVersion: "2018-04-01",
	}
	var responseMap map[string]interface{}
	resp, err := service.PerformQuery(config, http.MethodGet, url, "2.0", headers, false, true, uriPayload, nil)
	if err != nil {
		return err
	}
	err = json.Unmarshal(resp, &responseMap)
	if err != nil {
		return err
	}
	a.AdbWorkspaceResourceID = responseMap["id"].(string)
	return nil
}
```
This was an oversight on my part.
This should prevent the management API from creating the workspace; the call will instead just fail.
@stikkireddy Thanks! What is your advice, then, on how to automate Databricks workspace creation and cluster creation in a DevOps pipeline? We should be able to add a dependency between the resources, right? Or move the Terraform files into two different directories (one for the workspace, without the Databricks provider, and one for the cluster, referencing the created workspace and using the Databricks provider)? Any ideas?
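For reference, one way to express that dependency is to have the Databricks provider configuration reference attributes of the azurerm_databricks_workspace resource, so Terraform knows to create the workspace first. This is only a rough sketch; the exact provider argument names depend on the provider version you are running:

```hcl
resource "azurerm_databricks_workspace" "this" {
  name                = "databricks-demo-workspace"
  resource_group_name = azurerm_resource_group.this.name
  location            = azurerm_resource_group.this.location
  sku                 = "premium"
}

# Referencing the workspace's attributes makes Terraform order the
# workspace creation before anything the databricks provider manages.
provider "databricks" {
  azure_workspace_resource_id = azurerm_databricks_workspace.this.id
}
```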
@christophecremon @stuartleeks I mulled over this a bit more, in terms of what the problem is and how Terraform "configures" providers, and came up with a nifty solution using pointers and mutexes. Currently we pass a value of the client, generated by the configure function, to each resource. Rather than doing this, it may be better to pass along a pointer. This way we do not need to needlessly generate client objects per call.
This seems to work normally when testing. No attributes of the client are altered in any shape or form except for the token. There will now be a new "custom" function attribute on the DBApiClientConfig with the signature CustomAuthorizer func(*DBApiClientConfig) error.
The provider's configure function will return a pointer to the client, so all resources access the same client:
```go
var dbClient service.DBApiClient
dbClient.SetConfig(&config)
return &dbClient, nil
```
And an additional function:
```go
func (c *DBApiClientConfig) Authenticate() error {
	if c.CustomAuthorizer != nil {
		clientAuthorizerMutex.Lock()
		defer clientAuthorizerMutex.Unlock()
		if reflect.ValueOf(c.Token).IsZero() {
			return c.CustomAuthorizer(c)
		}
		return nil
	}
	return nil
}
```
Calling Authenticate in the API:
```go
// Groups returns an instance of GroupsAPI
func (c *DBApiClient) Groups() GroupsAPI {
	err := c.Config.Authenticate()
	if err != nil {
		panic(err)
	}
	return GroupsAPI{Client: c}
}
```
This is called every time the API objects are created. It allows plans to succeed even if the workspace does not exist, while still throwing runtime errors during apply if it does not, because the API-object creation invokes the authorize function at call time. The reason for the mutex is that all resource objects now refer to a single shared client pointer rather than individual copies, and we do not want race conditions when they call Authenticate in parallel.
Did a dry run and everything seems to be working.
@stikkireddy Thanks, can we test it?
@christophecremon apologies that this has been a bit slow; I was on vacation. I should have a PR for this later tomorrow (EST).
Hey @stikkireddy - we've just hit this issue in the work we're doing. If you have a branch with the PR work in we can give it a spin if that helps?
Hi there,
Terraform Version
Affected Resource(s)
Terraform Configuration Files
Debug Output
https://gist.github.com/christophecremon/f839ac7b0f342d277f0ceaba2fbab165
Expected Behavior
Terraform generates an execution plan listing the actions it will perform once the apply command is executed.
Actual Behavior
An error is generated: Error 404: The workspace with resource ID /subscriptions//resourceGroups/databricks-rg/providers/Microsoft.Databricks/workspaces/databricks-demo-workspace could not be found.
Terraform actually creates the Databricks workspace, with a premium pricing tier, even though the terraform apply command has not been executed.
Steps to Reproduce
terraform plan