Closed GertVil closed 2 years ago
Hi @GertVil Thank you for reporting this issue. I am investigating the issue and I will report back here when I figure out what is going on.
@GertVil I think I see the problem. This is definitely a problem with the provider. I'll spend some time debugging this. v2.5.0 introduced a rather tricky mechanism for keeping track of server state in a generic way, and unfortunately it is failing for your use case, so I will likely need to make some adjustments.
For now, v2.4.0 has all of the latest features aside from the aforementioned changes.
Please stay tuned while I find a solution to this problem.
@GertVil if you have time, can you upgrade to v2.5.0 and tell me what the output of the query_response_input_key_map property is on one of your resources? This will help me make sure I have a proper fix for your issue.
Can you also tell me which terraform version you are using?
(edit): I believe the issue has to do with differences in how values are encoded between normal string inputs and those that come from another resource's output. The error message:
was cty.StringVal("gid://gitlab/Clusters::Agent/5754"), but now cty.StringVal("gid://gitlab/Clusters::Agent/5758")
indicates that the output value changed somehow, but I am not quite sure how at this level.
@GertVil Apologies for the flood of messages as I discover more about this issue.
For v2.5.0 you have to ensure that your read_query also returns the values that you set in mutation_variables. The reason you see the cluster_agent project path variable disappear is that you do not return it in your read query for cluster_agent:
https://gitlab.com/nagyv-gitlab/kubernetes-agent-terraform-module/-/blob/main/register-agent.tf#L54
Please add the path variable there and try again (along with answering the other questions I've asked haha, thanks for working with me 😄)
@GertVil I have not heard back from you on this issue. I plan to close the issue by the end of this week due to inactivity. Please let me know if this is still an issue for you. I want to make this provider as reliable as possible, and work for as many use-cases as possible. :)
So sorry for the late reply: Terraform version: 0.14.8
+ agent_token_query_response_input_key_map = {
+ "agent_id" = "project.clusterAgent.id"
+ "token_description" = ""
+ "token_name" = ""
}
+ cluster_agent_query_response_input_key_map = {
+ "agent_name" = "project.clusterAgent.name"
+ "project_path" = ""
}
I'm not 100% sure how to do what you're asking with adding the project_path variable. Could you help more there?
@GertVil sure. So on the line of code I linked, you just need to add it to the graphql response like this:
read_query = <<EOT
query getAgent($agent_name: String!, $project_path: ID!) {
  project(fullPath: $project_path) {
    path
    clusterAgent(name: $agent_name) {
      id
      name
    }
  }
}
EOT
Notice how I now also return the path property in the read_query.
NOTE: I'm not actually sure what the structure of the response from the API is, so that may not be exact. You just need to make sure that, for anything you set in your mutation_variables, the read_query's response includes the value of that mutation variable somewhere.
As a simpler example, say I have mutation variables that look like this:
mutation_variables = {
  "name"                = "someValueForName"
  "some_other_property" = "someValueForThisProperty"
}
Then I would need to ensure my read query asks for the properties that would return the respective values for each of those mutation_variables. For example:
read_query = <<EOT
query getUserInfo($id: ID!) {
  user(id: $id) {
    name
    someOtherProperty
  }
}
EOT
👆🏼 Then, the provider will essentially fuzzy-find the values in your read query's response that match your mutation variables. So, for example, if it finds a value of someValueForThisProperty, it will know that the someOtherProperty property maps to the mutation variable some_other_property. I know this is slightly nuanced, but it's the only way to ensure the provider can keep track of drift between remote state and its own state.
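The matching described above can be sketched roughly as follows. This is a minimal Python illustration, not the provider's actual Go implementation: the function names `walk` and `build_key_map` are hypothetical, and the dotted path format is only assumed from the `query_response_input_key_map` output shown earlier in this thread (for example `project.clusterAgent.name`). A variable whose value is absent from the response maps to an empty string, which is how the `"project_path" = ""` entries above would arise.

```python
from typing import Any, Dict, Iterator, Tuple


def walk(node: Any, path: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (dotted_path, leaf_value) for every leaf of a nested dict."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from walk(child, f"{path}.{key}" if path else key)
    else:
        yield path, node


def build_key_map(mutation_variables: Dict[str, str],
                  response: Dict[str, Any]) -> Dict[str, str]:
    """Map each mutation variable to the first response path holding its value.

    Hypothetical helper: a variable with no matching value maps to "".
    """
    leaves = list(walk(response))
    return {
        name: next((p for p, v in leaves if v == value), "")
        for name, value in mutation_variables.items()
    }


# Example data taken from the simpler example in this comment.
example_variables = {
    "name": "someValueForName",
    "some_other_property": "someValueForThisProperty",
}
example_response = {
    "user": {
        "name": "someValueForName",
        "someOtherProperty": "someValueForThisProperty",
    }
}

if __name__ == "__main__":
    print(build_key_map(example_variables, example_response))
```

Because the match is made on values rather than on key names, the read query can name its fields however the API requires, as long as each value set in mutation_variables appears somewhere in the response.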
Does this all make sense? If not, I'm always open to doing a Zoom chat to walk through it, and/or Discord etc.
Thanks for the elaboration! I managed to make it work a bit more by renaming the references to project_path to fullPath (otherwise the fuzzy matching wouldn't find it), but then the issue continues with
~ mutation_variables = {
~ "token_description" = "" -> "Token for KAS Agent Authentication"
~ "token_name" = "" -> "kas-token"
# (1 unchanged element hidden)
}
I did add the description and name like so:
read_query = <<EOT
query getToken($agent_name: String!, $fullPath: ID!) {
  project(fullPath: $fullPath) {
    name
    fullPath
    id
    clusterAgent(name: $agent_name) {
      id
      tokens {
        edges {
          node {
            id
            description
            name
          }
        }
      }
    }
  }
}
EOT
but I'm guessing the fuzzy matching doesn't work here either. I tried the same trick of renaming the fields, but it still doesn't match:
~ resource "graphql_mutation" "agent_token" {
id = "1028429438"
~ mutation_variables = {
~ "description" = "" -> "Token for KAS Agent Authentication"
~ "name" = "" -> "kas-token"
# (1 unchanged element hidden)
}
# (13 unchanged attributes hidden)
}
Maybe there's a way we could provide hints so the matching could work?
@GertVil Is there any way you can show me what the actual GraphQL read query responds with? (My example use of project_path was mostly just a guess, since I don't know the structure of the API response.)
I took a look over the GitLab API, and I now realize that tokens[edges[nodes[]]] is very deeply nested in the response.
It seems you are ultimately trying to get the name and description for a single node, is that correct? If so, I am wondering if you are able to use the ClusterAgentEdge query instead, which returns a single node of type ClusterAgentToken.
I need to enhance the logic in the provider's controller so it can find values in a deeply nested list of items (unfortunately, some GQL APIs only return lists and do not allow single-item queries). I will get to work on this enhancement, but in the meantime, if it's possible, can you attempt to use a query that returns a single ClusterAgent, as opposed to a nested list of cluster agents?
@GertVil Great news, I was actually able to whip up a fix for this that should work for your current solution. I will need time to do a PR, write some more automated tests, and verify the solution. Once I have done all of that, I will cut a new release that should have you working on the latest version.
Stay tuned.
A fix/enhancement PR was merged today. v2.5.1 will be automatically released within the next half hour, which should fix the issue presented here by ensuring that data values nested in array objects in the query response are properly represented by the server-state reconciler.
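To illustrate what "values nested in array objects" means for the token query discussed above, here is a rough Python sketch of a traversal that also descends into lists. It is an assumption-laden illustration, not the code from the merged PR: the names `walk_with_lists` and `find_path` are hypothetical, and the `[index]` path notation is invented here for readability. The sample response mirrors the shape of the getToken query in this thread, with a placeholder token ID.

```python
from typing import Any, Iterator, Tuple


def walk_with_lists(node: Any, path: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (path, leaf_value) pairs, descending into dicts and lists."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from walk_with_lists(child, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        # Without this branch, a list would be treated as a single opaque leaf
        # and values inside tokens.edges would never be matched.
        for index, child in enumerate(node):
            yield from walk_with_lists(child, f"{path}[{index}]")
    else:
        yield path, node


def find_path(response: Any, wanted: str) -> str:
    """Return the first path whose leaf equals `wanted`, or "" if none does."""
    return next((p for p, v in walk_with_lists(response) if v == wanted), "")


# Response shaped like the getToken read query; the token id is a placeholder.
token_response = {
    "project": {
        "clusterAgent": {
            "id": "gid://gitlab/Clusters::Agent/8956",
            "tokens": {
                "edges": [
                    {
                        "node": {
                            "id": "placeholder-token-id",
                            "name": "kas-token",
                            "description": "Token for KAS Agent Authentication",
                        }
                    }
                ]
            },
        }
    }
}

if __name__ == "__main__":
    print(find_path(token_response, "kas-token"))
```

With a dict-only traversal, `token_name` and `token_description` would stay unmatched, which is consistent with the `"" -> "kas-token"` diffs reported earlier.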
@GertVil Please upgrade to v2.5.1 and let me know if you experience any issues. I'm happy to help.
Hi @sullivtr , we still have the same behavior with v2.5.1
Installing hashicorp/kubernetes v2.8.0...
- Installed hashicorp/kubernetes v2.8.0 (signed by HashiCorp)
- Installing hashicorp/helm v2.4.1...
- Installed hashicorp/helm v2.4.1 (signed by HashiCorp)
- Installing sullivtr/graphql v2.5.1...
- Installed sullivtr/graphql v2.5.1 (self-signed, key ID 271D1F8E7DF91453)
Partner and community providers are signed by their developers.
If you'd like to know more about provider signing, you can read about it here:
https://www.terraform.io/docs/cli/plugins/signing.html
Terraform has created a lock file .terraform.lock.hcl to record the provider
selections it made above. Include this file in your version control repository
so that Terraform can guarantee to make the same selections by default when
you run "terraform init" in the future.
Terraform has been successfully initialized!
graphql_mutation.cluster_agent: Modifying... [id=704964686]
helm_release.rabbitmq: Destroying... [id=cluster-rabbitmq]
graphql_mutation.cluster_agent: Modifications complete after 0s [id=704964686]
helm_release.rabbitmq: Destruction complete after 2s
helm_release.rabbitmq: Creating...
helm_release.rabbitmq: Still creating... [10s elapsed]
helm_release.rabbitmq: Still creating... [20s elapsed]
helm_release.rabbitmq: Still creating... [30s elapsed]
helm_release.rabbitmq: Still creating... [40s elapsed]
helm_release.rabbitmq: Creation complete after 41s [id=cluster-rabbitmq]
╷
│ Error: Provider produced inconsistent final plan
│
│ When expanding the plan for graphql_mutation.agent_token to include new
│ values learned so far during apply, provider
│ "registry.terraform.io/sullivtr/graphql" produced an invalid new value for
│ .mutation_variables["agent_id"]: was
│ cty.StringVal("gid://gitlab/Clusters::Agent/8956"), but now
│ cty.StringVal("gid://gitlab/Clusters::Agent/8961").
│
│ This is a bug in the provider, which should be reported in the provider's
│ own issue tracker.
╵
This is an extract from the plan:
Resource actions are indicated with the following symbols:
~ update in-place
-/+ destroy and then create replacement
Terraform will perform the following actions:
# graphql_mutation.agent_token will be updated in-place
~ resource "graphql_mutation" "agent_token" {
id = "3886449497"
~ mutation_variables = {
~ "agent_id" = "" -> "gid://gitlab/Clusters::Agent/8956"
~ "token_description" = "" -> "Token for KAS Agent Authentication"
~ "token_name" = "" -> "kas-token"
}
# (13 unchanged attributes hidden)
}
# graphql_mutation.cluster_agent will be updated in-place
~ resource "graphql_mutation" "cluster_agent" {
id = "704964686"
~ mutation_variables = {
~ "project_path" = "" -> "my_project/terraform-example"
# (1 unchanged element hidden)
}
# (13 unchanged attributes hidden)
}
In our case, the definitions are based on https://gitlab.com/gitops-demo/infra/kas-gcp/-/blob/master/kas-agent.tf
@nadsat Can you send me documentation for the clusterAgent() query that is being made in this block here: https://gitlab.com/gitops-demo/infra/kas-gcp/-/blob/master/gitlab-agent.tf#L89
I'm having trouble finding it in the GitLab documentation. I need to gain a better understanding of what that is returning exactly.
With that said, as I mentioned before, it's important to ensure that the read_query for each mutation returns the values set by mutation_variables somewhere in its response.
For example, based on the setup of the cluster_agent resource, your read query for that resource should look like the following:
read_query = <<EOT
query getAgent($agent_name: String!, $project_path: ID!) {
  project(fullPath: $project_path) {
    # Notice I am adding fullPath here: since your mutation variables set this, I want to include it in the query response.
    fullPath
    clusterAgent(name: $agent_name) {
      id
      # You already include the name here, so this is already being populated. Notice your plan does not show `mutation_variables[agent_name]` being modified.
      name
    }
  }
}
EOT
And the read query for the agent_token mutation should look like this:
read_query = <<EOT
query getToken($agent_name: String!, $project_path: ID!) {
  project(fullPath: $project_path) {
    name
    id
    clusterAgent(name: $agent_name) {
      id
      tokens {
        edges {
          node {
            id
            # Notice I am adding the name and description to the response here.
            name
            description
          }
        }
      }
    }
  }
}
EOT
As for the agent_id, it's not clear to me yet why it is not found in the query response. Just know that when you see a diff that looks like "agent_id" = "" -> "gid://gitlab/Clusters::Agent/8956", it ultimately means that the provider did not find a value representing the agent_id in the query response. The weird thing is, the GitLab documentation only seems to show a clusterAgents property being returned from Query.Projects, and I can't find any specific clusterAgent query in their documentation. If you know where it is and can share it, it will help me debug the underlying issue a little more easily.
@sullivtr I'm not sure if this is the documentation you asked for https://docs.gitlab.com/14.8/ee/api/graphql/reference/index.html#projectclusteragent
@nadsat Is it possible that the cluster agent's ID changes at any time? Based on the logs you sent above, it appears that the cluster agent ID actually changed somehow between the time of the preview and the time of the apply. Meaning, the provider queried the cluster agent and saw gid://gitlab/Clusters::Agent/8956, but then at apply time the agent ID had become gid://gitlab/Clusters::Agent/8961, which is ultimately what is confusing the provider.
I am curious to see what your plan shows if you run it again. If it shows ~ "agent_id" = "" -> "gid://gitlab/Clusters::Agent/8961" during the preview, that tells me that the ID did in fact change during the cluster_agent resource's update.
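The "Provider produced inconsistent final plan" error above is Terraform reporting that a value it had treated as known at plan time came back different during apply. A rough way to picture that comparison, using the two agent IDs from the log, is the following Python sketch. The function name `changed_after_plan` is hypothetical and this is not how Terraform or the provider implements the check; it only illustrates the comparison being described.

```python
from typing import Dict, Tuple


def changed_after_plan(planned: Dict[str, str],
                       final: Dict[str, str]) -> Dict[str, Tuple[str, str]]:
    """Return {key: (planned_value, final_value)} for values that differ at apply time."""
    return {
        key: (value, final.get(key, ""))
        for key, value in planned.items()
        if final.get(key, "") != value
    }


# Values taken from the plan extract and the error message in this thread.
planned_variables = {
    "agent_id": "gid://gitlab/Clusters::Agent/8956",
    "token_name": "kas-token",
}
final_variables = {
    "agent_id": "gid://gitlab/Clusters::Agent/8961",
    "token_name": "kas-token",
}

if __name__ == "__main__":
    for key, (before, after) in changed_after_plan(planned_variables, final_variables).items():
        print(f"{key}: was {before}, but now {after}")
```

If the update_mutation for cluster_agent deletes and recreates the agent, as the update mutation shown in this thread does, a new ID at apply time would be the expected outcome, which fits the question asked above.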
We recreated the cluster from scratch, ran the plan twice, and it worked.
@sullivtr If it helps, this was part of the plan the first time:
# graphql_mutation.agent_token will be created
+ resource "graphql_mutation" "agent_token" {
+ compute_from_create = true
+ compute_mutation_keys = {
+ "secret" = "clusterAgentTokenCreate.secret"
+ "token_id" = "clusterAgentTokenCreate.token.id"
}
+ computed_delete_operation_variables = (known after apply)
+ computed_read_operation_variables = (known after apply)
+ computed_update_operation_variables = (known after apply)
+ create_mutation = <<-EOT
mutation createToken($agent_id: ClustersAgentID!, $token_name: String!, $token_description: String!) {
clusterAgentTokenCreate(input: {clusterAgentId: $agent_id, description: $token_description, name: $token_name}) {
secret
token {
createdAt
id
}
errors
}
}
EOT
+ delete_mutation = <<-EOT
mutation deleteToken($token_id: ClustersAgentTokenID!) {
clusterAgentTokenDelete(input: {id: $token_id}) {
errors
}
}
EOT
+ existing_hash = (known after apply)
+ id = (known after apply)
+ mutation_variables = (known after apply)
+ query_response = (known after apply)
+ query_response_input_key_map = (known after apply)
+ read_query = <<-EOT
query getToken($agent_name: String!, $project_path: ID!) {
project(fullPath: $project_path) {
name
id
clusterAgent(name: $agent_name) {
id
tokens {
edges {
node {
id
name
description
}
}
}
}
}
}
EOT
+ read_query_variables = {
+ "agent_name" = "gke-agent"
+ "project_path" = "xxxx/xxxx/terraform/terraform-example"
}
+ update_mutation = <<-EOT
mutation updateToken($token_id: ClustersAgentTokenID!, $agent_id: ClustersAgentID!, $token_name: String!, $token_description: String!) {
clusterAgentTokenDelete(input: {id: $token_id}) {
errors
}
clusterAgentTokenCreate(input: {clusterAgentId: $agent_id, description: $token_description, name: $token_name}) {
secret
token {
createdAt
id
}
errors
}
}
EOT
}
# graphql_mutation.cluster_agent will be created
+ resource "graphql_mutation" "cluster_agent" {
+ compute_from_create = false
+ compute_mutation_keys = {
+ "id" = "project.clusterAgent.id"
+ "name" = "project.clusterAgent.name"
}
+ computed_delete_operation_variables = (known after apply)
+ computed_read_operation_variables = (known after apply)
+ computed_update_operation_variables = (known after apply)
+ create_mutation = <<-EOT
mutation createAgent($project_path: ID!, $agent_name: String!) {
createClusterAgent(input: { projectPath: $project_path, name: $agent_name }) {
clusterAgent {
id
name
}
errors
}
}
EOT
+ delete_mutation = <<-EOT
mutation deleteAgent($id: ID!) {
clusterAgentDelete(input: {id: $id}) {
errors
}
}
EOT
+ existing_hash = (known after apply)
+ id = (known after apply)
+ mutation_variables = {
+ "agent_name" = "gke-agent"
+ "project_path" = "xxxx/xxxx/terraform/terraform-example"
}
+ query_response = (known after apply)
+ query_response_input_key_map = (known after apply)
+ read_query = <<-EOT
query getAgent($agent_name: String!, $project_path: ID!) {
project(fullPath: $project_path) {
fullPath
clusterAgent(name: $agent_name) {
id
name
}
}
}
EOT
+ read_query_variables = {
+ "agent_name" = "gke-agent"
+ "project_path" = "xxxx/xxxx/terraform/terraform-example"
}
+ update_mutation = <<-EOT
mutation updateAgent($id: ID!, $project_path: ID!, $agent_name: String!) {
clusterAgentDelete(input: {id: $id}) {
errors
}
createClusterAgent(input: { projectPath: $project_path, name: $agent_name }) {
clusterAgent {
id
name
}
errors
}
}
EOT
}
and the second time
# kubernetes_cluster_role.gitlab-agent-read has changed
~ resource "kubernetes_cluster_role" "gitlab-agent-read" {
id = "gitlab-agent-read"
~ metadata {
+ annotations = {}
+ labels = {}
name = "gitlab-agent-read"
# (3 unchanged attributes hidden)
}
~ rule {
+ non_resource_urls = []
+ resource_names = []
# (3 unchanged attributes hidden)
}
}
# kubernetes_cluster_role.gitlab-agent-write has changed
~ resource "kubernetes_cluster_role" "gitlab-agent-write" {
id = "gitlab-agent-write"
~ metadata {
+ annotations = {}
+ labels = {}
name = "gitlab-agent-write"
# (3 unchanged attributes hidden)
}
~ rule {
+ non_resource_urls = []
+ resource_names = []
# (3 unchanged attributes hidden)
}
}
# kubernetes_cluster_role_binding.gitlab-admin has changed
~ resource "kubernetes_cluster_role_binding" "gitlab-admin" {
id = "gitlab-admin"
~ metadata {
+ annotations = {}
+ labels = {}
name = "gitlab-admin"
# (3 unchanged attributes hidden)
}
# (2 unchanged blocks hidden)
}
# kubernetes_cluster_role_binding.gitlab-agent-cluster-admin has changed
~ resource "kubernetes_cluster_role_binding" "gitlab-agent-cluster-admin" {
id = "gitlab-agent-cluster-admin"
~ metadata {
+ annotations = {}
+ labels = {}
name = "gitlab-agent-cluster-admin"
# (3 unchanged attributes hidden)
}
# (2 unchanged blocks hidden)
}
# kubernetes_cluster_role_binding.gitlab-agent-read-binding has changed
~ resource "kubernetes_cluster_role_binding" "gitlab-agent-read-binding" {
id = "gitlab-agent-read-binding"
~ metadata {
+ annotations = {}
+ labels = {}
name = "gitlab-agent-read-binding"
# (3 unchanged attributes hidden)
}
# (2 unchanged blocks hidden)
}
# kubernetes_cluster_role_binding.gitlab-agent-write-binding has changed
~ resource "kubernetes_cluster_role_binding" "gitlab-agent-write-binding" {
id = "gitlab-agent-write-binding"
~ metadata {
+ annotations = {}
+ labels = {}
name = "gitlab-agent-write-binding"
# (3 unchanged attributes hidden)
}
# (2 unchanged blocks hidden)
}
# kubernetes_deployment.gitlab-agent has changed
~ resource "kubernetes_deployment" "gitlab-agent" {
id = "gitlab-agent/gitlab-agent"
# (1 unchanged attribute hidden)
~ metadata {
+ annotations = {}
+ labels = {}
name = "gitlab-agent"
~ resource_version = "2998" -> "4525"
# (3 unchanged attributes hidden)
}
~ spec {
# (5 unchanged attributes hidden)
~ template {
~ metadata {
+ annotations = {}
# (2 unchanged attributes hidden)
}
~ spec {
+ node_selector = {}
# (11 unchanged attributes hidden)
~ container {
+ command = []
name = "agent"
# (8 unchanged attributes hidden)
# (2 unchanged blocks hidden)
}
# (1 unchanged block hidden)
}
}
# (2 unchanged blocks hidden)
}
}
# kubernetes_namespace.gitlab-agent has changed
~ resource "kubernetes_namespace" "gitlab-agent" {
id = "gitlab-agent"
~ metadata {
+ labels = {}
name = "gitlab-agent"
# (4 unchanged attributes hidden)
}
}
# kubernetes_namespace.gitops-apps has changed
~ resource "kubernetes_namespace" "gitops-apps" {
id = "gitops-apps"
~ metadata {
+ labels = {}
name = "gitops-apps"
# (4 unchanged attributes hidden)
}
}
# kubernetes_secret.gitlab-agent-token has changed
~ resource "kubernetes_secret" "gitlab-agent-token" {
id = "gitlab-agent/gitlab-agent-token"
# (3 unchanged attributes hidden)
~ metadata {
+ annotations = {}
+ labels = {}
name = "gitlab-agent-token"
# (4 unchanged attributes hidden)
}
}
# kubernetes_service_account.gitlab-admin has changed
~ resource "kubernetes_service_account" "gitlab-admin" {
id = "kube-system/gitlab-admin"
# (2 unchanged attributes hidden)
~ metadata {
+ annotations = {}
+ labels = {}
name = "gitlab-admin"
# (4 unchanged attributes hidden)
}
}
# kubernetes_service_account.gitlab-agent has changed
~ resource "kubernetes_service_account" "gitlab-agent" {
id = "gitlab-agent/gitlab-agent"
# (2 unchanged attributes hidden)
~ metadata {
+ annotations = {}
+ labels = {}
name = "gitlab-agent"
# (4 unchanged attributes hidden)
}
# (1 unchanged block hidden)
}
It seems our problem might have been related to the Terraform cache.
@sullivtr Thanks for your help.
Interesting 🤔 @nadsat please let me know if you experience more related issues. Feel free to reopen this issue if that occurs.
I've been using the following Terraform module: https://gitlab.com/nagyv-gitlab/kubernetes-agent-terraform-module. This has been working fine with 2.4.0 of the graphql provider; however, when I upgraded to 2.5.0, the plan / apply showed the following changes (this also fails on a "clean" apply starting from 2.5.0, so without upgrading):
When applying these the following error occurs:
It also looks like the module.gitlab_kubernetes_agent.graphql_mutation.agent_token has been deleted on the GitLab side. So I think the change first happens in module.gitlab_kubernetes_agent.graphql_mutation.cluster_agent because the project path variable is empty somehow, and that then triggers the deletion and recreation of the token. Maybe this is related to the https://github.com/sullivtr/terraform-provider-graphql/pull/62 code change? For now, I can revert to using 2.4.0, but I'm wondering if this is an error in the graphql provider or if it's just incorrect usage of it.