Closed jjorissen52 closed 4 years ago
I've been noticing the same error across many different projects as of today:
For example, this config is causing this error:
Step #0 - "prepare": Error: Batch "iam-project-ci-gcloud-b081 modifyIamPolicy" for request "Create IAM Members roles/owner serviceAccount:ci-account@ci-gcloud-b081.iam.gserviceaccount.com for \"project \\\"ci-gcloud-b081\\\"\"" returned error: Error applying IAM policy for project "ci-gcloud-b081": Error setting IAM policy for project "ci-gcloud-b081": googleapi: Error 400: Policy members must be of the form "<type>:<value>"., badRequest
Step #0 - "prepare":
Step #0 - "prepare": on iam.tf line 29, in resource "google_project_iam_member" "int_test":
Step #0 - "prepare": 29: resource "google_project_iam_member" "int_test" {
Step #0 - "prepare":
The error is quite confusing, because serviceAccount:ci-account@ci-gcloud-b081.iam.gserviceaccount.com
looks valid as an IAM member to me.
I think the right fix is likely to filter out deleted principals when sending the IAM policy back.
I've been doing a bit more investigation into this (tracked in #333). I've been able to consistently reproduce it on my project, here are the debug logs.
Looking at the logs, I suspect the issue is related to deleted IAM principals. Specifically, I see that we attempt to reflect a deleted IAM principal back in the setPolicy request.
I've also done some version testing:
Right now the best workaround I can find is to pin the provider to ~> 2.12.0.
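For anyone who wants to try the pin, here is a minimal sketch of what it might look like on Terraform 0.12, the version used in this thread. The provider-block version argument is the 0.12-era syntax; newer Terraform releases expect the constraint in a required_providers block instead, so treat the exact placement as an assumption about your setup.

```hcl
# Sketch only: pin the Google provider to the 2.12.x series (Terraform 0.12 syntax).
provider "google" {
  # "~> 2.12.0" allows 2.12.x patch releases but not 2.13.0 or later.
  version = "~> 2.12.0"
}
```

After changing the constraint, terraform init needs to be re-run so the matching provider build is downloaded.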
I've got a fix for this on the way: https://github.com/GoogleCloudPlatform/magic-modules/pull/2819
As a workaround until the fix is released you can delete service account IAM members with the deleted: prefix and terraform will work as usual.
This issue is caused specifically by deleted service accounts that exist on the resource that terraform is managing members on, so removing references to them will allow terraform to work normally.
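To make the deleted: prefix concrete, here is an illustrative sketch of what such entries look like and how they could be told apart from live members in plain HCL. The member list, the local names and the uid are all made up for the example. The configuration does not read the live policy, so in practice the stale entries still have to be removed through the console or gcloud. regexall requires Terraform 0.12.7 or later.

```hcl
locals {
  # Hypothetical members as they might appear in a project IAM policy.
  policy_members = [
    "serviceAccount:ci-account@my-project.iam.gserviceaccount.com",
    # A deleted service account keeps its binding, with a "deleted:" prefix and a uid suffix.
    "deleted:serviceAccount:old-sa@my-project.iam.gserviceaccount.com?uid=123456789012345678901",
  ]

  # Keep only the members that do not start with "deleted:".
  active_members = [
    for m in local.policy_members : m
    if length(regexall("^deleted:", m)) == 0
  ]
}

output "active_members" {
  value = local.active_members
}
```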
This fix is available now in the 2.20.1 version of the provider, and will be available for 3.x in the 3.3.0 release expected next week.
@slevenick I've just attempted it after pinning v2.20.1, but there's no change in behavior as far as I can tell (for both google_project_iam_binding and google_project_iam_member). Any advice for me?
Terraform v0.12.10
+ provider.archive v1.3.0
+ provider.google v2.20.1
+ provider.local v1.4.0
+ provider.null v2.1.2
@jjorissen52 can you provide debug logs for the failing run? That will help me debug what is going on
@slevenick unfortunately, earlier today I bumped up to v3.2.0 on this project for an unrelated reason, and I am unable to downgrade again (trying to do so results in an error with terraform apply).
The 3.3.0 release is expected to go out tomorrow which has this fix. Please let me know if you encounter the same issue with that version, but I'll close this until then.
I believe this issue has been fixed with 2.20.1 as I am unable to reproduce issues at this point
Downgrading from 3.x to 2.x is going to be difficult and not recommended
I am definitely still encountering this issue with 2.20.1, ~is it possible that version does not yet include the fix?~ nvm, i checked the tag, the fix should be in there.
Error: Batch "iam-project-demo modifyIamPolicy" for request "Create IAM Members roles/stackdriver.resourceMetadata.writer serviceAccount:staging-cluster-sa@demo.iam.gserviceaccount.com for \"project \\\"demo\\\"\"" returned error: Error applying IAM policy for project "demo": Error setting IAM policy for project "demo": googleapi: Error 400: Request contains an invalid argument., badRequest
on .terraform/modules/gke_service_account/main.tf line 33, in resource "google_project_iam_member" "service_account-roles":
33: resource "google_project_iam_member" "service_account-roles" {
I also upgraded everything to 3.3.0 and I'm still seeing that issue, if I blow everything away and go back to 2.12.0 everything still seems to work
I have just tried this with version 3.4.0 and I am getting the same error, here's a code snippet:
resource "google_service_account" "cloud_sql" {
account_id = "dev-cloud-sql"
display_name = "dev-cloud-sql"
}
resource "google_project_iam_binding" "cloud_sql_iam" {
depends_on = [google_service_account.cloud_sql]
role = "roles/cloudsql.client"
members = [
"serviceAccount:${google_service_account.cloud_sql.email}"
]
}
Error output:
Error: Batch "iam-project-xxx modifyIamPolicy" for request "Set IAM Binding for role \"roles/cloudsql.client\" on \"project \\\"xxx\\\"\"" returned error: Error applying IAM policy for project "xxx": Error setting IAM policy for project "xxx": googleapi: Error 400: Request contains an invalid argument., badRequest
on ../../../modules/db_database/main.tf line 20, in resource "google_project_iam_binding" "cloud_sql_iam":
20: resource "google_project_iam_binding" "cloud_sql_iam" {
@madmaze or @lobsterdore can you include a debug log for the failed apply?
I am able to apply the config provided with 3.3.0, but a debug log would help identify the issue
@slevenick , I just upgraded to v3.4.0 and can confirm that this is still affecting me. Debug Logs
Terraform v0.12.10
+ provider.archive v1.3.0
+ provider.google v3.4.0
+ provider.local v1.4.0
+ provider.null v2.1.2
terraform apply -target=module.booklawyer.module.etl.google_project_iam_binding.sql_client
Shows same error as before:
Error: Batch "iam-project-booklawyer-dev-259701 modifyIamPolicy" for request "Set IAM Binding for role \"roles/cloudsql.client\" on \"project \\\"booklawyer-dev-259701\\\"\"" returned error: Error applying IAM policy for project "booklawyer-dev-259701": Error setting IAM policy for project "booklawyer-dev-259701": googleapi: Error 400: Request contains an invalid argument., badRequest
on ../etl/iam.tf line 12, in resource "google_project_iam_binding" "sql_client":
12: resource "google_project_iam_binding" "sql_client" {
@jjorissen52 That is odd. Can you apply the same config on a new (clean) project?
I suspect that there is something strange happening with the IAM policy for your existing project. I believe this is an unrelated issue, but it presents with the same (not very helpful) error message.
Looking at the debug log, I would guess that this is causing the failure:
2020-01-07T15:36:29.562-0600 [DEBUG] plugin.terraform-provider-google_v3.4.0_x5: {
2020-01-07T15:36:29.562-0600 [DEBUG] plugin.terraform-provider-google_v3.4.0_x5: "role": "roles/owner",
2020-01-07T15:36:29.562-0600 [DEBUG] plugin.terraform-provider-google_v3.4.0_x5: "members": [
2020-01-07T15:36:29.562-0600 [DEBUG] plugin.terraform-provider-google_v3.4.0_x5: "user:",
2020-01-07T15:36:29.562-0600 [DEBUG] plugin.terraform-provider-google_v3.4.0_x5: "user:",
2020-01-07T15:36:29.562-0600 [DEBUG] plugin.terraform-provider-google_v3.4.0_x5: "user:",
2020-01-07T15:36:29.562-0600 [DEBUG] plugin.terraform-provider-google_v3.4.0_x5: "user:"
2020-01-07T15:36:29.562-0600 [DEBUG] plugin.terraform-provider-google_v3.4.0_x5: ]
Terraform receives an IAM policy that has a series of members named user: from the API. To my eye this looks blatantly wrong, and using the iam_binding resource within terraform attempts to preserve any existing members, so it posts the same series of user: members back.
I believe that removing these faulty members will cause terraform to succeed. Could you try either using the console or gcloud to remove these members, or using a project_iam_policy which is authoritative?
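For readers unfamiliar with the authoritative option, a minimal sketch of a google_project_iam_policy setup might look like the following. The project ID, roles and member addresses are placeholders, not values from this thread. Because the resource replaces the whole project policy, any binding not listed (including your own owner access) would be removed, so this is something to try on a throwaway project first.

```hcl
# Sketch only: describe the complete desired policy for the project.
data "google_iam_policy" "project" {
  binding {
    role = "roles/owner"
    # Placeholder: keep at least one owner you control, or you can lock yourself out.
    members = ["user:owner@example.com"]
  }

  binding {
    role    = "roles/cloudsql.client"
    members = ["serviceAccount:dev-cloud-sql@my-project.iam.gserviceaccount.com"]
  }
}

# Authoritative: overwrites every existing binding on the project with the policy above.
resource "google_project_iam_policy" "project" {
  project     = "my-project"
  policy_data = data.google_iam_policy.project.policy_data
}
```

The additive google_project_iam_member and google_project_iam_binding resources read the existing policy and send it back along with the change, which is why a problematic existing member can break them even when the configuration never mentions it.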
@slevenick Apologies, I manually modified those lines so as to not publish my co-workers' email addresses. Each of those lines once contained a valid-user@valid-domain.com. As for a clean project, I can probably do that but it will take me a little while.
Ok that makes sense.
I'm back to being confused about why this is happening. This seems unrelated to the other issues around deleted: IAM members, though it started occurring at the same time. It could possibly be related to changes in the IAM API that happened around the filing date of this issue.
Were you able to successfully apply this config with versions of the provider after 2.12.0 prior to filing this issue?
What I'm trying to figure out is if this broke with the 2.13.0 release or if the combination of 2.13.0+ and the API changes that happened around Dec 6th are causing it.
@slevenick I had never attempted this particular role assignment (roles/cloudsql.client) using a resource "google_project_iam_binding" "" {} block before on any version, but I do have a project that assigns a role which currently uses provider.google v2.16.0.
resource "google_project_iam_binding" "cloudbuild-sa-user" {
project = "${google_project_services.project.project}"
role = "roles/iam.serviceAccountUser"
members = [
"serviceAccount:${local.cloud_build_sa}",
]
}
Unfortunately, I cannot tell if this is the version that was used when creating the binding or if I've since updated the version; the state history does not seem to contain information about provider versions.
I have a debug log of both v2.12.0 and v2.20.1; are there any specific parts that would be most valuable to share? I'm hesitant to share the whole log, it's full of seemingly sensitive info.
I've cleaned up two snippets, 2.12.0 & 2.20.1, which seem relevant to me. Apart from the ordering, the data sent looks exactly the same except for the etag (2.12.0 json & 2.20.1 json), and I'm not sure whether that is supposed to change.
https://gist.github.com/madmaze/ccda69be4ac861f6ac0fc15cdf9e8bf3
Two other differences seem to be in the headers:
X-Goog-Api-Client: gl-go/1.11.0 gdcl/20191007
@madmaze those are helpful logs, but they don't seem to indicate what the issue is.
The nearly identical request failing is very strange. The change in etag between the two requests is expected as it is used for locking, and should change whenever the IAM policy is updated.
I'm asking around internally to try and track down an answer on this.
How are you resetting the IAM policy when you change the provider version? Could you include the config that you are using to reproduce this?
I don't have access to the actual files right now, but here is the order of operations I performed:
google_project_iam_member block
google_project_iam_member block

I am also seeing this issue when applying iam_member with provider.google: version = "~> 3.4"
Error: Batch "iam-project-<project id> modifyIamPolicy" for request "Create IAM Members roles/storage.objectAdmin serviceAccount:<service-account-id>@<project-id>.iam.gserviceaccount.com for \"project \\\"<projet-id>\\\"\"" returned error: Error applying IAM policy for project "<project-id>": Error setting IAM policy for project "<project-id>": googleapi: Error 400: The role name must be in the form "roles/{role}", "organizations/{organization_id}/roles/{role}", or "projects/{project_id}/roles/{role}"., badRequest
role = "roles/storage.objectAdmin"
member = "serviceAccount:${module.module-name.email}"
In the debug logs, I am seeing this:
eval: *terraform.EvalMaybeTainted
@michyliao that looks like a different issue. Can you file a separate issue with debug logs included?
I'm unable to track this down by just the error message from the debug logs (invalid argument is very generic)
I'll probably need to be able to reproduce this to make further progress. @madmaze can you send me the full debug logs for a failing run? It would help to have the full request/response pair without any changes. If you don't want to post them publicly could you send them to my username @google.com
@slevenick It seems that, for the affected project, resource "google_project_iam_binding" always fails to apply. Should I update the title to more accurately describe the issue?
Just today I faced this bug and am very surprised that it has not been fixed for months. After wasting several hours I found that the member/binding functions fail when there is a user (in the project) with capital letter(s) in its ID (email). Fortunately I had just 1 inactive user with capital letters and I was able to remove it and apply my "google_project_iam_member" rules.
The error message "Error 400: Request contains an invalid argument., badRequest" is misleading. As I wrote above, the actual error is capital letters in a project user ID (in our case a user with "owner" permissions, if that makes any difference).
What's weirdest in this situation is that I can't add that user back with lower case letters. Google checks the email I provide (lower case) in its user database(s) and adds it with capital letters again.
Please fix. // Hope this message will save someone some time.
Hey @akrasnov-drv sorry that this caused issues for you.
How are you adding back the user with lower case letters? Can you give me an overview of your workflow, like are you using terraform to attempt to add this user back, but it gets sent as lowercase@mail.com and comes back as LOWERCASE@mail.com?
Hi @slevenick. User creation is not actually relevant to the case. It's just another side effect that adds trouble. I created the user in the Google console (IAM). I specified lowercase useremail@gmail.com, and Google found it, but then it added the user as UserEmail@gmail.com (likely it was initially registered that way in Gmail by the user). The terraform google provider bug is that it can't work with such "unusually formatted" emails, and produces a misleading error. I understand that the RFC defines email addresses as case insensitive. But Google keeps them case sensitive, therefore the google provider should support this too.
Hm, can you provide debug logs for the failing run? I'm unable to create a user with capital letters in their name. I have created a user with capital letters, but the IAM console only finds it as lowercase, which doesn't cause any issues.
Yes, sure. As I wrote before, Google provides the email it finds in its databases, and it keeps capital/lowercase as it's in its DB. I don't know if you can register new Google user with capital letters in email now, but it was definitely possible in the past.
Test code
resource "google_service_account" "del_me" {
account_id = "del-me"
display_name = "bug test sa"
}
resource "google_project_iam_member" "bug_test_role" {
role = "roles/compute.instanceAdmin"
member = "serviceAccount:${google_service_account.del_me.email}"
depends_on = [google_service_account.del_me]
}
Error
google_project_iam_member.bug_test_role: Creating...
Error: Batch "iam-project-my-project modifyIamPolicy" for request "Create IAM Members roles/compute.instanceAdmin serviceAccount:del-me@my-project.iam.gserviceaccount.com for \"project \\\"my-project\\\"\"" returned error: Error applying IAM policy for project "my-project": Error setting IAM policy for project "my-project": googleapi: Error 400: Request contains an invalid argument., badRequest. To debug individual requests, try disabling batching: https://www.terraform.io/docs/providers/google/guides/provider_reference.html#enable_batching
on security.tf line 19, in resource "google_project_iam_member" "bug_test_role":
19: resource "google_project_iam_member" "bug_test_role" {
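The error output above suggests disabling batching to debug individual requests. A minimal sketch of how that could be set in the provider block, following the provider reference linked in the message (the project value is a placeholder):

```hcl
# Sketch only: turn off request batching so each IAM change is sent,
# and fails, as its own request.
provider "google" {
  project = "my-project" # placeholder

  batching {
    enable_batching = false
  }
}
```

Running the apply with the TF_LOG=DEBUG environment variable set then shows the individual getIamPolicy/setIamPolicy request and response bodies, which is what the maintainers ask for in this thread.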
The log (attached, with some security related masking) is for google-beta but it fails the same way for google too.
That's very unusual. How did you create the user with capital letters, is it just an old email that existed?
And you have found that removing the user with capital letters allows you to apply the binding?
I'll ask around for why the API would be returning upper case values and if this is intended we should handle this correctly in Terraform
There are enough complaints on the Internet regarding these functions not working. I believe all (or most) of them have this issue (user(s) with upper case letter(s)). This should be handled by the terraform provider. I do not believe Google will update its user databases (or API)...
@jjorissen52 does your IAM policy have users with upper case letters?
I'm tracking down the intended behavior here, and will definitely handle this in the provider if needed
@slevenick The project does have one user with capital letters in the email, though none of bindings defined via terraform do anything with that user. Don't know if that makes a difference.
Yes, I also do nothing with the problem user. But you can see it in the debug log and it breaks the workflow (I mean just its existence). @josephlewis42 if you have an option to (temporarily) remove that user, you'll see it fixes your terraform processing.
@akrasnov-drv @slevenick That was it.
@akrasnov-drv thank you for figuring out the root cause of this issue!
I still cannot reproduce, but it seems like this is a (somewhat) common case, so I'll find a fix
Ended up here facing the same issue. I was using google_project_iam_member with the member set as
foo@xxx.iam.gserviceaccount.com
Fixed using:
serviceAccount:foo@xxx.iam.gserviceaccount.com
It's in the docs anyway.
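For completeness, a short sketch of that fix in context. The resource name, project ID and role are placeholders; the point is only that the member string carries a type prefix, which is what the "<type>:<value>" error message at the top of this issue refers to.

```hcl
resource "google_project_iam_member" "example" {
  project = "my-project"              # placeholder
  role    = "roles/logging.logWriter" # placeholder

  # Members need a "<type>:<value>" form, for example user:, serviceAccount:,
  # group: or domain:. A bare email address is rejected by the API.
  member = "serviceAccount:foo@xxx.iam.gserviceaccount.com"
}
```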
I'm still having trouble reproducing this issue, and I believe that there is something strange going on with the particular emails being used here as emails are not handled case sensitively by the API.
Can I have one of you @akrasnov-drv or @jjorissen52 send me the actual email that is causing the problems? It will help me track down what exactly about these users is causing the issue.
You can send it to my github username @google.com
Hi, Have you seen email I sent you about a week ago? Any progress?
@slevenick
I've hit the same issue today running terraform gke public module
I believe that the issue happens when attempting to add a role to a new service account (existing policy): you have to first fetch the policy, which includes the user with the capital letter, then append to it and apply it.
If you can point me to the code where this is done I can try to replicate it using the gcloud CLI, and see if it's an SDK issue or an implementation issue (usually the SDK will make fixes to it before applying it)
update:
I'm unable to replicate it on a single role already containing a CamelCase user name; maybe it's an issue with the size of the payload?
resource "google_service_account" "sa" {
account_id = "terratest"
display_name = "Terratest Service Account"
}
resource "google_project_iam_member" "log_writer" {
project = "ami-playground"
role = "roles/logging.logWriter"
member = "serviceAccount:${google_service_account.sa.email}"
}
Surprisingly I'm unable to reproduce this issue in my own project. If I add a user with a capital letter, it behaves the same way as in all of the cases described here, where Terraform lowercases any capital letters coming from the API, but in all of my cases the API accepts the lowercase version.
For example, the API will return:
"role": "roles/browser",
"members": [
"user:MyUser@gmail.com"
],
I add a binding with a different user, posting back a policy with
"role": "roles/browser",
"members": [
"user:myuser@gmail.com"
],
Which the API accepts and automatically corrects and returns MyUser in the future.
I'm trying to debug with the team internally, and may reach out to some of you for help in reproducing this for them
@slevenick Thank you for the efforts :) Try using the user I sent you by mail. In my project it breaks binding functions with 100% consistency. I added and removed it already about 5-7 times. In my project this user has "owner" rights if it changes anything.
I was just experiencing what seems like a related issue to this and #4276 and was able to solve it. Maybe this can help others in the thread.
I have a resource "google_project_iam_custom_role", a data "google_iam_policy" (not certain this is required), and a resource "google_project_iam_member". The API was returning the error googleapi: Error 400: Role roles/myCustomRole is not supported for this resource., badRequest when trying to create the google_project_iam_member.
Returned the badRequest error:
resource "google_project_iam_member" "mem" {
role = "roles/${google_project_iam_custom_role.role.role_id}"
member = "serviceAccount:${data.google_service_account.sa.email}"
}
Succeeded:
resource "google_project_iam_member" "mem" {
role = "projects/${var.project}/roles/${google_project_iam_custom_role.role.role_id}"
member = "serviceAccount:${data.google_service_account.sa.email}"
}
Yes, #4276 is related, and @danawillow has a working reproduction of this issue, so hopefully we should get it fixed soon!
I'll close this as a duplicate at this point as #4276 is the same issue
I'm going to lock this issue because it has been closed for 30 days. This helps our maintainers find and focus on the active issues.
If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error, please reach out to my human friends hashibot-feedback@hashicorp.com. Thanks!
Community Note
Terraform Version
Affected Resource(s)
Terraform Configuration Files
Debug Output
https://gist.github.com/jjorissen52/d253d274cdb763b47b55cbe3ee0f19e2
Expected Behavior
Binding should happen
Actual Behavior
Steps to Reproduce
terraform apply
Important Factoids
I have been able to use this exact resource setup to apply other roles to other service accounts.
References
Resolution here does not seem to work: