Explicitly address org policy requirements (Closes #158)

bbhuston commented 1 year ago

@markmandel

This closes #158. Once you have had an opportunity to test this locally and give it the LGTM, I can get it caught up with any future changes on the main line and merged in.

markmandel commented 1 year ago

Nice!

🤔 how does one test this? 😃 My Projects have default permissions, so I don't run into this. Maybe someone else should test this?

Also - we should link to this from somewhere in the main README.md so it's discoverable as well.

bbhuston commented 1 year ago

> Nice!
>
> 🤔 how does one test this? 😃 My Projects have default permissions, so I don't run into this. Maybe someone else should test this?
>
> Also - we should link to this from somewhere in the main README.md so it's discoverable as well.

Mark -- How did you test org policies before? If I recall correctly, in an offline chat you mentioned that you ran into API 'not ready yet' issues while testing a prior iteration of this PR.

If you can apply org policies, simply running through the troubleshooting README (in particular the terraform apply step) should allow you to confirm that the resources get created, which would be a good first-order approximation. I have also asked someone else to double-check. (@mbychkowski, would you mind checking out this PR and running through the README once?)

As you suggested, I will also be sure to push up another change that makes this discoverable in the main README's troubleshooting section.

mbychkowski commented 1 year ago

Sure! I can give this a review today

markmandel commented 1 year ago

@mbychkowski thanks! My Google project is kinda locked down at the moment, since I'm presenting at KubeCon next week 😄 and don't want to potentially break it! 😄

mbychkowski commented 1 year ago

A larger discussion might be worth having on how we want to separate concerns in Terraform infra management. One idea is to split by stages (I really like what the fast fabric team has put together), or by persona: org-admin [game publishers], security team, infra/net team, data team, AAA studio dev 1, AAA studio dev 2, etc.

I only bring this up because this org-policy work kind of touches on how we get to a "golden" GCP landing page quickly, before we can actually start doing cool stuff! Or should it always just be bundled together? 💭

mbychkowski commented 1 year ago

Sorry did not mean to close this!

markmandel commented 1 year ago

> Mark -- How did you test org policies before? If I recall correctly, in an offline chat you mentioned that you ran into API 'not ready yet' issues while testing a prior iteration of this PR.

Yeah, we've resolved those by using:

depends_on = [google_project_service.project]

Which waits on the APIs to become properly enabled before moving forward.

For example: https://github.com/googleforgames/global-multiplayer-demo/blob/b97a62feebc4c59e62e9b32f2dc6f097b004151f/infrastructure/agones-gke.tf#L46
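
As a rough illustration of that pattern (a minimal sketch, not the code in this PR): the variable names, the enabled service, and the `compute.requireShieldedVm` constraint below are assumptions chosen for the example. Only the `depends_on` line comes from the snippet above.

```hcl
# Assumed inputs; the names and the default service are illustrative only.
variable "project" { type = string }

variable "gcp_project_services" {
  type    = list(string)
  default = ["orgpolicy.googleapis.com"]
}

# Enable each required API on the project.
resource "google_project_service" "project" {
  for_each           = toset(var.gcp_project_services)
  project            = var.project
  service            = each.value
  disable_on_destroy = false
}

# Example project-scoped policy override. The constraint is a placeholder,
# not necessarily one this PR manages.
resource "google_org_policy_policy" "example" {
  name   = "projects/${var.project}/policies/compute.requireShieldedVm"
  parent = "projects/${var.project}"

  spec {
    rules {
      enforce = "FALSE"
    }
  }

  # Wait until every API above is enabled before creating the policy.
  depends_on = [google_project_service.project]
}
```

Referencing `google_project_service.project` without a key makes the policy wait on every instance created by `for_each`, not just one API.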

bbhuston commented 1 year ago

> Overall LGTM to pull in, granted it doesn't interrupt @markmandel's KubeCon talk! One point of change would be making policy_root a variable to switch between organization, folder and project, if we go outside the scope of everything in one project.
>
> Also just stating my understanding:
>
> I'm viewing this as a module that would be pulled into the main.tf. Something like this:
>
> module "org-policy" {
>   source               = "./org-policy"
>   project              = var.project_id
>   gcp_project_services = var.services
> }
>
> I like this idea.
>
> Or is it meant to be run as a bootstrap stage by itself, before the other TF infrastructure?

@mbychkowski

I'm a fan of the module approach, which is the approach we first tried a month or so ago. However, Mark (wisely) suggested that we completely split the creation of org policy resources into a separate folder/TF state file because many users would not have IAM permissions to edit org policies, so putting this in the resource creation critical path would block them from using the rest of the demo provisioning logic. Put another way, many users will just have a preconfigured project handed to them by a platform team and they need to be able to run terraform free and clear at the project-scoped resource level.

@markmandel Has your position evolved on this since we last discussed this topic?
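
For reference, the policy_root suggestion quoted above could look roughly like the following. This is a minimal sketch under stated assumptions, not what the PR implements: `policy_root_id`, the default value, and the `locals` names are hypothetical, and only the `policy_root` name comes from the review comment.

```hcl
# Level at which the org policies are applied (name taken from the review comment).
variable "policy_root" {
  description = "One of: organization, folder, project."
  type        = string
  default     = "project"

  validation {
    condition     = contains(["organization", "folder", "project"], var.policy_root)
    error_message = "The policy_root value must be one of: organization, folder, project."
  }
}

# Hypothetical companion variable: numeric organization or folder ID, or the project ID.
variable "policy_root_id" {
  description = "ID of the organization, folder or project selected by policy_root."
  type        = string
}

locals {
  # Map the chosen level to the resource-name prefix used in policy parents.
  parent_prefix = {
    organization = "organizations"
    folder       = "folders"
    project      = "projects"
  }

  # For example "projects/my-project" or "folders/1234567890".
  policy_parent = "${local.parent_prefix[var.policy_root]}/${var.policy_root_id}"
}
```

A policy resource could then use `local.policy_parent` for its parent, which keeps the project-scoped default working for users who never touch organization or folder policies.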

bbhuston commented 1 year ago

> > Mark -- How did you test org policies before? If I recall correctly, in an offline chat you mentioned that you ran into API 'not ready yet' issues while testing a prior iteration of this PR.
>
> Yeah, we've resolved those by using:
>
> depends_on = [google_project_service.project]
>
> Which waits on the APIs to become properly enabled before moving forward.
>
> For example:
>
> https://github.com/googleforgames/global-multiplayer-demo/blob/b97a62feebc4c59e62e9b32f2dc6f097b004151f/infrastructure/agones-gke.tf#L46

Understood. And we have a similar setup for the creation of org policies in this PR to solve for that.

My question was more that you had done an org-level test run in this area before, so couldn't that just be done again? But as you flagged, you have a KubeCon environment up and running, and it makes sense not to impact that.

Will just iterate with Mike to get the proper LGTM

bbhuston commented 1 year ago

> A larger discussion might be worth having on how we want to separate concerns in Terraform infra management. One idea is to split by stages (I really like what the fast fabric team has put together), or by persona: org-admin [game publishers], security team, infra/net team, data team, AAA studio dev 1, AAA studio dev 2, etc.
>
> I only bring this up because this org-policy work kind of touches on how we get to a "golden" GCP landing page quickly, before we can actually start doing cool stuff! Or should it always just be bundled together? 💭

Your ESP is dead on. We're having these discussions right now about how best to "productionize" this setup. We have triaged with both the Fabric Team and the CFT (Cloud Foundation Toolkit) team to see what org-level building blocks make the most sense for the long term. IMHO, there will probably be two flavors of this demo available in the future: 1) the 'quick and easy' project-scoped demo where all the personas are lumped together, and 2) an org-level, multiple-project, multiple-persona version that respects separation of concerns and so forth. Both approaches are valuable at different times to different audiences.

markmandel commented 1 year ago

> A larger discussion might be worth having on how we want to separate concerns in Terraform infra management. One idea is to split by stages (I really like what the fast fabric team has put together), or by persona: org-admin [game publishers], security team, infra/net team, data team, AAA studio dev 1, AAA studio dev 2, etc.
>
> I only bring this up because this org-policy work kind of touches on how we get to a "golden" GCP landing page quickly, before we can actually start doing cool stuff! Or should it always just be bundled together? 💭

This actually makes me think - this probably shouldn't live under infrastructure - it should be a top level org-policy folder.

Org policies aren't infra, they are policies -- so let's treat them as their own top level entity and document them as such. WDYT?

bbhuston commented 1 year ago

> ... it should be a top level org-policy folder.
>
> Org policies aren't infra, they are policies -- so let's treat them as their own top level entity and document them as such. WDYT?

Org policies are not conceptually different from IAM permissions or Terraform resources that stub out the activation of specific APIs (like compute) -- both of which are lumped into infrastructure today. And all of this is managed by Terraform and used in the provisioning process. Personally, I think that putting this all in infrastructure makes the most sense.

@mbychkowski What do you think?

mbychkowski commented 1 year ago

> Org policies are not conceptually different from IAM permissions or Terraform resources that stub out the activation of specific APIs (like compute) -- both of which are lumped into infrastructure today. And all of this is managed by Terraform and used in the provisioning process. Personally, I think that putting this all in infrastructure makes the most sense.

A bit of a semantic argument. By infrastructure, I think we really just mean cloud, or things managed by Terraform. Inside GCP I tend to think of things as 1) org / folder scoped, 2) project scoped, or 3) GKE scoped. To this end I tend to agree with @bbhuston. It makes sense for everything TF related to live inside a common directory, whether we call it infrastructure, cloud, or tf. Maybe not so much 3) GKE scoped, but I will just focus on 1) and 2) for this comment.

Inside this directory, I think it is valid to separate things into, at a minimum, a bootstrap phase (things that create the resource hierarchy and org policies) and an infrastructure phase (things that tend to be project scoped).

A couple of benefits here: 1) Not everyone will need to run a bootstrapping process, but it exists as a module in case it is needed, and otherwise they can focus on just the infrastructure. 2) Having these available as a flexible module lets users easily add or remove org policies in an intuitive way. 3) If we want to go further than just bootstrap and infrastructure, moving to modules gives customers more flexibility to take pieces to fill in the gaps in their infra. We can of course have a few roadmaps on how to put these modules together, whether it is all in one project or across teams in an organization. An ideal end state here is that users just point to this repository of modules to tie their own GCP gaming infrastructure together. Some ideas here: [gke-auto-game-servers, spanner, game-monitoring, game-analytics, baas, build-pipeline, virtual-workstations, identity-management, resource-management, org-policy, ...]

Less about "this is a strict architecture you adhere to with a push of a button" and more of a "here are some pieces to fill-in-the-gap in your gaming infrastructure". We of course can build out a couple of patterns for this as well for documentation.

markmandel commented 1 year ago

I was also just coming at this from a discoverability place as well - top level folders get descriptions in the README.md and are immediately visible - so it would help with onboarding and initial exploration.

You are both right in terms of terminology, I'm just thinking about it in terms of how it is used and how easy it would be to find.

markmandel commented 1 year ago

...where did we end up landing on this PR?

googleforgames / global-multiplayer-demo

Explicitly address org policy requirements (Closes #158) #171