GoogleCloudPlatform / pubsec-declarative-toolkit

The GCP PubSec Declarative Toolkit is a collection of declarative solutions to help you on your Journey to Google Cloud. Solutions are designed using Config Connector and deployed using Config Controller.
Apache License 2.0
feat: Shared VPC - distinct host projects BREAKING CHANGE #883

Closed alaincormier-ssc closed 2 months ago

alaincormier-ssc commented 3 months ago

BREAKING CHANGE

A spike on VPC Service Controls (VPC SC) determined that traffic leaving GKE and GCE towards Google APIs is always identified by VPC Service Controls as coming from the host project. To manage a perimeter around PBMM projects effectively, a dedicated host project is needed for that data classification. The new perimeter will then include the PBMM host and service projects.

Migration Strategy

The host project's network functionality will be split into two new projects: one for the nonp network and one for the pbmm network. DNS resources will be moved to a new dedicated project. The existing and new projects will co-exist during the transition. Once the transition is complete, the host project that is no longer used will be removed.

High-level project hierarchy of the current, transition and final states (for illustration only; project names can be ignored): [image not included]

The migration strategy will consist of four stages:

  1. Preparation: update required packages in the pubsec-declarative-toolkit repo (add new resources and add "legacy" label to resources that will be removed in a future release)
  2. Transition: deploy the "preparation" packages in the landing zone environments and then transition all network related resources
  3. Cutoff: after weeks/months (TBD), update required packages in the pubsec-declarative-toolkit repo to remove "legacy" resources
  4. Final: deploy the "cutoff" packages in the landing zone environments

The packages listed below reference the host project and/or include firewall policy rules (fwpol). The table shows which packages are affected in each stage.

| Package | "Legacy" Ver. | Prep | Transition | Cutoff | Final |
|---|---|---|---|---|---|
| t1 core-landing-zone | 0.8.0 | Yes (0.9.0) | Yes | No | No |
| t2 client-landing-zone (fwpol) | 0.6.0 | Yes (0.7.1) | Yes | Yes * (0.8.0) | Yes |
| t2 client-setup | 0.8.1 | Yes (0.8.2) | Yes | No | No |
| t2 client-project-setup | 0.5.0 | Yes * (0.6.0) | Yes | Yes * (0.7.0) | Yes |
| t2 ids | 0.2.2 | No | Yes | No | No |
| t2 gke-admin-proxy | 0.1.5 | No | Yes | No | No |
| t2 gke-setup | 0.2.5 | No | Yes | No | No |
| t3 gke-defaults | 0.2.3 | Yes (0.3.0) | Yes | No | No |
| t3 gke-cluster-autopilot (fwpol) | 0.3.0 | Yes * (0.4.1) | Yes | No | No |
| t3 examples/tier3/dns | n/a | Yes | Yes | No | No |
| t3 examples/tier3/external-load-balancer | n/a | No | Yes | No | No |
| t3 examples/tier3/firewall-policy-rules (fwpol only) | n/a | Yes | Yes | No | No |
| t3 examples/tier3/https-external-load-balancer | n/a | No | Yes | No | No |
| t3 examples/tier3/remote-access-to-gce | n/a | No | Yes | No | No |
| t4 examples/tier4/managed-instance-group | n/a | No | Yes | No | No |

*BREAKING CHANGE, the new package version will only work with the new design.
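
For planning purposes, the table can also be read as data. The following is a minimal Python sketch, assuming the Yes/No columns above are transcribed by hand; the `PACKAGES` structure and the `affected_in` helper are illustrative names and are not part of the toolkit.

```python
# Stage columns from the table above, in order.
STAGES = ("prep", "transition", "cutoff", "final")

# Hand-transcribed from the table; True means "Yes" in that column.
# This structure is hypothetical and exists only for illustration.
PACKAGES = {
    "core-landing-zone": (True, True, False, False),
    "client-landing-zone": (True, True, True, True),
    "client-setup": (True, True, False, False),
    "client-project-setup": (True, True, True, True),
    "ids": (False, True, False, False),
    "gke-admin-proxy": (False, True, False, False),
    "gke-setup": (False, True, False, False),
    "gke-defaults": (True, True, False, False),
    "gke-cluster-autopilot": (True, True, False, False),
    "examples/tier3/dns": (True, True, False, False),
    "examples/tier3/external-load-balancer": (False, True, False, False),
    "examples/tier3/firewall-policy-rules": (True, True, False, False),
    "examples/tier3/https-external-load-balancer": (False, True, False, False),
    "examples/tier3/remote-access-to-gce": (False, True, False, False),
    "examples/tier4/managed-instance-group": (False, True, False, False),
}


def affected_in(stage):
    """Return the packages affected in the given stage, in table order."""
    idx = STAGES.index(stage)
    return [name for name, flags in PACKAGES.items() if flags[idx]]


print(affected_in("cutoff"))
```

Only client-landing-zone and client-project-setup are touched again at cutoff, which matches the two packages carrying the breaking-change marker.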

1 - Preparation (pubsec)

These are the upstream package updates required to run both solutions in parallel for a smoother transition.

  1. core-landing-zone package (PR 889):
    • rename the dns-project-id setter to core-dns-project-id to avoid confusion with the upcoming new dns project in the client-landing-zone (see client-landing-zone below).
  2. client-setup package (PR 919):
    • rename the dns-project-id setter to core-dns-project-id to avoid confusion with the upcoming new dns project in the client-landing-zone (see client-landing-zone below).
  3. client-landing-zone package (PR 890) (PR 899):
    • DNS
      • add new DNS project.
      • copy/update the host-project public-dns.yaml resources into it; they will be commented out for an easier transition. DNS resources for the client will be deployed in this project instead of the host project. The same zone can't exist twice, and a zone can't be deleted if it contains recordset(s).
      • add a notice in the host-project public-dns.yaml that it will be transitioned.
    • VPC host project
      • add new nonp and pbmm host projects, keeping only appropriate subnets.
      • label resources under client-folder/standard/applications-infrastructure/host-project as "legacy" for post-transition cleanup.
    • Firewall policy rules:
      • label "standard" and "applications-infrastructure" folder firewall-policy as "legacy" for post-transition cleanup.
      • replicate the client-folder/standard/applications-infrastructure/firewall-policy resources under the pbmm AND nonp folders (adjusting resource names and references accordingly).
  4. client-project-setup package (PR 891):
    • group shared vpc resources in a separate package folder for easier service project detach/attach.
    • refactor setters to accommodate either pbmm or nonp subnet permissions and proper hierarchy placement.
    • label existing dns admin and fw admin permissions as legacy and duplicate corresponding permissions to new project/folder.
  5. gke-cluster-autopilot package (PR 893):
    • firewall policy rules in application-infrastructure-folder/firewall.yaml must reference (spec.firewallPolicyRef) the proper "classification" folder. A setter approach similar to client-project-setup could be used.
  6. gke-defaults package (PR 892):
    • clean up the setter instructions for dns-project-id
    • clean up the host-project-id setter, which doesn't appear to be used
  7. examples/tier3/dns package (PR 894):
    • update setter to use dns-project-id instead of host-project-id
  8. examples/tier3/firewall-policy-rules package (PR 894):
    • edit spec.firewallPolicyRef to point to "classification" folder

2 - Transition (deploy)

During the transition period, all existing service projects must be detached from the original host project. These steps are to update the landing zone environments. The order is important.

Important: a service project can only be detached if no resource uses the host project network. The detach and the attach must be done in separate commits because of immutable fields.
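
The two constraints in this note can be sketched as a small guard. This is a minimal Python sketch, not toolkit code: `plan_host_switch`, its parameters and the project names are hypothetical, and the list of remaining network dependents is assumed to be gathered by the operator.

```python
def plan_host_switch(service_project, network_dependents, new_host):
    """Return the ordered commits needed to move a service project.

    network_dependents: names of deployed resources that still use the
    legacy host project network. It must be empty before detaching.
    """
    if network_dependents:
        raise ValueError(
            f"{service_project}: remove {sorted(network_dependents)} before detaching"
        )
    # Detach and attach go in separate commits because of immutable fields.
    return [
        f"commit 1: detach {service_project} from the legacy host project",
        f"commit 2: attach {service_project} to {new_host}",
    ]


# Hypothetical project names, for illustration only.
for step in plan_host_switch("example-service", [], "example-pbmm-host"):
    print(step)
```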

  1. core-landing-zone package:
    • update to "prep" version
    • the dns-project-id setter is renamed to core-dns-project-id; the value must remain the same. The rendered resources should not change.
  2. client-setup package(s):
    • update to "prep" version
    • the dns-project-id setter is renamed to core-dns-project-id; the value must remain the same. The rendered resources should not change.
  3. client-landing-zone package(s):
    • update to "prep" version, paying close attention to setters.yaml changes.
    • look out for DNS setters.
      • core-dns-project-id must now reference the core dns project created in core-landing-zone
      • dns-project-id must now reference the new client dns project that will be created
    • if GKE is used, the "constraints/compute.restrictVpcPeering" will need to be uncommented for each of the new host projects:
      • client-folder/standard/applications-infrastructure/nonp/host-project/org-policies/exceptions/compute-restrict-vpc-peering-except-host-project.yaml
      • client-folder/standard/applications-infrastructure/pbmm/host-project/org-policies/exceptions/compute-restrict-vpc-peering-except-host-project.yaml
    • the gcloud PSC workaround must be applied to each of the new host projects; privilege escalation is required.
  4. client-project-setup package(s):
    • for all service projects, update the package to "prep" version, paying close attention to setters.yaml changes.
    • do NOT change the host-project-id value, this will be done in the next section.
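
The setter rename in steps 1 and 2 (core-landing-zone and client-setup) is a pure key change. A minimal Python sketch of that rename, assuming the setters are held as a plain mapping; `rename_setter` and the example values are hypothetical:

```python
def rename_setter(setters, old_key, new_key):
    """Return a copy of setters with old_key renamed to new_key, value unchanged."""
    if old_key not in setters:
        raise KeyError(old_key)
    if new_key in setters:
        raise ValueError(f"{new_key} is already set")
    renamed = {k: v for k, v in setters.items() if k != old_key}
    renamed[new_key] = setters[old_key]
    return renamed


# Hypothetical setter values, for illustration only.
legacy = {"dns-project-id": "example-core-dns", "host-project-id": "example-host"}
prep = rename_setter(legacy, "dns-project-id", "core-dns-project-id")
print(prep)
```

For client-landing-zone the situation differs: dns-project-id is kept as a setter name but takes a new meaning (the new client DNS project), so it has to be set explicitly rather than carried over by a rename.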

The remaining steps depend on what is already deployed in the landing zone; the order can be adjusted as required.

  1. ids package:
    • add package for new host project (the current package could be renamed to something like ids-legacy).
  2. gke-admin-proxy package:
    • deploy proxy to use new host project(s).
  3. migrate client DNS to the new project (this could be done during the client service project work, once all recordset(s) are removed); downtime is required:
    1. remove all tier3 DNS records.
    2. remove the DNS client zone from the legacy host project (comment out client-folder/standard/applications-infrastructure/host-project/network/public-dns.yaml).
    3. create DNS client zone in the DNS project in a separate commit to avoid resource conflict (uncomment client-folder/standard/applications-infrastructure/dns-project/public-dns.yaml)
    4. if applicable, re-add tier3 DNS records in new DNS project.
  4. for each* client service project:
    1. remove all tier2/3/4 and application packages that are using network resources (see "Transition" column of table).
    2. in client-project-setup package, detach from the legacy host project (remove/comment all resources in shared-vpc folder).
    3. in client-project-setup package, attach to the new host project, if needed. (update the host-project-id setter and re-add/uncomment all resources in shared-vpc folder).
    4. redeploy tier2/3/4 and application packages as required, using the latest package versions and pointing to the new host projects.
  5. gke-admin-proxy package:
    • remove all proxies attached to the legacy host project.
  6. ids package:
    • remove the endpoint(s) and the ids package attached to the legacy host project.

*In some cases, if the application cannot have downtime, a new service project (attached to the new host project) may be required to migrate the workload.
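
The order of the client DNS migration sub-steps follows from the two constraints stated earlier: the same zone can't exist twice, and a zone can't be deleted while it contains recordsets. The following is a toy Python model of those constraints, not a Cloud DNS client; `ToyDns`, `migrate_zone`, and the zone and project names are hypothetical.

```python
class ToyDns:
    """Toy model of the two DNS constraints; not a real Cloud DNS client."""

    def __init__(self):
        self.zones = {}  # zone name -> (project, set of record names)

    def create_zone(self, project, zone):
        if zone in self.zones:
            raise RuntimeError(f"zone {zone} already exists")
        self.zones[zone] = (project, set())

    def delete_zone(self, zone):
        if self.zones[zone][1]:
            raise RuntimeError(f"zone {zone} still contains recordsets")
        del self.zones[zone]

    def add_record(self, zone, record):
        self.zones[zone][1].add(record)

    def remove_record(self, zone, record):
        self.zones[zone][1].discard(record)


def migrate_zone(dns, zone, new_project):
    """Move a zone by following the four sub-steps in order."""
    records = set(dns.zones[zone][1])
    for record in records:              # 1. remove all tier3 DNS records
        dns.remove_record(zone, record)
    dns.delete_zone(zone)               # 2. remove the zone from the legacy host project
    dns.create_zone(new_project, zone)  # 3. create it in the DNS project (separate commit)
    for record in records:              # 4. re-add the records, if applicable
        dns.add_record(zone, record)
    return records


dns = ToyDns()
dns.create_zone("example-legacy-host", "client.example.")
dns.add_record("client.example.", "app")
migrate_zone(dns, "client.example.", "example-client-dns")
print(dns.zones["client.example."][0])
```

Swapping sub-steps 2 and 3, or skipping sub-step 1, raises an error in this model, which is why the records are unavailable between removal and re-creation.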

3 - Cut-off / Post Transition Cleanup (pubsec)

This is where applicable upstream packages are updated to only use the new host projects (remove resources labeled as "legacy").

After a certain amount of time (TBD), update the following:

  1. client-landing-zone package (PR 922):
    • remove resources labeled as "legacy".
    • uncomment .../dns-project/public-dns.yaml.
    • remove setters host-project-id, standard-nonp-cidr and standard-pbmm-cidr.
  2. client-project-setup package (PR 921):
    • remove resources labeled as "legacy".

4 - Final State (deploy)

At this stage, all service projects have migrated to the new host projects and no resources are attached to the legacy host project.

The latest version of packages will now be deployed to the landing zone environments, in order:

  1. client-project-setup package(s):
    • update the package to "cutoff" version for all service projects
  2. client-landing-zone package(s):
    • update the package to "cutoff" version
    • look out for removed setters (host-project-id, standard-nonp-cidr and standard-pbmm-cidr).
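
A quick check for leftover setters can catch the last point before deploying. A minimal Python sketch, assuming the environment's setters are available as a mapping; `leftover_setters` and the example values are hypothetical, while the three setter names come from the list above.

```python
# Setters removed in the "cutoff" version of client-landing-zone.
REMOVED_SETTERS = ("host-project-id", "standard-nonp-cidr", "standard-pbmm-cidr")


def leftover_setters(setters):
    """Return the removed setters that are still present in a setters mapping."""
    return [name for name in REMOVED_SETTERS if name in setters]


# Hypothetical setter values, for illustration only.
current = {
    "dns-project-id": "example-client-dns",
    "host-project-id": "example-legacy-host",
    "standard-pbmm-cidr": "10.0.0.0/16",
}
print(leftover_setters(current))
```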