GoogleCloudPlatform / k8s-config-connector

GCP Config Connector, a Kubernetes add-on for managing GCP resources
https://cloud.google.com/config-connector/docs/overview
Apache License 2.0
886 stars 217 forks source link

Updating `targetResources` on ComputeFirewallPolicyRule does not always result in updates in GCP #1237

Open npitts0811 opened 7 months ago

npitts0811 commented 7 months ago

Checklist

Bug Description

When updating external network ref values in a ComputeFirewallPolicyRule resource's targetResources field, certain conditions result in either 1) the change not taking effect in GCP despite the k8s resource showing as UpToDate or 2) the resource enters UpdateFailed state with the following message:

Update call failed: error generating the diffs from desired state: required value "priority" could not be found

Additional Diagnostic Information

Success scenarios:

Failure scenarios:

We were not able to identify what causes the ComputeFirewallPolicyRule resource to remain in UpToDate with no changes in GCP (scenario 1 in Bug Description) vs the resource entering UpdateFailed state with the "priority" message (scenario 2 in Bug Description).

Kubernetes Cluster Version

v1.26.11-gke.1055000

Config Connector Version

1.111.0

Config Connector Mode

namespaced mode (default)

Log Output

No response

Steps to reproduce the issue

  1. Create a ComputeFirewallPolicyRule with targetResources and at least one external network path.
  2. Do one of the following after the resource is created:
    • simultaneously add a new element to the targetResources list and remove an existing one.
    • edit one of the existing lines

The resulting resource will either remain UpToDate with no changes to GCP or enter UpdateFailed with the message below:

Update call failed: error generating the diffs from desired state: required value "priority" could not be found

YAML snippets

# initial resource
---
apiVersion: compute.cnrm.cloud.google.com/v1beta1
kind: ComputeFirewallPolicyRule
metadata:
  name: example-computefirewallpolicyrule
spec:
  action: "allow"
  description: "Test changing targets"
  direction: "EGRESS"
  disabled: false
  enableLogging: true
  firewallPolicyRef:
    external: firewallPolicies/123456789098764
  match:
    layer4Configs:
    - ipProtocol: "tcp"
      ports:
        - "443"
    destFqdns:
    - "example-fqdn.me"
  priority: 100
  targetResources:
    - external: https://www.googleapis.com/compute/v1/projects/my-example-project-00/global/networks/example-net

# example 1 - remove existing targetResource while adding new ones
---
apiVersion: compute.cnrm.cloud.google.com/v1beta1
kind: ComputeFirewallPolicyRule
metadata:
  name: example-computefirewallpolicyrule
spec:
  action: "allow"
  description: "Test changing targets"
  direction: "EGRESS"
  disabled: false
  enableLogging: true
  firewallPolicyRef:
    external: firewallPolicies/123456789098764
  match:
    layer4Configs:
    - ipProtocol: "tcp"
      ports:
        - "443"
    destFqdns:
    - "example-fqdn.me"
  priority: 100
  targetResources:
    - external: https://www.googleapis.com/compute/v1/projects/my-example-project01/global/networks/example-net
    - external: https://www.googleapis.com/compute/v1/projects/my-example-project02/global/networks/example-net

# example 2 - updating an existing targetResources element (update project name in this case)
---
apiVersion: compute.cnrm.cloud.google.com/v1beta1
kind: ComputeFirewallPolicyRule
metadata:
  name: example-computefirewallpolicyrule
spec:
  action: "allow"
  description: "Test changing targets"
  direction: "EGRESS"
  disabled: false
  enableLogging: true
  firewallPolicyRef:
    external: firewallPolicies/123456789098764
  match:
    layer4Configs:
    - ipProtocol: "tcp"
      ports:
        - "443"
    destFqdns:
    - "example-fqdn.me"
  priority: 100
  targetResources:
    - external: https://www.googleapis.com/compute/v1/projects/my-example-project-05/global/networks/example-net
mikesharpton-bestbuy commented 3 months ago

We are seeing this issue once again and outside of the targetResources. We have seen this behavior happen when adding destIPRanges. We now have two rules that are in this state and we are not sure why. We tried to re-create this error and have failed to do so. We are going to open a support ticket and reference this issue.

status:
  conditions:
  - lastTransitionTime: "2024-06-13T17:47:48Z"
    message: 'Update call failed: error generating the diffs from desired state: required
      value "priority" could not be found'
    reason: UpdateFailed
    status: "False"
    type: Ready
  kind: compute#firewallPolicyRule
  observedGeneration: 2
  ruleTupleCount: 14

When I describe the object the priority is there in the spec.

spec:
  action: allow
  description: Test
  direction: EGRESS
  enableLogging: true
  firewallPolicyRef:
    name: policyname
  match:
    destIPRanges:
    - 1.2.3.5/32
    - 1.2.3.4/32
    layer4Configs:
    - ipProtocol: tcp
      ports:
      - "443"
  priority: 201001

The rule did update as all we did was add IP's to the list. The rule is now in a state that it likely cannot be recovered from. We have left the objects in place in case support may want to see them.

We are now on this version of Config Connector. 1.114.1

cheftako commented 3 months ago

The sample yaml references a FirewallPolicy (firewallPolicyRef: external: firewallPolicies/123456789098764) but that yaml is not provided? Would it be possible to get a copy of a sample yaml for that?

mikesharpton-bestbuy commented 3 months ago

Hello,

Here is a scrubbed example of a firewallpolicy that the broken rules are in.

apiVersion: compute.cnrm.cloud.google.com/v1beta1
kind: ComputeFirewallPolicy
metadata:
  annotations:
    cnrm.cloud.google.com/management-conflict-prevention-policy: none
    cnrm.cloud.google.com/state-into-spec: merge
  name: production-01
  namespace: coolnamespace
spec:
  description: An organizational level policy to associate to all production folders.
    Define rules in p01 branch of a cool repo.
  organizationRef:
    external: organizations/111111111
  shortName: nottherealname