ovn-org / ovn-kubernetes

A robust Kubernetes networking platform
https://ovn-kubernetes.io/
Apache License 2.0
773 stars 334 forks source link

[ci-flake] ACL Logging for AdminNetworkPolicy and BaselineAdminNetworkPolicy #4242

Closed tssurya closed 3 months ago

tssurya commented 3 months ago

@trozet saw flake here: https://github.com/ovn-org/ovn-kubernetes/pull/4241#issuecomment-2021315608

tssurya commented 3 months ago

Run1:

2024-03-26T17:40:59.7812274Z ACL Logging for AdminNetworkPolicy and BaselineAdminNetworkPolicy the ANP ACL logs have the expected log level
2024-03-26T17:40:59.7813943Z /home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/acl_logging.go:222
2024-03-26T17:40:59.7815332Z   STEP: Creating a kubernetes client @ 03/26/24 17:40:59.779
2024-03-26T17:40:59.7816385Z   Mar 26 17:40:59.779: INFO: >>> kubeConfig: /home/runner/ovn.conf
2024-03-26T17:40:59.7817946Z   STEP: Building a namespace api object, basename anp-subject @ 03/26/24 17:40:59.78
2024-03-26T17:40:59.7915528Z   Mar 26 17:40:59.786: INFO: Skipping waiting for service account
2024-03-26T17:40:59.7917048Z   STEP: creating an admin network policy @ 03/26/24 17:40:59.786
2024-03-26T17:40:59.7919388Z   Mar 26 17:40:59.787: INFO: Running '/usr/local/bin/kubectl --server=https://[::1]:36319 --kubeconfig=/home/runner/ovn.conf --namespace=default create -f anp.yaml'
2024-03-26T17:40:59.9114909Z   Mar 26 17:40:59.908: INFO: stderr: ""
2024-03-26T17:40:59.9116554Z   Mar 26 17:40:59.908: INFO: stdout: "adminnetworkpolicy.policy.networking.k8s.io/harry-potter created\n"
2024-03-26T17:40:59.9118324Z   STEP: configuring the ACL logging level for the ANP @ 03/26/24 17:40:59.908
2024-03-26T17:40:59.9122059Z   Mar 26 17:40:59.908: INFO: Running '/usr/local/bin/kubectl --server=https://[::1]:36319 --kubeconfig=/home/runner/ovn.conf --namespace=default annotate anp harry-potter k8s.ovn.org/acl-logging={ "deny": "alert", "allow": "notice", "pass": "warning" } --overwrite=true'
2024-03-26T17:40:59.9776074Z   Mar 26 17:40:59.977: INFO: stderr: ""
2024-03-26T17:40:59.9777720Z   Mar 26 17:40:59.977: INFO: stdout: "adminnetworkpolicy.policy.networking.k8s.io/harry-potter annotated\n"
2024-03-26T17:40:59.9779704Z   STEP: creating peer namespaces that are selected by the admin network policy @ 03/26/24 17:40:59.977
2024-03-26T17:40:59.9782466Z   Mar 26 17:40:59.977: INFO: Running '/usr/local/bin/kubectl --server=https://[::1]:36319 --kubeconfig=/home/runner/ovn.conf --namespace=default create ns anp-peer-restricted'
2024-03-26T17:41:00.0342146Z   Mar 26 17:41:00.033: INFO: stderr: ""
2024-03-26T17:41:00.0343493Z   Mar 26 17:41:00.033: INFO: stdout: "namespace/anp-peer-restricted created\n"
2024-03-26T17:41:00.0345848Z   Mar 26 17:41:00.033: INFO: Running '/usr/local/bin/kubectl --server=https://[::1]:36319 --kubeconfig=/home/runner/ovn.conf --namespace=default create ns anp-peer-open'
2024-03-26T17:41:00.1184926Z   Mar 26 17:41:00.118: INFO: stderr: ""
2024-03-26T17:41:00.1186805Z   Mar 26 17:41:00.118: INFO: stdout: "namespace/anp-peer-open created\n"
2024-03-26T17:41:00.1189279Z   Mar 26 17:41:00.118: INFO: Running '/usr/local/bin/kubectl --server=https://[::1]:36319 --kubeconfig=/home/runner/ovn.conf --namespace=default create ns anp-peer-unknown'
2024-03-26T17:41:00.1779665Z   Mar 26 17:41:00.177: INFO: stderr: ""
2024-03-26T17:41:00.1781815Z   Mar 26 17:41:00.177: INFO: stdout: "namespace/anp-peer-unknown created\n"
2024-03-26T17:41:00.1783994Z   STEP: creating pods in subject and peer namespaces @ 03/26/24 17:41:00.177
2024-03-26T17:41:00.1940303Z   W0326 17:41:00.192817   61726 warnings.go:70] would violate PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "pod-anp-subject-1612" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "pod-anp-subject-1612" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "pod-anp-subject-1612" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "pod-anp-subject-1612" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
2024-03-26T17:41:04.2070982Z   Mar 26 17:41:04.206: INFO: Created pod-anp-subject-1612 in namespace anp-subject-1612
2024-03-26T17:41:08.2228318Z   Mar 26 17:41:08.222: INFO: Created pod-anp-peer-restricted in namespace anp-peer-restricted
2024-03-26T17:41:12.2392107Z   Mar 26 17:41:12.238: INFO: Created pod-anp-peer-open in namespace anp-peer-open
2024-03-26T17:41:16.2558024Z   Mar 26 17:41:16.255: INFO: Created pod-anp-peer-unknown in namespace anp-peer-unknown
2024-03-26T17:41:16.2559175Z   STEP: sending traffic between acl-logging test pods we trigger ALLOW ACL logging @ 03/26/24 17:41:16.255
2024-03-26T17:41:16.2560467Z   Mar 26 17:41:16.255: INFO: Poke pod pod-anp-peer-restricted (on node ovn-worker) from pod pod-anp-subject-1612 (on node ovn-worker)
2024-03-26T17:41:16.2582222Z   Mar 26 17:41:16.257: INFO: ExecWithOptions {Command:[/bin/sh -c curl --output /dev/stdout -m 1 -I [fd00:10:244:2::5]:8000 | head -n1] Namespace:anp-subject-1612 PodName:pod-anp-subject-1612 ContainerName:pod-anp-subject-1612 Stdin:<nil> CaptureStdout:true CaptureStderr:true PreserveWhitespace:false Quiet:false}
2024-03-26T17:41:16.2583865Z   Mar 26 17:41:16.257: INFO: >>> kubeConfig: /home/runner/ovn.conf
2024-03-26T17:41:16.2585278Z   Mar 26 17:41:16.258: INFO: ExecWithOptions: Clientset creation
2024-03-26T17:41:16.2588391Z   Mar 26 17:41:16.258: INFO: ExecWithOptions: execute(POST https://[::1]:36319/api/v1/namespaces/anp-subject-1612/pods/pod-anp-subject-1612/exec?command=%2Fbin%2Fsh&command=-c&command=curl+--output+%2Fdev%2Fstdout+-m+1+-I+%5Bfd00%3A10%3A244%3A2%3A%3A5%5D%3A8000+%7C+head+-n1&container=pod-anp-subject-1612&container=pod-anp-subject-1612&stderr=true&stdout=true)
2024-03-26T17:41:17.3189549Z   [FAILED] in [It] - /home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/acl_logging.go:234 @ 03/26/24 17:41:17.318
2024-03-26T17:41:17.9642978Z   STEP: deleting the admin network policy @ 03/26/24 17:41:17.963
2024-03-26T17:41:17.9645410Z   Mar 26 17:41:17.964: INFO: Running '/usr/local/bin/kubectl --server=https://[::1]:36319 --kubeconfig=/home/runner/ovn.conf --namespace=default delete anp harry-potter --ignore-not-found=true'
2024-03-26T17:41:18.0240518Z   Mar 26 17:41:18.023: INFO: stderr: "Warning: deleting cluster-scoped resources, not scoped to the provided namespace\n"
2024-03-26T17:41:18.0242557Z   Mar 26 17:41:18.023: INFO: stdout: "adminnetworkpolicy.policy.networking.k8s.io \"harry-potter\" deleted\n"
2024-03-26T17:41:18.0244273Z   STEP: deleting the baseline admin network policy @ 03/26/24 17:41:18.023
2024-03-26T17:41:18.0247367Z   Mar 26 17:41:18.023: INFO: Running '/usr/local/bin/kubectl --server=https://[::1]:36319 --kubeconfig=/home/runner/ovn.conf --namespace=default delete banp default --ignore-not-found=true'
2024-03-26T17:41:18.0747366Z   Mar 26 17:41:18.074: INFO: stderr: "Warning: deleting cluster-scoped resources, not scoped to the provided namespace\n"
2024-03-26T17:41:18.0748625Z   Mar 26 17:41:18.074: INFO: stdout: ""
2024-03-26T17:41:18.0751006Z   STEP: deleting subject and peer namespaces that are selected by the admin network policy @ 03/26/24 17:41:18.074
2024-03-26T17:41:18.0754023Z   Mar 26 17:41:18.074: INFO: Running '/usr/local/bin/kubectl --server=https://[::1]:36319 --kubeconfig=/home/runner/ovn.conf --namespace=default delete ns anp-subject-1612 --ignore-not-found=true'
2024-03-26T17:41:28.3568676Z   Mar 26 17:41:28.355: INFO: stderr: "Warning: deleting cluster-scoped resources, not scoped to the provided namespace\n"
2024-03-26T17:41:28.3570634Z   Mar 26 17:41:28.355: INFO: stdout: "namespace \"anp-subject-1612\" deleted\n"
2024-03-26T17:41:28.3573627Z   Mar 26 17:41:28.356: INFO: Running '/usr/local/bin/kubectl --server=https://[::1]:36319 --kubeconfig=/home/runner/ovn.conf --namespace=default delete ns anp-peer-restricted --ignore-not-found=true'
2024-03-26T17:41:38.6439407Z   Mar 26 17:41:38.643: INFO: stderr: "Warning: deleting cluster-scoped resources, not scoped to the provided namespace\n"
2024-03-26T17:41:38.6441202Z   Mar 26 17:41:38.643: INFO: stdout: "namespace \"anp-peer-restricted\" deleted\n"
2024-03-26T17:41:38.6443744Z   Mar 26 17:41:38.643: INFO: Running '/usr/local/bin/kubectl --server=https://[::1]:36319 --kubeconfig=/home/runner/ovn.conf --namespace=default delete ns anp-peer-open --ignore-not-found=true'
2024-03-26T17:41:48.9146545Z   Mar 26 17:41:48.914: INFO: stderr: "Warning: deleting cluster-scoped resources, not scoped to the provided namespace\n"
2024-03-26T17:41:48.9149191Z   Mar 26 17:41:48.914: INFO: stdout: "namespace \"anp-peer-open\" deleted\n"
2024-03-26T17:41:48.9152350Z   Mar 26 17:41:48.914: INFO: Running '/usr/local/bin/kubectl --server=https://[::1]:36319 --kubeconfig=/home/runner/ovn.conf --namespace=default delete ns anp-peer-unknown --ignore-not-found=true'
2024-03-26T17:41:59.2458428Z   Mar 26 17:41:59.245: INFO: stderr: "Warning: deleting cluster-scoped resources, not scoped to the provided namespace\n"
2024-03-26T17:41:59.2460714Z   Mar 26 17:41:59.245: INFO: stdout: "namespace \"anp-peer-unknown\" deleted\n"
2024-03-26T17:41:59.2462342Z   STEP: dump namespace information after failure @ 03/26/24 17:41:59.245
2024-03-26T17:41:59.2464383Z   STEP: Collecting events from namespace "anp-subject-1612". @ 03/26/24 17:41:59.245
2024-03-26T17:41:59.2484273Z   STEP: Found 0 events. @ 03/26/24 17:41:59.248
2024-03-26T17:41:59.2498879Z   Mar 26 17:41:59.249: INFO: POD  NODE  PHASE  GRACE  CONDITIONS
tssurya commented 3 months ago

running this locally seems to work :/ ofc it can't be that simple..

tssurya commented 3 months ago

one thing to note is the test that runs before this test in the v6 lane was e2e br-int flow monitoring export validation Should validate flow data of br-int is sent to an external gateway with sflow which basically restarts all ovnkube-node pods. I wonder if we are having an issue where the plumbing is take time? timestamp:

2024-03-26T17:40:55.3480786Z   Mar 26 17:40:55.347: INFO: Waiting for daemonset/ovnkube-node rollout to finish: 2 of 3 updated pods are available (222 seconds elapsed)
2024-03-26T17:40:57.3469776Z   Mar 26 17:40:57.346: INFO: Waiting for daemonset/ovnkube-node rollout to finish: 2 of 3 updated pods are available (224 seconds elapsed)
2024-03-26T17:40:59.3463029Z   Mar 26 17:40:59.345: INFO: resource "daemonset/ovnkube-node" successfully rolled out
2024-03-26T17:40:59.3532994Z   Mar 26 17:40:59.352: INFO: ExecWithOptions {Command:[ovs-vsctl find sflow] Namespace:ovn-kubernetes PodName:ovnkube-node-drhll ContainerName:ovnkube-controller Stdin:<nil> CaptureStdout:true CaptureStderr:true PreserveWhitespace:false Quiet:false}
2024-03-26T17:40:59.3535040Z   Mar 26 17:40:59.352: INFO: >>> kubeConfig: /home/runner/ovn.conf
2024-03-26T17:40:59.3537321Z   Mar 26 17:40:59.353: INFO: ExecWithOptions: Clientset creation
2024-03-26T17:40:59.3540541Z   Mar 26 17:40:59.353: INFO: ExecWithOptions: execute(POST https://[::1]:36319/api/v1/namespaces/ovn-kubernetes/pods/ovnkube-node-drhll/exec?command=ovs-vsctl&command=find&command=sflow&container=ovnkube-controller&container=ovnkube-controller&stderr=true&stdout=true)
2024-03-26T17:40:59.4254534Z   Mar 26 17:40:59.425: INFO: execOptions are {[ovs-vsctl find sflow] ovn-kubernetes ovnkube-node-drhll ovnkube-controller <nil> true true false false}
2024-03-26T17:40:59.4258065Z   Mar 26 17:40:59.425: INFO: ExecWithOptions {Command:[ovs-vsctl find sflow] Namespace:ovn-kubernetes PodName:ovnkube-node-ftr4m ContainerName:ovnkube-controller Stdin:<nil> CaptureStdout:true CaptureStderr:true PreserveWhitespace:false Quiet:false}
2024-03-26T17:40:59.4260600Z   Mar 26 17:40:59.425: INFO: >>> kubeConfig: /home/runner/ovn.conf
2024-03-26T17:40:59.4310425Z   Mar 26 17:40:59.426: INFO: ExecWithOptions: Clientset creation
2024-03-26T17:40:59.4313630Z   Mar 26 17:40:59.426: INFO: ExecWithOptions: execute(POST https://[::1]:36319/api/v1/namespaces/ovn-kubernetes/pods/ovnkube-node-ftr4m/exec?command=ovs-vsctl&command=find&command=sflow&container=ovnkube-controller&container=ovnkube-controller&stderr=true&stdout=true)
2024-03-26T17:40:59.5073147Z   Mar 26 17:40:59.506: INFO: execOptions are {[ovs-vsctl find sflow] ovn-kubernetes ovnkube-node-ftr4m ovnkube-controller <nil> true true false false}
2024-03-26T17:40:59.5076757Z   Mar 26 17:40:59.506: INFO: ExecWithOptions {Command:[ovs-vsctl find sflow] Namespace:ovn-kubernetes PodName:ovnkube-node-qgrqk ContainerName:ovnkube-controller Stdin:<nil> CaptureStdout:true CaptureStderr:true PreserveWhitespace:false Quiet:false}
2024-03-26T17:40:59.5080941Z   Mar 26 17:40:59.506: INFO: >>> kubeConfig: /home/runner/ovn.conf
2024-03-26T17:40:59.5082706Z   Mar 26 17:40:59.507: INFO: ExecWithOptions: Clientset creation
2024-03-26T17:40:59.5086263Z   Mar 26 17:40:59.507: INFO: ExecWithOptions: execute(POST https://[::1]:36319/api/v1/namespaces/ovn-kubernetes/pods/ovnkube-node-qgrqk/exec?command=ovs-vsctl&command=find&command=sflow&container=ovnkube-controller&container=ovnkube-controller&stderr=true&stdout=true)
2024-03-26T17:40:59.5809823Z   Mar 26 17:40:59.580: INFO: execOptions are {[ovs-vsctl find sflow] ovn-kubernetes ovnkube-node-qgrqk ovnkube-controller <nil> true true false false}
2024-03-26T17:40:59.7739979Z   STEP: Destroying namespace "netflow-test-2407" for this suite. @ 03/26/24 17:40:59.773
2024-03-26T17:40:59.7786672Z • [455.918 seconds]
2024-03-26T17:40:59.7787843Z ------------------------------
2024-03-26T17:40:59.7795297Z SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
2024-03-26T17:40:59.7798870Z P [PENDING]
2024-03-26T17:40:59.7802495Z Multi Homing multiple pods connected to the same OVN-K secondary network localnet OVN-K secondary network with a service running on the underlay when a policy is provisioned can communicate over a localnet secondary network from pod to gw ingress denyall, ingress policy should have no impact on egress [BUG, OCPBUGS-25928]
2024-03-26T17:40:59.7806605Z /home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/multihoming.go:841
2024-03-26T17:40:59.7807636Z ------------------------------
2024-03-26T17:40:59.7809444Z SSSSSSSSSSSSSSS
2024-03-26T17:40:59.7810895Z ------------------------------
2024-03-26T17:40:59.7812274Z ACL Logging for AdminNetworkPolicy and BaselineAdminNetworkPolicy the ANP ACL logs have the expected log level
2024-03-26T17:40:59.7813943Z /home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/acl_logging.go:222
2024-03-26T17:40:59.7815332Z   STEP: Creating a kubernetes client @ 03/26/24 17:40:59.779
2024-03-26T17:40:59.7816385Z   Mar 26 17:40:59.779: INFO: >>> kubeConfig: /home/runner/ovn.conf
2024-03-26T17:40:59.7817946Z   STEP: Building a namespace api object, basename anp-subject @ 03/26/24 17:40:59.78
2024-03-26T17:40:59.7915528Z   Mar 26 17:40:59.786: INFO: Skipping waiting for service account
2024-03-26T17:40:59.7917048Z   STEP: creating an admin network policy @ 03/26/24 17:40:59.786
2024-03-26T17:40:59.7919388Z   Mar 26 17:40:59.787: INFO: Running '/usr/local/bin/kubectl --server=https://[::1]:36319 --kubeconfig=/home/runner/ovn.conf --namespace=default create -f anp.yaml'
2024-03-26T17:40:59.9114909Z   Mar 26 17:40:59.908: INFO: stderr: ""
2024-03-26T17:40:59.9116554Z   Mar 26 17:40:59.908: INFO: stdout: "adminnetworkpolicy.policy.networking.k8s.io/harry-potter created\n"
2024-03-26T17:40:59.9118324Z   STEP: configuring the ACL logging level for the ANP @ 03/26/24 17:40:59.908
2024-03-26T17:40:59.9122059Z   Mar 26 17:40:59.908: INFO: Running '/usr/local/bin/kubectl --server=https://[::1]:36319 --kubeconfig=/home/runner/ovn.conf --namespace=default annotate anp harry-potter k8s.ovn.org/acl-logging={ "deny": "alert", "allow": "notice", "pass": "warning" } --overwrite=true'
2024-03-26T17:40:59.9776074Z   Mar 26 17:40:59.977: INFO: stderr: ""
2024-03-26T17:40:59.9777720Z   Mar 26 17:40:59.977: INFO: stdout: "adminnetworkpolicy.policy.networking.k8s.io/harry-potter annotated\n"
2024-03-26T17:40:59.9779704Z   STEP: creating peer namespaces that are selected by the admin network policy @ 03/26/24 17:40:59.977
2024-03-26T17:40:59.9782466Z   Mar 26 17:40:59.977: INFO: Running '/usr/local/bin/kubectl --server=https://[::1]:36319 --kubeconfig=/home/runner/ovn.conf --namespace=default create ns anp-peer-restricted'
2024-03-26T17:41:00.0342146Z   Mar 26 17:41:00.033: INFO: stderr: ""
2024-03-26T17:41:00.0343493Z   Mar 26 17:41:00.033: INFO: stdout: "namespace/anp-peer-restricted created\n"
2024-03-26T17:41:00.0345848Z   Mar 26 17:41:00.033: INFO: Running '/usr/local/bin/kubectl --server=https://[::1]:36319 --kubeconfig=/home/runner/ovn.conf --namespace=default create ns anp-peer-open'
2024-03-26T17:41:00.1184926Z   Mar 26 17:41:00.118: INFO: stderr: ""
2024-03-26T17:41:00.1186805Z   Mar 26 17:41:00.118: INFO: stdout: "namespace/anp-peer-open created\n"
tssurya commented 3 months ago

Run2 failed: https://github.com/ovn-org/ovn-kubernetes/actions/runs/8444011412/job/23129408240?pr=4238

2024-03-27T00:11:43.7992396Z   STEP: Destroying namespace "anp-subject-9327" for this suite. @ 03/27/24 00:11:43.798
2024-03-27T00:11:43.8012379Z   Mar 27 00:11:43.801: INFO: Namespace anp-subject-9327 was already deleted
2024-03-27T00:11:43.8013522Z • [FAILED] [118.345 seconds]
2024-03-27T00:11:43.8015075Z ACL Logging for AdminNetworkPolicy and BaselineAdminNetworkPolicy [It] the ANP ACL logs have the expected log level
2024-03-27T00:11:43.8016327Z /home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/acl_logging.go:222
2024-03-27T00:11:43.8016857Z 
2024-03-27T00:11:43.8017461Z   [FAILED] failed to get pod pod-anp-subject-3625: pods "pod-anp-subject-3625" not found
2024-03-27T00:11:43.8019033Z   In [It] at: /home/runner/go/pkg/mod/k8s.io/kubernetes@v1.28.3/test/e2e/framework/pod/exec_util.go:127 @ 03/27/24 00:11:01.634
2024-03-27T00:11:43.8019897Z 
2024-03-27T00:11:43.8020493Z   There were additional failures detected.  To view them in detail run ginkgo -vv
2024-03-27T00:11:43.8021186Z ------------------------------
2024-03-27T00:11:43.8021704Z [ReportAfterSuite] Kubernetes e2e JUnit report
2024-03-27T00:11:43.8022571Z /home/runner/go/pkg/mod/k8s.io/kubernetes@v1.28.3/test/e2e/framework/test_context.go:585
2024-03-27T00:11:43.8041460Z [ReportAfterSuite] PASSED [0.003 seconds]
2024-03-27T00:11:43.8043637Z ------------------------------
2024-03-27T00:11:43.8043958Z 
2024-03-27T00:11:43.8044141Z Summarizing 1 Failure:
2024-03-27T00:11:43.8045327Z   [FAIL] ACL Logging for AdminNetworkPolicy and BaselineAdminNetworkPolicy [It] the ANP ACL logs have the expected log level
2024-03-27T00:11:43.8047568Z   /home/runner/go/pkg/mod/k8s.io/kubernetes@v1.28.3/test/e2e/framework/pod/exec_util.go:127
2024-03-27T00:11:43.8048143Z 
2024-03-27T00:11:43.8048384Z Ran 50 of 263 Specs in 1835.695 seconds
2024-03-27T00:11:43.8049237Z FAIL! -- 49 Passed | 1 Failed | 1 Pending | 212 Skipped
2024-03-27T00:11:43.8049934Z --- FAIL: TestE2E (1835.71s)

same thing the test ran after the reboot of ovnkube-node pods followed by multicast test:

2024-03-27T00:09:28.8731217Z   Mar 27 00:09:28.872: INFO: Waiting for daemonset/ovnkube-node rollout to finish: 2 of 3 updated pods are available (224 seconds elapsed)
2024-03-27T00:09:30.8815622Z   Mar 27 00:09:30.881: INFO: Waiting for daemonset/ovnkube-node rollout to finish: 2 of 3 updated pods are available (226 seconds elapsed)
2024-03-27T00:09:32.8743698Z   Mar 27 00:09:32.874: INFO: resource "daemonset/ovnkube-node" successfully rolled out
2024-03-27T00:09:32.8812574Z   Mar 27 00:09:32.880: INFO: ExecWithOptions {Command:[ovs-vsctl find sflow] Namespace:ovn-kubernetes PodName:ovnkube-node-8chbg ContainerName:ovnkube-controller Stdin:<nil> CaptureStdout:true CaptureStderr:true PreserveWhitespace:false Quiet:false}
2024-03-27T00:09:32.8814320Z   Mar 27 00:09:32.881: INFO: >>> kubeConfig: /home/runner/ovn.conf
2024-03-27T00:09:32.8816636Z   Mar 27 00:09:32.881: INFO: ExecWithOptions: Clientset creation
2024-03-27T00:09:32.8819162Z   Mar 27 00:09:32.881: INFO: ExecWithOptions: execute(POST https://[::1]:43543/api/v1/namespaces/ovn-kubernetes/pods/ovnkube-node-8chbg/exec?command=ovs-vsctl&command=find&command=sflow&container=ovnkube-controller&container=ovnkube-controller&stderr=true&stdout=true)
2024-03-27T00:09:32.9435236Z   Mar 27 00:09:32.942: INFO: execOptions are {[ovs-vsctl find sflow] ovn-kubernetes ovnkube-node-8chbg ovnkube-controller <nil> true true false false}
2024-03-27T00:09:32.9438002Z   Mar 27 00:09:32.943: INFO: ExecWithOptions {Command:[ovs-vsctl find sflow] Namespace:ovn-kubernetes PodName:ovnkube-node-lb8m7 ContainerName:ovnkube-controller Stdin:<nil> CaptureStdout:true CaptureStderr:true PreserveWhitespace:false Quiet:false}
2024-03-27T00:09:32.9440959Z   Mar 27 00:09:32.943: INFO: >>> kubeConfig: /home/runner/ovn.conf
2024-03-27T00:09:32.9441566Z   Mar 27 00:09:32.943: INFO: ExecWithOptions: Clientset creation
2024-03-27T00:09:32.9443419Z   Mar 27 00:09:32.943: INFO: ExecWithOptions: execute(POST https://[::1]:43543/api/v1/namespaces/ovn-kubernetes/pods/ovnkube-node-lb8m7/exec?command=ovs-vsctl&command=find&command=sflow&container=ovnkube-controller&container=ovnkube-controller&stderr=true&stdout=true)
2024-03-27T00:09:33.0115296Z   Mar 27 00:09:33.010: INFO: execOptions are {[ovs-vsctl find sflow] ovn-kubernetes ovnkube-node-lb8m7 ovnkube-controller <nil> true true false false}
2024-03-27T00:09:33.0118770Z   Mar 27 00:09:33.011: INFO: ExecWithOptions {Command:[ovs-vsctl find sflow] Namespace:ovn-kubernetes PodName:ovnkube-node-xtnbl ContainerName:ovnkube-controller Stdin:<nil> CaptureStdout:true CaptureStderr:true PreserveWhitespace:false Quiet:false}
2024-03-27T00:09:33.0121136Z   Mar 27 00:09:33.011: INFO: >>> kubeConfig: /home/runner/ovn.conf
2024-03-27T00:09:33.0122127Z   Mar 27 00:09:33.012: INFO: ExecWithOptions: Clientset creation
2024-03-27T00:09:33.0125177Z   Mar 27 00:09:33.012: INFO: ExecWithOptions: execute(POST https://[::1]:43543/api/v1/namespaces/ovn-kubernetes/pods/ovnkube-node-xtnbl/exec?command=ovs-vsctl&command=find&command=sflow&container=ovnkube-controller&container=ovnkube-controller&stderr=true&stdout=true)
2024-03-27T00:09:33.1199829Z   Mar 27 00:09:33.119: INFO: execOptions are {[ovs-vsctl find sflow] ovn-kubernetes ovnkube-node-xtnbl ovnkube-controller <nil> true true false false}
2024-03-27T00:09:33.3131558Z   STEP: Destroying namespace "netflow-test-9458" for this suite. @ 03/27/24 00:09:33.312
2024-03-27T00:09:33.3189055Z • [445.633 seconds]
2024-03-27T00:09:33.3189797Z ------------------------------
2024-03-27T00:09:33.3190463Z SS
2024-03-27T00:09:33.3191102Z ------------------------------
2024-03-27T00:09:33.3192592Z Multicast should be able to send multicast UDP traffic between nodes
2024-03-27T00:09:33.3194879Z /home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/multicast.go:78
2024-03-27T00:09:33.3196270Z   STEP: Creating a kubernetes client @ 03/27/24 00:09:33.317
2024-03-27T00:09:33.3197328Z   Mar 27 00:09:33.317: INFO: >>> kubeConfig: /home/runner/ovn.conf
2024-03-27T00:09:33.3198686Z   STEP: Building a namespace api object, basename multicast @ 03/27/24 00:09:33.318
2024-03-27T00:09:33.3241140Z   Mar 27 00:09:33.323: INFO: Skipping waiting for service account
2024-03-27T00:09:33.3632076Z   STEP: creating a pod as a multicast source in node ovn-control-plane @ 03/27/24 00:09:33.362
2024-03-27T00:09:33.3796747Z   W0327 00:09:33.378595   62005 warnings.go:70] would violate PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "pod-client" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "pod-client" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "pod-client" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "pod-client" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
2024-03-27T00:09:37.3973204Z   STEP: creating first multicast listener pod in node ovn-worker @ 03/27/24 00:09:37.396
2024-03-27T00:09:37.4079151Z   W0327 00:09:37.402805   62005 warnings.go:70] would violate PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "pod-server1" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "pod-server1" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "pod-server1" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "pod-server1" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
2024-03-27T00:09:41.4141836Z   STEP: creating second multicast listener pod in node ovn-worker @ 03/27/24 00:09:41.413
2024-03-27T00:09:41.4245002Z   W0327 00:09:41.418527   62005 warnings.go:70] would violate PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "pod-server2" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "pod-server2" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "pod-server2" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "pod-server2" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
2024-03-27T00:09:45.4377489Z   STEP: checking if pod server1 received multicast traffic @ 03/27/24 00:09:45.437
2024-03-27T00:09:45.4492382Z   STEP: checking if pod server2 does not received multicast traffic @ 03/27/24 00:09:45.449
2024-03-27T00:09:45.4523700Z   STEP: Destroying namespace "multicast-8619" for this suite. @ 03/27/24 00:09:45.452
2024-03-27T00:09:45.4573513Z • [12.139 seconds]
2024-03-27T00:09:45.4574192Z ------------------------------
2024-03-27T00:09:45.4574736Z S
2024-03-27T00:09:45.4575200Z ------------------------------
2024-03-27T00:09:45.4577187Z ACL Logging for AdminNetworkPolicy and BaselineAdminNetworkPolicy the ANP ACL logs have the expected log level
2024-03-27T00:09:45.4579241Z /home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/acl_logging.go:222
2024-03-27T00:09:45.4580578Z   STEP: Creating a kubernetes client @ 03/27/24 00:09:45.456
2024-03-27T00:09:45.4581617Z   Mar 27 00:09:45.456: INFO: >>> kubeConfig: /home/runner/ovn.conf
2024-03-27T00:09:45.4583617Z   STEP: Building a namespace api object, basename anp-subject @ 03/27/24 00:09:45.457
2024-03-27T00:09:45.4632124Z   Mar 27 00:09:45.461: INFO: Skipping waiting for service account
2024-03-27T00:09:45.4634125Z   STEP: creating an admin network policy @ 03/27/24 00:09:45.461
2024-03-27T00:09:45.4636111Z   Mar 27 00:09:45.462: INFO: Running '/usr/local/bin/kubectl --server=https://[::1]:43543 --kubeconfig=/home/runner/ovn.conf --namespace=default create -f anp.yaml'
2024-03-27T00:09:45.5850956Z   Mar 27 00:09:45.580: INFO: stderr: ""
2024-03-27T00:09:45.5852987Z   Mar 27 00:09:45.580: INFO: stdout: "adminnetworkpolicy.policy.networking.k8s.io/harry-potter created\n"
2024-03-27T00:09:45.5855262Z   STEP: configuring the ACL logging level for the ANP @ 03/27/24 00:09:45.58
2024-03-27T00:09:45.5859307Z   Mar 27 00:09:45.581: INFO: Running '/usr/local/bin/kubectl --server=https://[::1]:43543 --kubeconfig=/home/runner/ovn.conf --namespace=default annotate anp harry-potter k8s.ovn.org/acl-logging={ "deny": "alert", "allow": "notice", "pass": "warning" } --overwrite=true'
2024-03-27T00:09:45.6455572Z   Mar 27 00:09:45.645: INFO: stderr: ""
2024-03-27T00:09:45.6457645Z   Mar 27 00:09:45.645: INFO: stdout: "adminnetworkpolicy.policy.networking.k8s.io/harry-potter annotated\n"
2024-03-27T00:09:45.6459982Z   STEP: creating peer namespaces that are selected by the admin network policy @ 03/27/24 00:09:45.645
2024-03-27T00:09:45.6463582Z   Mar 27 00:09:45.645: INFO: Running '/usr/local/bin/kubectl --server=https://[::1]:43543 --kubeconfig=/home/runner/ovn.conf --namespace=default create ns anp-peer-restricted'
2024-03-27T00:09:45.7066771Z   Mar 27 00:09:45.706: INFO: stderr: ""
2024-03-27T00:09:45.7068693Z   Mar 27 00:09:45.706: INFO: stdout: "namespace/anp-peer-restricted created\n"
2024-03-27T00:09:45.7071163Z   Mar 27 00:09:45.706: INFO: Running '/usr/local/bin/kubectl --server=https://[::1]:43543 --kubeconfig=/home/runner/ovn.conf --namespace=default create ns anp-peer-open'
2024-03-27T00:09:45.7715976Z   Mar 27 00:09:45.771: INFO: stderr: ""
2024-03-27T00:09:45.7717383Z   Mar 27 00:09:45.771: INFO: stdout: "namespace/anp-peer-open created\n"
2024-03-27T00:09:45.7719890Z   Mar 27 00:09:45.771: INFO: Running '/usr/local/bin/kubectl --server=https://[::1]:43543 --kubeconfig=/home/runner/ovn.conf --namespace=default create ns anp-peer-unknown'
2024-03-27T00:09:45.8257001Z   Mar 27 00:09:45.825: INFO: stderr: ""
2024-03-27T00:09:45.8258944Z   Mar 27 00:09:45.825: INFO: stdout: "namespace/anp-peer-unknown created\n"
2024-03-27T00:09:45.8260828Z   STEP: creating pods in subject and peer namespaces @ 03/27/24 00:09:45.825
2024-03-27T00:09:45.8409345Z   W0327 00:09:45.839690   62005 warnings.go:70] would violate PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "pod-anp-subject-3625" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "pod-anp-subject-3625" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "pod-anp-subject-3625" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "pod-anp-subject-3625" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
2024-03-27T00:09:49.8515441Z   Mar 27 00:09:49.851: INFO: Created pod-anp-subject-3625 in namespace anp-subject-3625
2024-03-27T00:09:53.8679578Z   Mar 27 00:09:53.867: INFO: Created pod-anp-peer-restricted in namespace anp-peer-restricted
2024-03-27T00:09:57.8838056Z   Mar 27 00:09:57.883: INFO: Created pod-anp-peer-open in namespace anp-peer-open
2024-03-27T00:10:01.9001161Z   Mar 27 00:10:01.899: INFO: Created pod-anp-peer-unknown in namespace anp-peer-unknown
2024-03-27T00:10:01.9003181Z   STEP: sending traffic between acl-logging test pods we trigger ALLOW ACL logging @ 03/27/24 00:10:01.899
2024-03-27T00:10:01.9005359Z   Mar 27 00:10:01.899: INFO: Poke pod pod-anp-peer-restricted (on node ovn-worker2) from pod pod-anp-subject-3625 (on node ovn-worker2)
2024-03-27T00:10:01.9022101Z   Mar 27 00:10:01.901: INFO: ExecWithOptions {Command:[/bin/sh -c curl --output /dev/stdout -m 1 -I [fd00:10:244:3::5]:8000 | head -n1] Namespace:anp-subject-3625 PodName:pod-anp-subject-3625 ContainerName:pod-anp-subject-3625 Stdin:<nil> CaptureStdout:true CaptureStderr:true PreserveWhitespace:false Quiet:false}
2024-03-27T00:10:01.9023858Z   Mar 27 00:10:01.901: INFO: >>> kubeConfig: /home/runner/ovn.conf
2024-03-27T00:10:01.9025116Z   Mar 27 00:10:01.902: INFO: ExecWithOptions: Clientset creation
2024-03-27T00:10:01.9029494Z   Mar 27 00:10:01.902: INFO: ExecWithOptions: execute(POST https://[::1]:43543/api/v1/namespaces/anp-subject-3625/pods/pod-anp-subject-3625/exec?command=%2Fbin%2Fsh&command=-c&command=curl+--output+%2Fdev%2Fstdout+-m+1+-I+%5Bfd00%3A10%3A244%3A3%3A%3A5%5D%3A8000+%7C+head+-n1&container=pod-anp-subject-3625&container=pod-anp-subject-3625&stderr=true&stdout=true)
2024-03-27T00:10:02.9617485Z   [FAILED] in [It] - /home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/acl_logging.go:234 @ 03/27/24 00:10:02.961
2024-03-27T00:10:03.6944349Z   STEP: deleting the admin network policy @ 03/27/24 00:10:03.694
2024-03-27T00:10:03.6946280Z   Mar 27 00:10:03.694: INFO: Running '/usr/local/bin/kubectl --server=https://[::1]:43543 --kubeconfig=/home/runner/ovn.conf --namespace=default delete anp harry-potter --ignore-not-found=true'
2024-03-27T00:10:03.7537091Z   Mar 27 00:10:03.753: INFO: stderr: "Warning: deleting cluster-scoped resources, not scoped to the provided namespace\n"
2024-03-27T00:10:03.7538953Z   Mar 27 00:10:03.753: INFO: stdout: "adminnetworkpolicy.policy.networking.k8s.io \"harry-potter\" deleted\n"
2024-03-27T00:10:03.7540145Z   STEP: deleting the baseline admin network policy @ 03/27/24 00:10:03.753
2024-03-27T00:10:03.7541870Z   Mar 27 00:10:03.753: INFO: Running '/usr/local/bin/kubectl --server=https://[::1]:43543 --kubeconfig=/home/runner/ovn.conf --namespace=default delete banp default --ignore-not-found=true'
2024-03-27T00:10:03.8029170Z   Mar 27 00:10:03.802: INFO: stderr: "Warning: deleting cluster-scoped resources, not scoped to the provided namespace\n"
2024-03-27T00:10:03.8030144Z   Mar 27 00:10:03.802: INFO: stdout: ""
2024-03-27T00:10:03.8031342Z   STEP: deleting subject and peer namespaces that are selected by the admin network policy @ 03/27/24 00:10:03.802
2024-03-27T00:10:03.8034182Z   Mar 27 00:10:03.802: INFO: Running '/usr/local/bin/kubectl --server=https://[::1]:43543 --kubeconfig=/home/runner/ovn.conf --namespace=default delete ns anp-subject-3625 --ignore-not-found=true'
2024-03-27T00:10:14.0723023Z   Mar 27 00:10:14.071: INFO: stderr: "Warning: deleting cluster-scoped resources, not scoped to the provided namespace\n"
2024-03-27T00:10:14.0725001Z   Mar 27 00:10:14.071: INFO: stdout: "namespace \"anp-subject-3625\" deleted\n"
tssurya commented 3 months ago

Run3 fail: https://github.com/ovn-org/ovn-kubernetes/actions/runs/8441765939/job/23122551508?pr=4239

tssurya commented 3 months ago

Let's pick: https://github.com/ovn-org/ovn-kubernetes/actions/runs/8444011412/job/23129408240?pr=4238 this run: https://github.com/ovn-org/ovn-kubernetes/issues/4242#issuecomment-2022389975

Test failure timestamp:

2024-03-27T00:10:01.9003181Z   STEP: sending traffic between acl-logging test pods we trigger ALLOW ACL logging @ 03/27/24 00:10:01.899
2024-03-27T00:10:01.9005359Z   Mar 27 00:10:01.899: INFO: Poke pod pod-anp-peer-restricted (on node ovn-worker2) from pod pod-anp-subject-3625 (on node ovn-worker2)
2024-03-27T00:10:01.9022101Z   Mar 27 00:10:01.901: INFO: ExecWithOptions {Command:[/bin/sh -c curl --output /dev/stdout -m 1 -I [fd00:10:244:3::5]:8000 | head -n1] Namespace:anp-subject-3625 PodName:pod-anp-subject-3625 ContainerName:pod-anp-subject-3625 Stdin:<nil> CaptureStdout:true CaptureStderr:true PreserveWhitespace:false Quiet:false}
2024-03-27T00:10:01.9023858Z   Mar 27 00:10:01.901: INFO: >>> kubeConfig: /home/runner/ovn.conf
2024-03-27T00:10:01.9025116Z   Mar 27 00:10:01.902: INFO: ExecWithOptions: Clientset creation
2024-03-27T00:10:01.9029494Z   Mar 27 00:10:01.902: INFO: ExecWithOptions: execute(POST https://[::1]:43543/api/v1/namespaces/anp-subject-3625/pods/pod-anp-subject-3625/exec?command=%2Fbin%2Fsh&command=-c&command=curl+--output+%2Fdev%2Fstdout+-m+1+-I+%5Bfd00%3A10%3A244%3A3%3A%3A5%5D%3A8000+%7C+head+-n1&container=pod-anp-subject-3625&container=pod-anp-subject-3625&stderr=true&stdout=true)
2024-03-27T00:10:02.9617485Z   [FAILED] in [It] - /home/runner/work/ovn-kubernetes/ovn-kubernetes/test/e2e/acl_logging.go:234 @ 03/27/24 00:10:02.961

both pods are on ovn-worker2

Event sequence:

2024-03-27T00:09:45.575842675Z stderr F I0327 00:09:45.575715   38180 admin_network_policy_controller.go:308] Adding Admin Network Policy harry-potter
2024-03-27T00:09:45.575858665Z stderr F I0327 00:09:45.575772   38180 admin_network_policy.go:61] Processing sync for Admin Network Policy harry-potter
2024-03-27T00:09:45.575951759Z stderr F I0327 00:09:45.575895   38180 types.go:131] Logging parameters for ANP harry-potter are Allow=/Deny=/Pass=
2024-03-27T00:09:45.575996683Z stderr F I0327 00:09:45.575957   38180 admin_network_policy.go:216] Creating ACL for rule 29000/Egress belonging to ANP harry-potter
2024-03-27T00:09:45.576086741Z stderr F I0327 00:09:45.576030   38180 admin_network_policy.go:216] Creating ACL for rule 28999/Egress belonging to ANP harry-potter
2024-03-27T00:09:45.576092672Z stderr F I0327 00:09:45.576059   38180 admin_network_policy.go:216] Creating ACL for rule 28998/Egress belonging to ANP harry-potter
2024-03-27T00:09:45.576163835Z stderr F I0327 00:09:45.576088   38180 admin_network_policy.go:137] Creating admin network policy harry-potter/10

that looks good to me. transact: 2024-03-27T00:09:45.577429263Z stderr F I0327 00:09:45.576746 38180 model_client.go:372] Create operations generated as: [{Op:insert Table:ACL Row:map[action:allow-related direction:from-lport external_ids:{GoMap:map[direction:Egress gress-index:0 k8s.ovn.org/id:default-network-controller:AdminNetworkPolicy:harry-potter:Egress:0:None k8s.ovn.org/name:harry-potter k8s.ovn.org/owner-controller:default-network-controller k8s.ovn.org/owner-type:AdminNetworkPolicy port-policy-protocol:None]} log:false match:((ip6.dst == $a10866216965385470963)) && inport == @a3602609661093065011 meter:{GoSet:[acl-logging]} name:{GoSet:[ANP:harry-potter:Egress:0]} options:{GoMap:map[apply-after-lb:true]} priority:29000 tier:1] Rows:[] Columns:[] Mutations:[] Timeout:<nil> Where:[] Until: Durable:<nil> Comment:<nil> Lock:<nil> UUID: UUIDName:u3418565856}]

tssurya commented 3 months ago

OVNK DB plumbing looks good

_uuid               : 14b627ba-7c3e-4106-ba7b-17a572b05fc2                                                                                                                   
action              : allow-related                                                                                                                                          
direction           : from-lport                                                                                                                                             
external_ids        : {direction=Egress, gress-index="0", "k8s.ovn.org/id"="default-network-controller:AdminNetworkPolicy:harry-potter:Egress:0:None", "k8s.ovn.org/name"=har
ry-potter, "k8s.ovn.org/owner-controller"=default-network-controller, "k8s.ovn.org/owner-type"=AdminNetworkPolicy, port-policy-protocol=None}                                
label               : 0                                                                                                                                                      
log                 : true                                                                                                                                                   
match               : "((ip6.dst == $a10866216965385470963)) && inport == @a3602609661093065011"                                                                             
meter               : acl-logging
name                : "ANP:harry-potter:Egress:0"
options             : {apply-after-lb="true"}
priority            : 29000
severity            : notice
tier                : 1
ovn-nbctl list address-set a10866216965385470963
_uuid               : 36b1813a-fd73-485d-9ddb-d6cb760d80c2
addresses           : ["fd00:10:244:3::5"]
external_ids        : {direction=Egress, gress-index="0", ip-family=v6, "k8s.ovn.org/id"="default-network-controller:AdminNetworkPolicy:harry-potter:Egress:0:v6", "k8s.ovn.org/name"=harry-potter, "k8s.ovn.org/owner-controller"=default-network-controller, "k8s.ovn.org/owner-type"=AdminNetworkPolicy}
name                : a10866216965385470963

PG:

_uuid               : 2f142a81-bd72-43c6-bdf8-7f76a4c58203                                                                                                                   
acls                : [081e0e9b-c17a-4b53-92ff-f1fa5cde16ba, 14b627ba-7c3e-4106-ba7b-17a572b05fc2, b2173e31-25c3-46c3-a47c-341249336303]                                     
external_ids        : {AdminNetworkPolicy=harry-potter, name="ANP:harry-potter"}                                                                                             
name                : a3602609661093065011                                                                                                                                   
ports               : [1d70150a-da85-49a8-ac48-6c2e54461537]
tssurya commented 3 months ago

so its failing right in the sending traffic stage, not the verify stage. ovn-controller logs:

2024-03-27T00:09:46.206168395Z stdout F 2024-03-27T00:09:46.204Z|00024|binding|INFO|Claiming lport anp-subject-3625_pod-anp-subject-3625 for this chassis.
2024-03-27T00:09:46.206182582Z stdout F 2024-03-27T00:09:46.204Z|00025|binding|INFO|anp-subject-3625_pod-anp-subject-3625: Claiming 0a:58:09:f2:f4:29 fd00:10:244:3::3
2024-03-27T00:09:46.216352089Z stdout F 2024-03-27T00:09:46.216Z|00026|binding|INFO|Setting lport anp-subject-3625_pod-anp-subject-3625 ovn-installed in OVS
2024-03-27T00:09:46.216364171Z stdout F 2024-03-27T00:09:46.216Z|00027|binding|INFO|Setting lport anp-subject-3625_pod-anp-subject-3625 up in Southbound
2024-03-27T00:09:49.194095805Z stdout F 2024-03-27T00:09:49.193Z|00028|lflow|WARN|error parsing actions "log(name="ANP:harry-potter:Egress:2", severity=warning, verdict=pass, meter="acl-logging__081e0e9b-c17a-4b53-92ff-f1fa5cde16ba"); next;": Syntax error at `pass' unknown verdict.
2024-03-27T00:09:50.212608334Z stdout F 2024-03-27T00:09:50.212Z|00029|binding|INFO|Claiming lport anp-peer-restricted_pod-anp-peer-restricted for this chassis.
2024-03-27T00:09:50.212622751Z stdout F 2024-03-27T00:09:50.212Z|00030|binding|INFO|anp-peer-restricted_pod-anp-peer-restricted: Claiming 0a:58:20:fd:27:ce fd00:10:244:3::5
2024-03-27T00:09:50.224430753Z stdout F 2024-03-27T00:09:50.224Z|00031|binding|INFO|Setting lport anp-peer-restricted_pod-anp-peer-restricted ovn-installed in OVS
2024-03-27T00:09:50.224442144Z stdout F 2024-03-27T00:09:50.224Z|00032|binding|INFO|Setting lport anp-peer-restricted_pod-anp-peer-restricted up in Southbound
=====
other pods:
2024-03-27T00:09:54.231929392Z stdout F 2024-03-27T00:09:54.231Z|00033|binding|INFO|Claiming lport anp-peer-open_pod-anp-peer-open for this chassis.
2024-03-27T00:09:54.232011786Z stdout F 2024-03-27T00:09:54.231Z|00034|binding|INFO|anp-peer-open_pod-anp-peer-open: Claiming 0a:58:4d:8c:81:27 fd00:10:244:3::6
2024-03-27T00:09:54.245522708Z stdout F 2024-03-27T00:09:54.245Z|00035|binding|INFO|Setting lport anp-peer-open_pod-anp-peer-open up in Southbound
2024-03-27T00:09:54.246227997Z stdout F 2024-03-27T00:09:54.245Z|00036|binding|INFO|Setting lport anp-peer-open_pod-anp-peer-open ovn-installed in OVS
2024-03-27T00:09:58.254130051Z stdout F 2024-03-27T00:09:58.253Z|00037|binding|INFO|Claiming lport anp-peer-unknown_pod-anp-peer-unknown for this chassis.
2024-03-27T00:09:58.254144077Z stdout F 2024-03-27T00:09:58.253Z|00038|binding|INFO|anp-peer-unknown_pod-anp-peer-unknown: Claiming 0a:58:bd:cd:8e:17 fd00:10:244:3::7
2024-03-27T00:09:58.265691768Z stdout F 2024-03-27T00:09:58.265Z|00039|binding|INFO|Setting lport anp-peer-unknown_pod-anp-peer-unknown ovn-installed in OVS
2024-03-27T00:09:58.265703018Z stdout F 2024-03-27T00:09:58.265Z|00040|binding|INFO|Setting lport anp-peer-unknown_pod-anp-peer-unknown up in Southbound

===== OH LOOK:

2024-03-27T00:10:01.957676688Z stdout F 2024-03-27T00:10:01.957Z|00004|acl_log(ovn_pinctrl0)|INFO|name="ANP:harry-potter:Egress:0", verdict=allow, severity=notice, direction=from-lport: tcp6,vlan_tci=0x0000,dl_src=0a:58:09:f2:f4:29,dl_dst=0a:58:20:fd:27:ce,ipv6_src=fd00:10:244:3::3,ipv6_dst=fd00:10:244:3::5,ipv6_label=0x3b68c,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=60076,tp_dst=8000,tcp_flags=syn
2024-03-27T00:10:09.069019166Z stdout F 2024-03-27T00:10:09.068Z|00041|binding|INFO|Releasing lport anp-subject-3625_pod-anp-subject-3625 from this chassis (sb_readonly=0)
2024-03-27T00:10:09.069092994Z stdout F 2024-03-27T00:10:09.068Z|00042|binding|INFO|Setting lport anp-subject-3625_pod-anp-subject-3625 down in Southbound
2024-03-27T00:10:19.355461822Z stdout F 2024-03-27T00:10:19.355Z|00043|binding|INFO|Releasing lport anp-peer-restricted_pod-anp-peer-restricted from this chassis (sb_readonly=0)
2024-03-27T00:10:19.355488261Z stdout F 2024-03-27T00:10:19.355Z|00044|binding|INFO|Setting lport anp-peer-restricted_pod-anp-peer-restricted down in Southbound
2024-03-27T00:10:29.660508995Z stdout F 2024-03-27T00:10:29.660Z|00045|binding|INFO|Releasing lport anp-peer-open_pod-anp-peer-open from this chassis (sb_readonly=0)
2024-03-27T00:10:29.660558387Z stdout F 2024-03-27T00:10:29.660Z|00046|binding|INFO|Setting lport anp-peer-open_pod-anp-peer-open down in Southbound
2024-03-27T00:10:39.875704094Z stdout F 2024-03-27T00:10:39.875Z|00047|binding|INFO|Releasing lport anp-peer-unknown_pod-anp-peer-unknown from this chassis (sb_readonly=0)
2024-03-27T00:10:39.875719743Z stdout F 2024-03-27T00:10:39.875Z|00048|binding|INFO|Setting lport anp-peer-unknown_pod-anp-peer-unknown down in Southbound

the ACL log: 2024-03-27T00:10:01.957676688Z stdout F 2024-03-27T00:10:01.957Z|00004|acl_log(ovn_pinctrl0)|INFO|name="ANP:harry-potter:Egress:0", verdict=allow, severity=notice, direction=from-lport: tcp6,vlan_tci=0x0000,dl_src=0a:58:09:f2:f4:29,dl_dst=0a:58:20:fd:27:ce,ipv6_src=fd00:10:244:3::3,ipv6_dst=fd00:10:244:3::5,ipv6_label=0x3b68c,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=60076,tp_dst=8000,tcp_flags=syn seems to indicate it did what its supposed to do! hmm why is the poke not working then. coincides with the timestamp: 2024-03-27T00:10:01.9022101Z Mar 27 00:10:01.901: INFO: ExecWithOptions {Command:[/bin/sh -c curl --output /dev/stdout -m 1 -I [fd00:10:244:3::5]:8000 | head -n1] Namespace:anp-subject-3625 PodName:pod-anp-subject-3625 ContainerName:pod-anp-subject-3625 Stdin:<nil> CaptureStdout:true CaptureStderr:true PreserveWhitespace:false Quiet:false}

tssurya commented 3 months ago

the second time around when we run it:

2024-03-27T00:10:45.5584017Z   STEP: creating pods in subject and peer namespaces @ 03/27/24 00:10:45.557
2024-03-27T00:10:45.5708935Z   W0327 00:10:45.570128   62005 warnings.go:70] would violate PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "pod-anp-subject-9327" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "pod-anp-subject-9327" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "pod-anp-subject-9327" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "pod-anp-subject-9327" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
2024-03-27T00:10:49.5828215Z   Mar 27 00:10:49.582: INFO: Created pod-anp-subject-9327 in namespace anp-subject-9327
2024-03-27T00:10:53.5991397Z   Mar 27 00:10:53.598: INFO: Created pod-anp-peer-restricted in namespace anp-peer-restricted
2024-03-27T00:10:57.6154572Z   Mar 27 00:10:57.615: INFO: Created pod-anp-peer-open in namespace anp-peer-open
2024-03-27T00:11:01.6326073Z   Mar 27 00:11:01.632: INFO: Created pod-anp-peer-unknown in namespace anp-peer-unknown
2024-03-27T00:11:01.6327551Z   STEP: sending traffic between acl-logging test pods we trigger ALLOW ACL logging @ 03/27/24 00:11:01.632
2024-03-27T00:11:01.6328874Z   Mar 27 00:11:01.632: INFO: Poke pod pod-anp-peer-restricted (on node ovn-worker2) from pod pod-anp-subject-3625 (on node ovn-worker2)
2024-03-27T00:11:01.6348628Z   Mar 27 00:11:01.634: INFO: Unexpected error: failed to get pod pod-anp-subject-3625: 
2024-03-27T00:11:01.6349771Z       <*errors.StatusError | 0xc0027a1d60>: 
2024-03-27T00:11:01.6351156Z       pods "pod-anp-subject-3625" not found
2024-03-27T00:11:01.6351640Z       {
2024-03-27T00:11:01.6351897Z           ErrStatus: 
2024-03-27T00:11:01.6352234Z               code: 404
2024-03-27T00:11:01.6352546Z               details:
2024-03-27T00:11:01.6352870Z                 kind: pods
2024-03-27T00:11:01.6354071Z                 name: pod-anp-subject-3625
2024-03-27T00:11:01.6355104Z               message: pods "pod-anp-subject-3625" not found
2024-03-27T00:11:01.6355917Z               metadata: {}
2024-03-27T00:11:01.6356536Z               reason: NotFound
2024-03-27T00:11:01.6357171Z               status: Failure,
2024-03-27T00:11:01.6357672Z       }
2024-03-27T00:11:01.6359509Z   [FAILED] in [It] - /home/runner/go/pkg/mod/k8s.io/kubernetes@v1.28.3/test/e2e/framework/pod/exec_util.go:127 @ 03/27/24 00:11:01.634
2024-03-27T00:11:02.3289159Z   STEP: deleting the admin network policy @ 03/27/24 00:11:02.328
2024-03-27T00:11:02.3291966Z   Mar 27 00:11:02.328: INFO: Running '/usr/local/bin/kubectl --server=https://[::1]:43543 --kubeconfig=/home/runner/ovn.conf --namespace=default delete anp harry-potter --ignore-not-found=true'
2024-03-27T00:11:02.3924441Z   Mar 27 00:11:02.390: INFO: stderr: "Warning: deleting cluster-scoped resources, not scoped to the provided namespace\n"
2024-03-27T00:11:02.3992439Z   Mar 27 00:11:02.390: INFO: stdout: "adminnetworkpolicy.policy.networking.k8s.io \"harry-potter\" deleted\n"

we seem to have a stale cache entry for the pod? (argh that is because I forgot to cleanup cache in after each :D )

flavio-fernandes commented 3 months ago

cc @flavio-fernandes

tssurya commented 3 months ago

So this flake: https://github.com/ovn-org/ovn-kubernetes/actions/runs/8425068140/job/23071351305?pr=4202 is also when ACL logging test runs immediately after the e2e br-int flow monitoring export validation �[38;5;243mShould validate flow data of br-int is sent to an external gateway test. Let's dig deeper into this flake.

tssurya commented 3 months ago

yeah so same thing as before I see

2024-03-25T19:10:33.047469837Z stdout F 2024-03-25T19:10:33.047Z|00004|acl_log(ovn_pinctrl0)|INFO|name="ANP:harry-potter:Egress:0", verdict=allow, severity=notice, direction=from-lport: tcp6,vlan_tci=0x0000,dl_src=0a:58:49:a1:93:cb,dl_dst=0a:58:71:90:19:61,ipv6_src=fd00:10:244:2::3,ipv6_dst=fd00:10:244:2::4,ipv6_label=0x712b1,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=35562,tp_dst=8000,tcp_flags=syn

and then I see the failure, so we deff need to merge the current PR to get more coverage/logs on what that curl failure looks like

flavio-fernandes commented 3 months ago

cc @flavio-fernandes

saw flake here: https://github.com/ovn-org/ovn-kubernetes/actions/runs/8511378093/job/23311649085?pr=4103