aws-samples / eks-anywhere-addons

https://aws-samples.github.io/eks-anywhere-addons/
MIT No Attribution
21 stars, 41 forks

Add NeuVector deployment to the test pipeline for EKS-A #25

Closed rjschwei closed 1 year ago

rjschwei commented 1 year ago

Description of changes:

Add NeuVector as a third-party partner app.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

elamaran11 commented 1 year ago

@rjschwei Thank you for submitting a PR for your product; we appreciate it. The PR looks complete in terms of the ISV product Helm deployment. To pass, it also needs a functional test (a validation that checks whether your ISV product works as expected). You can use this example as a reference: https://github.com/aws-samples/eks-anywhere-addons/blob/main/eks-anywhere-common/Testers/Sample/testJob.yaml
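For illustration only, a functional test of this kind could be a small Kubernetes Job that fails unless the product answers on one of its services. The sketch below is not the contents of the linked testJob.yaml. The Job name, the container image, and the choice of probing the web UI are assumptions. The service name and port come from the `kubectl` output later in this thread, and the probe assumes that the service speaks HTTPS with a self-signed certificate.

```yaml
# Sketch of a functional test Job; the name, image, and probe target are assumptions.
apiVersion: batch/v1
kind: Job
metadata:
  name: neuvector-functional-test   # hypothetical name
  namespace: neuvector
spec:
  backoffLimit: 3                   # retry a few times while the pods settle
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: check-webui
          image: curlimages/curl:latest   # any image that ships curl would do
          command:
            - sh
            - -c
            - |
              # Exit non-zero (failing the Job) unless the web UI returns a
              # successful HTTP status within 30 seconds.
              curl --insecure --silent --show-error --fail --max-time 30 \
                --output /dev/null \
                https://neuvector-service-webui.neuvector.svc:8443/
```

A Job like this only shows that the manager is reachable. A fuller test would presumably exercise the controller as well.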

Also, out of curiosity: do we need a product license to run this product in our labs? If one is required, could you please share it with us by email?

elamaran11 commented 1 year ago

@rjschwei We tried syncing this GitOps repo, with your NeuVector changes, to our labs running EKS Anywhere on Snow. NeuVector failed our technical validation. Please see the failure logs below:

❯ kga -n neuvector
NAME                                            READY   STATUS             RESTARTS        AGE
pod/neuvector-controller-pod-685d85447d-gwx22   0/1     CrashLoopBackOff   6 (97s ago)     8m55s
pod/neuvector-controller-pod-685d85447d-h7pwh   0/1     Running            5 (4m42s ago)   8m55s
pod/neuvector-controller-pod-685d85447d-swm7s   0/1     CrashLoopBackOff   5 (2m59s ago)   8m55s
pod/neuvector-enforcer-pod-7wzrm                0/1     CrashLoopBackOff   6 (2m3s ago)    8m55s
pod/neuvector-enforcer-pod-chccm                0/1     CrashLoopBackOff   6 (40s ago)     8m55s
pod/neuvector-enforcer-pod-fs2zf                0/1     CrashLoopBackOff   6 (2m4s ago)    8m56s
pod/neuvector-enforcer-pod-k7xbm                1/1     Running            5 (3m40s ago)   8m55s
pod/neuvector-enforcer-pod-qr8gr                0/1     CrashLoopBackOff   6 (2m21s ago)   8m55s
pod/neuvector-enforcer-pod-shffw                0/1     CrashLoopBackOff   6 (53s ago)     8m55s
pod/neuvector-manager-pod-8566d64b5c-t4qkh      1/1     Running            0               8m56s
pod/neuvector-scanner-pod-d79fb9b77-cdr5w       1/1     Running            0               8m54s
pod/neuvector-scanner-pod-d79fb9b77-gqcbp       1/1     Running            0               8m54s
pod/neuvector-scanner-pod-d79fb9b77-qgsnt       1/1     Running            0               8m55s

NAME                                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                         AGE
service/neuvector-service-webui           NodePort    10.103.63.222    <none>        8443:31770/TCP                  9m
service/neuvector-svc-admission-webhook   ClusterIP   10.107.147.58    <none>        443/TCP                         9m
service/neuvector-svc-controller          ClusterIP   None             <none>        18300/TCP,18301/TCP,18301/UDP   9m1s
service/neuvector-svc-crd-webhook         ClusterIP   10.102.113.170   <none>        443/TCP                         8m59s

NAME                                    DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/neuvector-enforcer-pod   6         6         1       6            1           <none>          9m1s

NAME                                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/neuvector-controller-pod   0/3     3            0           9m2s
deployment.apps/neuvector-manager-pod      1/1     1            1           9m2s
deployment.apps/neuvector-scanner-pod      3/3     3            3           9m2s

NAME                                                  DESIRED   CURRENT   READY   AGE
replicaset.apps/neuvector-controller-pod-685d85447d   3         3         0       9m2s
replicaset.apps/neuvector-manager-pod-8566d64b5c      1         1         1       9m2s
replicaset.apps/neuvector-scanner-pod-d79fb9b77       3         3         3       9m2s

NAME                                  SCHEDULE    SUSPEND   ACTIVE   LAST SCHEDULE   AGE
cronjob.batch/neuvector-updater-pod   0 0 * * *   False     0        <none>          9m6s

❯ k logs neuvector-controller-pod-685d85447d-gwx22 -n neuvector
2023-01-23T21:16:55|MON|/usr/local/bin/monitor starts, pid=1
2023-01-23T21:16:55|MON|Start ctrl, pid=7
2023-01-23T21:16:55|MON|Start opa, pid=8
2023-01-23T21:16:55.398|INFO|CTL|main.main: START - version=v5.1.0
2023-01-23T21:16:55.398|INFO|CTL|main.main: - join=neuvector-svc-controller.neuvector
2023-01-23T21:16:55.398|INFO|CTL|main.main: - advertise=10.1.4.223
2023-01-23T21:16:55.398|INFO|CTL|main.main: - bind=10.1.4.223
2023-01-23T21:16:55.4  |INFO|CTL|system.NewSystemTools: cgroup v1
2023-01-23T21:16:55.4  |INFO|CTL|container.Connect: - endpoint=
2023-01-23T21:16:55.4  |ERRO|CTL|main.main: Failed to initialize - error=Unknown container runtime
2023-01-23T21:16:55|MON|Process ctrl exit status 254, pid=7
2023-01-23T21:16:55|MON|Process ctrl exit with non-recoverable return code. Monitor Exit!!
Leave the cluster
{"level":"warning","msg":"OPA running with uid or gid 0. Running OPA with root privileges is not recommended.","time":"2023-01-23T21:16:55Z"}
Error leaving: Put "http://127.0.0.1:8500/v1/agent/leave": dial tcp 127.0.0.1:8500: connect: connection refused
2023-01-23T21:16:56|MON|Clean up.
❯ k logs neuvector-enforcer-pod-qr8gr -n neuvector
2023-01-23T21:16:12|MON|/usr/local/bin/monitor starts, pid=11579
net.core.somaxconn = 1024
net.unix.max_dgram_qlen = 64
Check TC kernel module ...
TC module located
2023-01-23T21:16:12|MON|Start dp, pid=11598
2023-01-23T21:16:12|MON|Start agent, pid=11599
1970-01-01T00:00:00|DEBU||dpi_dlp_init: enter
1970-01-01T00:00:00|DEBU||dpi_dlp_register_options: enter
1970-01-01T00:00:00|DEBU||net_run: enter
1970-01-01T00:00:00|DEBU|cmd|dp_ctrl_loop: enter
2023-01-23T21:16:12|DEBU|dlp|dp_bld_dlp_thr: dp bld_dlp thread starts
2023-01-23T21:16:12|DEBU|dp0|dpi_frag_init: enter
2023-01-23T21:16:12|DEBU|dp0|dpi_session_init: enter
2023-01-23T21:16:12|DEBU|dp0|dp_data_thr: dp thread starts
2023-01-23T21:16:12.774|INFO|AGT|main.main: START - version=v5.1.0
2023-01-23T21:16:12.775|INFO|AGT|main.main: - bind=10.1.5.241
2023-01-23T21:16:12.778|INFO|AGT|system.NewSystemTools: cgroup v1
2023-01-23T21:16:12.778|INFO|AGT|container.Connect: - endpoint=
2023-01-23T21:16:12.778|ERRO|AGT|main.main: Failed to initialize - error=Unknown container runtime
2023-01-23T21:16:12|MON|Process agent exit status 254, pid=11599
2023-01-23T21:16:12|MON|Process agent exit with non-recoverable return code. Monitor Exit!!
2023-01-23T21:16:12|MON|Kill dp with signal 15, pid=11598
2023-01-23T21:16:12|DEBU|dp0|dp_data_thr: dp thread exits
Leave the cluster
Error leaving: Put "http://127.0.0.1:8500/v1/agent/leave": dial tcp 127.0.0.1:8500: connect: connection refused
2023-01-23T21:16:12|MON|Clean up.

Please submit updated code so that we can rerun the validation and approve this.

Also please add the functional test job to the PR.
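In both logs the `container.Connect: - endpoint=` line is empty and is followed by `Unknown container runtime`. This suggests that the controller and enforcer could not find a container runtime socket. One possible cause, not confirmed in this thread, is that the chart's default runtime settings do not match the containerd runtime on the EKS Anywhere nodes. The Helm values override below is a sketch under that assumption. The key names and socket path are assumed to match the NeuVector core chart and should be checked against the chart version in use.

```yaml
# Hypothetical values override for the NeuVector Helm release.
# Key names and the socket path are assumptions; verify them in the
# values.yaml of the chart version being deployed.
containerd:
  enabled: true                                # mount the containerd socket into the pods
  path: /var/run/containerd/containerd.sock    # common containerd socket location
```

Running `kubectl get nodes -o wide` shows the runtime each node reports in the CONTAINER-RUNTIME column, which helps confirm which setting applies.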

rjschwei commented 1 year ago

@elamaran11 4th time is a charm. At least the deployment worked in a k3s test cluster.

W.r.t. the kustomization.yaml, is there an example of what you are looking for?

Thanks

elamaran11 commented 1 year ago

@rjschwei Please see this link for an example of a kustomization.yaml that groups HelmRelease (hr) resources: https://github.com/aws-samples/eks-anywhere-addons/tree/main/eks-anywhere-snow/Addons/Core/storage-driver
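As a rough idea of the shape such a file takes, a kustomization.yaml usually just lists the manifests in its directory so that they are applied as one group. The sketch below does not reproduce the linked storage-driver directory, and the file names are hypothetical.

```yaml
# kustomization.yaml (sketch); the resource file names are hypothetical.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - namespace.yaml        # namespace the add-on is installed into
  - helmrepository.yaml   # chart source
  - helmrelease.yaml      # the HelmRelease (hr) for the product
```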