cncf / cnf-testbed

ARCHIVED: 🧪🛏️Cloud-native Network Function (CNF) Testbed --> See LFN Cloud Native Telecom Initiative https://wiki.lfnetworking.org/pages/viewpage.action?pageId=113213592
Apache License 2.0

auditlogs for CNF testbed runs #201

Closed hh closed 5 years ago

hh commented 5 years ago
In order to get visibility into the stability
And API usage of core/beta/alpha k8s APIs for CNFs
As a cnf-testbed developer
I would like to generate audit-logs for loading into APISnoop

Given a CNF testbed run
When the run is complete
Then I have a copy of the auditlog

Do we store the results in a GCS bucket? Is the testbed only run every once in a while, or do we have a CI job in place? If not, could I get a copy of the audit logs for the next CNF testbed runs?

taylor commented 5 years ago

Hi, @hh.

Audit logs are not saved. The testbed is being deployed on demand only. CI deploys and benchmark tests will be a future item we will implement.

Which specific commands, logs, and format do you want?

If no changes to the existing clusters or deployment code are required, then as a one-time action I'll look into getting audit logs for you when we deploy a new testbed and run a set of tests.

If changes are needed, please list them and we can prioritize them with other items.

hh commented 5 years ago

There are two approaches.

I'll take a peek at the deployment code and see if I can suggest that first, rather than running the dynamic audit-webhook destination service.

Adding a config file and two kube-apiserver flags

From the log backend docs (https://kubernetes.io/docs/tasks/debug-application-cluster/audit/#log-backend), the two flags are:

--audit-log-path /path/on/each/apiserver/audit.log --audit-policy-file /path/to/a/policy-file.yaml

You would then need to retrieve the audit.log from each master/apiserver node.
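As a rough illustration of the log-backend approach, the sketch below writes a minimal audit policy and prints the two flags. The catch-all `Metadata` rule is an assumption chosen for brevity, not the policy APISnoop asked for, and the paths are the placeholders from the comment above. The policy is emitted as JSON on the assumption that the apiserver's YAML loader accepts it, since JSON is a subset of YAML.

```python
import json

# Minimal audit policy (assumed for illustration): log request metadata
# for every request via a single catch-all rule.
AUDIT_POLICY = {
    "apiVersion": "audit.k8s.io/v1",
    "kind": "Policy",
    "rules": [{"level": "Metadata"}],
}


def render_policy(policy=AUDIT_POLICY):
    """Serialize the policy as JSON text ending in a newline."""
    return json.dumps(policy, indent=2) + "\n"


def apiserver_audit_flags(log_path, policy_path):
    """Return the two log-backend flags for the kube-apiserver command line."""
    return [
        "--audit-log-path=" + log_path,
        "--audit-policy-file=" + policy_path,
    ]


if __name__ == "__main__":
    # Placeholder paths; each apiserver node needs its own copy of the policy.
    print(render_policy(), end="")
    print(" ".join(apiserver_audit_flags(
        "/path/on/each/apiserver/audit.log",
        "/path/to/a/policy-file.yaml",
    )))
```

A `Metadata`-level policy keeps the log small while still recording which API group, version, resource, and verb each request used.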

Running / Configuring a Dynamic Audit webhook-backend and corresponding audit-sink

The dynamic backend is configured via kubectl once the cluster is running, but before the applications we are interested in are loaded onto the cluster.

https://kubernetes.io/docs/tasks/debug-application-cluster/audit/#webhook-backend https://kubernetes.io/docs/tasks/debug-application-cluster/audit/#dynamic-backend

This requires running a backend that consumes / records the audit logs.

We wrote a small example server for this early on in APISnoop development and are looking at creating something simpler to deploy soon:

https://github.com/cncf/apisnoop/tree/master/dev/kubeadm#kubeadm-development-for-audit-webhook
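To give a sense of what such a backend involves, here is a minimal sketch of a receiver that accepts POSTed audit batches and appends each event to a file. It is not the APISnoop example server linked above. It assumes the webhook sends a JSON `EventList` with an `items` array. The output file name, port, and the `make_server` helper are hypothetical, and TLS and authentication are omitted, although a real sink would likely need them.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class AuditSinkHandler(BaseHTTPRequestHandler):
    """Append every event from a POSTed audit batch to a JSON-lines file."""

    log_path = "audit-events.jsonl"  # hypothetical output location
    lock = threading.Lock()

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            batch = json.loads(self.rfile.read(length))
            events = batch.get("items") or []
        except (ValueError, AttributeError):
            # Body was not JSON, or not a JSON object.
            self.send_response(400)
            self.end_headers()
            return
        with self.lock, open(self.log_path, "a") as out:
            for event in events:
                out.write(json.dumps(event) + "\n")
        self.send_response(200)
        self.end_headers()

    def log_message(self, fmt, *args):
        pass  # suppress per-request console logging


def make_server(host="0.0.0.0", port=8080, log_path="audit-events.jsonl"):
    """Build a server whose handler writes to log_path (hypothetical helper)."""
    handler = type("Handler", (AuditSinkHandler,), {"log_path": log_path})
    return HTTPServer((host, port), handler)
```

Calling `make_server().serve_forever()` would start the receiver. The resulting file has one event per line, the same shape as the log backend's default JSON output.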

hh commented 5 years ago

I created https://github.com/cncf/cnf-testbed/issues/201 to track setting up something simple that would work as a dynamic backend deployed to the cluster itself.

Since dynamic auditing is an alpha feature in 1.13, I suspect we'll have to wait until you are using 1.13, and even then we'll need to enable the appropriate feature flag.
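For reference, a sketch of what registering a dynamic backend might look like on 1.13. The apiserver flags and the `AuditSink` shape follow the alpha documentation of that era as best I recall it, so treat them as assumptions to verify against the docs for your version. The sink name and URL are hypothetical.

```python
import json

# Flags believed to be needed to enable the alpha dynamic audit backend
# in 1.13 (assumption: verify against the docs for the deployed version).
DYNAMIC_AUDIT_FLAGS = [
    "--audit-dynamic-configuration",
    "--feature-gates=DynamicAuditing=true",
    "--runtime-config=auditregistration.k8s.io/v1alpha1=true",
]


def audit_sink_manifest(name, url, level="Metadata"):
    """Build an AuditSink object pointing at a webhook URL."""
    return {
        "apiVersion": "auditregistration.k8s.io/v1alpha1",
        "kind": "AuditSink",
        "metadata": {"name": name},
        "spec": {
            "policy": {"level": level, "stages": ["ResponseComplete"]},
            "webhook": {"clientConfig": {"url": url}},
        },
    }


if __name__ == "__main__":
    # Hypothetical sink name and in-cluster URL.
    manifest = audit_sink_manifest("apisnoop-sink", "https://audit-sink.example/events")
    print(json.dumps(manifest, indent=2))
```

The printed JSON could be piped to `kubectl apply -f -` after the cluster is up and before the workloads of interest are deployed.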

hh commented 5 years ago

Some notes from conversations w/ @taylor

hh commented 5 years ago

@wavell you wanted to pair on something last we spoke. Would this be a good item to pick up and try?

hh commented 5 years ago

I tested that this PR updates the cross-cloud provisioner to create audit logs on the apiservers: https://github.com/crosscloudci/cross-cloud/pull/187

If this looks good to you, would you be willing to run this on your next CNF Testbed deploy?

I'd like to load the resulting apiserver audit-logs into APISnoop to ensure that the APIs used by the CNF Testbed get some visibility within the Conformance WG on Thursday.
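To make the intended visibility concrete, the sketch below tallies API usage from an apiserver audit log. It is not how APISnoop itself processes logs. It only assumes the log backend's JSON-lines output, where each event carries `stage`, `verb`, and an `objectRef` with `apiGroup`, `apiVersion`, and `resource`. The function name is hypothetical.

```python
import json
from collections import Counter


def summarize_audit_log(lines):
    """Count completed requests per (group/version, resource, verb)."""
    usage = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except ValueError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(event, dict):
            continue
        # Each request is logged at several stages; count it only once.
        if event.get("stage") != "ResponseComplete":
            continue
        ref = event.get("objectRef")
        if not ref:
            continue  # non-resource requests such as /healthz
        group = ref.get("apiGroup") or "core"
        version = ref.get("apiVersion", "unknown")
        key = (
            group + "/" + version,
            ref.get("resource", "unknown"),
            event.get("verb", "unknown"),
        )
        usage[key] += 1
    return usage


if __name__ == "__main__":
    import sys

    # Usage: python summarize.py /path/to/audit.log
    with open(sys.argv[1]) as fh:
        for (api, resource, verb), count in summarize_audit_log(fh).most_common():
            print(count, api, resource, verb)
```

Grouping by version makes it easy to see at a glance which alpha or beta APIs a testbed run touched.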

taylor commented 5 years ago

Merged https://github.com/crosscloudci/cross-cloud/pull/187. That started an automatic build and push of the image to the registry: https://github.com/crosscloudci/build/blob/master/.gitlab-ci.yml#L4 - https://gitlab.cidev.cncf.ci/cncf/build/-/jobs/146915.

The CNF Testbed K8s deploy scripts use the master container image https://github.com/cncf/cnf-testbed/blob/master/tools/deploy_k8s_cluster.sh#L47 so new deploys will have those audit logs available.

@hh, we are not planning any CNF Testbed deploy this week. Feel free to follow the k8s deploy steps to deploy a k8s cluster on Packet. Additional details from a demo at ONS are in https://github.com/cncf/cnf-testbed/tree/master/comparison/ons_sj_2019

hh commented 5 years ago

/cc @devaii

taylor commented 5 years ago

@hh @devaii

docker run -ti registry.cidev.cncf.ci/cncf/cross-cloud/provisioning:master ls /cncf/master_templates-v1.13.0-ubuntu/audit-policy.yaml

shows

/cncf/master_templates-v1.13.0-ubuntu/audit-policy.yaml

devaips commented 5 years ago

Following the instructions in k8s_deploy.md succeeds in setting up packet boxes, and provisioning the k8s cluster.

When it gets to the k8s vpp vswitch installer it fails on the third task in the ansible-playbook:

PLAY [localhost] *************************************************************************************************************************

TASK [Gathering Facts] *******************************************************************************************************************
ok: [localhost]

TASK [packet_l2 : Show vlan data before create] ******************************************************************************************
ok: [localhost] => {
    "vlans": {
        "vlan1": {
            "interface": "eth1"
        }, 
        "vlan2": {
            "interface": "eth1"
        }
    }
}

TASK [packet_l2 : Create or find a Packet VLAN] ******************************************************************************************
failed: [localhost] (item={'value': {u'interface': u'eth1'}, 'key': u'vlan1'}) => {"changed": true, "cmd": ["ruby", "/packet_api/l2_packet_networking.rb", "--create-vlan=vlan1", "--project-name=CNCF CNFs", "--packet-url=api.packet.net", "--facility=ewr1"], "delta": "0:00:02.338618", "end": "2019-05-01 22:55:11.567218", "item": {"key": "vlan1", "value": {"interface": "eth1"}}, "msg": "non-zero return code", "rc": 1, "start": "2019-05-01 22:55:09.228600", "stderr": "/packet_api/l2_packet_networking.rb:263:in `<main>': undefined method `find' for nil:NilClass (NoMethodError)", "stderr_lines": ["/packet_api/l2_packet_networking.rb:263:in `<main>': undefined method `find' for nil:NilClass (NoMethodError)"], "stdout": "", "stdout_lines": []}
failed: [localhost] (item={'value': {u'interface': u'eth1'}, 'key': u'vlan2'}) => {"changed": true, "cmd": ["ruby", "/packet_api/l2_packet_networking.rb", "--create-vlan=vlan2", "--project-name=CNCF CNFs", "--packet-url=api.packet.net", "--facility=ewr1"], "delta": "0:00:01.752388", "end": "2019-05-01 22:55:13.414363", "item": {"key": "vlan2", "value": {"interface": "eth1"}}, "msg": "non-zero return code", "rc": 1, "start": "2019-05-01 22:55:11.661975", "stderr": "/packet_api/l2_packet_networking.rb:263:in `<main>': undefined method `find' for nil:NilClass (NoMethodError)", "stderr_lines": ["/packet_api/l2_packet_networking.rb:263:in `<main>': undefined method `find' for nil:NilClass (NoMethodError)"], "stdout": "", "stdout_lines": []}

PLAY RECAP *******************************************************************************************************************************
localhost                  : ok=2    changed=0    unreachable=0    failed=1   

0 minutes and 9 seconds elapsed - VPP vSwitch Deploy.

It looks like the playbook is using default variables, e.g. --project-name=CNCF CNFs, but even when I hardcode the correct values, it results in the same error.

Additional info: The host which I'm running the provisioning from has its DNS resolver set to 147.75.69.23, as required.

devaips commented 5 years ago

Correction: hardcoding the values does get past that error after all.

linkous8 commented 5 years ago

Project name should be settable with the PACKET_PROJECT_NAME environment variable here: https://github.com/cncf/cnf-testbed/blob/102ca9a0eded700dc67254390316faa3d82ac30b/tools/k8s-cluster.env.example#L17

devaips commented 5 years ago

Thanks @linkous8, I had not yet seen the update to the k8s-cluster.env.example.

With the addition of PACKET_PROJECT_NAME and the rename of FACILITY to PACKET_FACILITY, things are working.
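A small preflight check along these lines could catch the missing settings before the playbook reaches the VLAN task. This is only a sketch: the two variable names come from this thread, while the `missing_settings` helper and the decision to treat blank values as missing are assumptions.

```python
import os

# Variable names mentioned in this thread; the full list in
# k8s-cluster.env.example may be longer.
REQUIRED_VARS = ("PACKET_PROJECT_NAME", "PACKET_FACILITY")


def missing_settings(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        raise SystemExit("Missing settings: " + ", ".join(missing))
    print("All required settings present.")
```

Running it after sourcing the env file gives a readable message instead of the NoMethodError seen in the log above.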