sysflow-telemetry / sysflow

SysFlow documentation and issues tracker

Implement collection of cloud metadata #79

Closed (ghost closed 1 year ago)

ghost commented 2 years ago

Project: sf-collector

Problem statement: Currently, SysFlow provides detailed telemetry scoped to containers. This is useful, but for cloud applications, e.g. in a Kubernetes setting, it would be very helpful to understand the context in which a specific container runs, since it may be only a small part (a microservice) of a larger cloud application.

Description: It would be very attractive to collect this data together with the other SysFlow telemetry directly in the sf-collector. Ideally this would include information about

Alternative: It would be possible to query the state of a k8s cluster continuously via its API (using the 'watch' functionality) or to use the audit functionality. Both incur some overhead, in performance for the cluster as well as in handling the events and stitching them together into a useful model.
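
To make the "stitching" cost concrete, here is a minimal sketch of folding watch-style events into an in-memory model. It assumes events shaped like the Kubernetes watch stream (a `type` such as ADDED, MODIFIED, or DELETED, plus the affected `object`). The `apply_event` helper and the sample events are hypothetical illustrations and not part of SysFlow.

```python
from typing import Any, Dict, Tuple

# Key: (kind, namespace, name); value: the latest object seen for that key.
Key = Tuple[str, str, str]
ClusterModel = Dict[Key, Dict[str, Any]]


def apply_event(model: ClusterModel, event: Dict[str, Any]) -> None:
    """Fold one watch-style event into the model (hypothetical helper)."""
    obj = event["object"]
    meta = obj.get("metadata", {})
    key = (obj.get("kind", ""), meta.get("namespace", ""), meta.get("name", ""))
    if event["type"] in ("ADDED", "MODIFIED"):
        model[key] = obj
    elif event["type"] == "DELETED":
        model.pop(key, None)
    # Other event types (e.g. BOOKMARK, ERROR) are ignored in this sketch.


# Illustrative events only, not real cluster output.
sample_events = [
    {"type": "ADDED", "object": {
        "kind": "Pod",
        "metadata": {"namespace": "default", "name": "web-1"},
        "status": {"phase": "Pending"}}},
    {"type": "ADDED", "object": {
        "kind": "Service",
        "metadata": {"namespace": "default", "name": "web"}}},
    {"type": "MODIFIED", "object": {
        "kind": "Pod",
        "metadata": {"namespace": "default", "name": "web-1"},
        "status": {"phase": "Running"}}},
    {"type": "DELETED", "object": {
        "kind": "Service",
        "metadata": {"namespace": "default", "name": "web"}}},
]

model: ClusterModel = {}
for ev in sample_events:
    apply_event(model, ev)
```

Even in this reduced form, every consumer has to keep state per object and handle ordering and deletions, which is the handling overhead mentioned above.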

terylt commented 2 years ago

I'm implementing a version of the collector that supports k8s metadata here:

https://github.com/sysflow-telemetry/sf-collector/tree/k8s_pod_integration

It uses SysFlow objects generated from this branch in sf-apis:

https://github.com/sysflow-telemetry/sf-apis/tree/pod_object/avro/avdl

Specifically, there are two new objects:

https://github.com/sysflow-telemetry/sf-apis/blob/pod_object/avro/avdl/entity/pod.avdl and https://github.com/sysflow-telemetry/sf-apis/blob/pod_object/avro/avdl/event/k8sevent.avdl

I might also add a K8sNode entity object. The supported watch events will relate to the following k8s objects: K8S_NODES, K8S_NAMESPACES, K8S_PODS, K8S_REPLICATIONCONTROLLERS, K8S_SERVICES, K8S_EVENTS, K8S_REPLICASETS, K8S_DAEMONSETS, K8S_DEPLOYMENTS.

These objects will only be watched for the k8s node on which the agent is deployed.
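
As a rough illustration of node-local scoping (not the collector's actual implementation), the sketch below keeps only Pod events whose `spec.nodeName` matches the agent's node. It assumes events shaped like the Kubernetes watch stream; the `pods_on_node` helper, the node names, and the sample events are hypothetical.

```python
from typing import Any, Dict, Iterable, Iterator


def pods_on_node(events: Iterable[Dict[str, Any]],
                 node_name: str) -> Iterator[Dict[str, Any]]:
    """Yield only Pod events scheduled on node_name (hypothetical helper)."""
    for event in events:
        obj = event.get("object", {})
        if obj.get("kind") != "Pod":
            continue
        # Pods record their assigned node in spec.nodeName.
        if obj.get("spec", {}).get("nodeName") == node_name:
            yield event


# Illustrative events only, not real cluster output.
sample_events = [
    {"type": "ADDED", "object": {
        "kind": "Pod", "metadata": {"name": "web-1"},
        "spec": {"nodeName": "node-a"}}},
    {"type": "ADDED", "object": {
        "kind": "Pod", "metadata": {"name": "web-2"},
        "spec": {"nodeName": "node-b"}}},
    {"type": "ADDED", "object": {
        "kind": "Service", "metadata": {"name": "web"}}},
]

local = list(pods_on_node(sample_events, "node-a"))
local_names = [e["object"]["metadata"]["name"] for e in local]
```

When talking to the Kubernetes API directly, the same restriction for pods can usually be pushed to the server with a field selector such as `spec.nodeName=<node>`, which avoids transferring events for other nodes in the first place.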