vectordotdev / vector

A high-performance observability data pipeline.
https://vector.dev
Mozilla Public License 2.0
17.64k stars, 1.56k forks

Vector operator for Kubernetes #768

Open tlvenn opened 5 years ago

tlvenn commented 5 years ago

Hi,

Not an issue strictly with vector but hoping to spark some ideas regarding the integration with k8s. An operator similar to what https://github.com/banzaicloud/logging-operator does for fluent-bit would be pretty awesome.

binarylogic commented 5 years ago

Thanks @tlvenn, the logging-operator project is very interesting. We'll dig in and see what we can do.

LucioFranco commented 4 years ago

Hi @tlvenn, thanks for opening this. I did some research into this, and it sounds like something that would be very helpful for our users.

From what I have been reading, it looks like a Vector operator could provide a way to configure different Vector topologies running on top of Kubernetes. We would be able to define a few CRDs that let users express their Vector deployments on top of kube with kube-like configuration that integrates with kube very naturally.

Under the hood, this Vector operator could deploy a DaemonSet of Vector onto each kube node. It could then supply each agent with a config that allows Vector to run in a distributed topology. On top of this, the operator could also deploy a StatefulSet that acts as a central Vector agent: the node agents connect to it, and it is in charge of providing a durable disk buffer and a way to output logs to some external service, with TLS configured all the way. Beyond that, the operator could provide a way to configure an entire log transform pipeline within the same cluster, with Kafka configured as an intermediate message broker. This would then provide the stream-based topology setup.

An example of a simple deployment that fluentbit's logging operator does can be seen here. This configures fluentbit to write to s3 and shows that you can set up logging very easily and naturally.

Since this relates to how Vector is deployed, I think this might not belong in the current initial containers milestone. That said, I do think this could be extremely valuable to allow users to quickly get up and running with more complex vector deployments.

tlvenn commented 4 years ago

Yep, that's pretty much it, and the possibilities are endless. Beyond CRDs to describe the entire desired Vector pipeline, the operator could also react to annotations on a container to inject logging Vector sidecars, but yes, you definitely captured the gist of it.

This particular diagram best illustrates the whole idea: image

tlvenn commented 4 years ago

Note that the operator approach also paves the way for some IoC, where the application bundles custom CRDs to declare its logging streams and how they should be routed, much like the ServiceMonitor CRD with the Prometheus operator.

tlvenn commented 4 years ago

Hi @binarylogic, I was wondering where your current thinking is regarding the idea of having a K8s operator to orchestrate/deploy Vector?

binarylogic commented 4 years ago

Hi @tlvenn, @MOZGIII should be able to answer that since he is heading up our k8s integration currently.

MOZGIII commented 4 years ago

Hey @tlvenn! I'm currently working on a new k8s integration outlined in this RFC: https://github.com/timberio/vector/blob/master/rfcs/2020-04-04-2221-kubernetes-integration.md We don't yet have plans in the backlog to implement a Vector operator, but we're working on the building blocks to enable that option for us in the future.

The scope of the Vector operator would be to use CRDs to describe and roll out complete logging topologies, i.e. deploy Vector for pod log collection as a DaemonSet on each node, and deploy another Vector as a Deployment to gather all the log streams from the whole cluster and do advanced processing. Another responsibility of the operator would be, effectively, to assemble the Vector configuration from the CRDs (as opposed to user-supplied .toml configs). All that is very advanced functionality, and we're building the k8s integration with that in mind. We're currently working on the prerequisites to make it possible.
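To make the "assemble Vector configuration from the CRDs" idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the resource shape (`metadata.namespace`, `metadata.name`, and a `spec` with `sources`, `transforms`, `sinks`) is hypothetical and not an existing CRD, and the ID-prefixing scheme is one possible way to avoid name collisions between pipelines. The result is emitted as JSON, which Vector accepts alongside TOML and YAML.

```python
import json


def component_id(namespace, pipeline, name):
    """Build a cluster-unique component ID from a pipeline-local name (hypothetical scheme)."""
    return f"{namespace}-{pipeline}-{name}"


def assemble_config(pipelines):
    """Merge hypothetical pipeline resources into a single Vector config dict."""
    config = {"sources": {}, "transforms": {}, "sinks": {}}
    for resource in pipelines:
        ns = resource["metadata"]["namespace"]
        pipeline = resource["metadata"]["name"]
        spec = resource["spec"]
        for section in ("sources", "transforms", "sinks"):
            for name, body in spec.get(section, {}).items():
                rendered = dict(body)  # shallow copy so the input resource is not mutated
                if "inputs" in rendered:
                    # Rewrite pipeline-local references to the prefixed IDs.
                    rendered["inputs"] = [
                        component_id(ns, pipeline, ref) for ref in rendered["inputs"]
                    ]
                config[section][component_id(ns, pipeline, name)] = rendered
    return config


# Hypothetical resource, as an operator might receive it from the Kubernetes API.
example = {
    "metadata": {"namespace": "team-a", "name": "app-logs"},
    "spec": {
        "sources": {"pods": {"type": "kubernetes_logs"}},
        "transforms": {
            "parse": {
                "type": "remap",
                "inputs": ["pods"],
                "source": ". = parse_json!(.message)",
            }
        },
        "sinks": {
            "out": {
                "type": "console",
                "inputs": ["parse"],
                "encoding": {"codec": "json"},
            }
        },
    },
}

if __name__ == "__main__":
    print(json.dumps(assemble_config([example]), indent=2))
```

A real operator would additionally validate each resource and roll the generated config out to the agents; both are left out here.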

Furthermore, instead of building our own operator, we might be able to work with https://github.com/banzaicloud/logging-operator to add Vector support there. We'll look into it after we complete the initial integration with k8s.

I'm personally looking forward to the scenario where apps can ship with their own logging configurations (i.e. transform specifications) as CRDs, and an operator dynamically configures Vector to process and ship the logs accordingly.

tlvenn commented 4 years ago

Awesome, thanks a lot for the feedback. Looking forward to all of this as well!

aelbarkani commented 3 years ago

@binarylogic @MOZGIII @tlvenn I can start working on it to have a first version by the end of September.

jszwedko commented 3 years ago

cc/ @spencergilbert

spencergilbert commented 3 years ago

Hey @aelbarkani - I'm curious what you're looking to get out of an operator here. Is it going to solve a problem that the existing helm charts aren't?

aelbarkani commented 3 years ago

Hey @spencergilbert. Banzaicloud's logging operator is a good example. We run multi-tenant clusters where each application team can have isolated namespaces in a cluster. Application teams have limited access to the cluster (they have access to neither cluster-wide resources nor privileged containers). And usually they don't care: the only thing the application teams want is to be admin in their namespaces and leave the cluster management hassle to the infrastructure team. In a Kubernetes cluster, Vector agents are privileged, and thus end users should not have access to them. However, the users would need some sort of API (a Kubernetes Custom Resource) in order to request sources, transforms and sinks. Basically, the idea is to be able to offer our users a fully managed Vector instance in multi-tenant Kubernetes clusters, and I don't think that would be possible with a Helm chart.
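One way an operator could enforce that isolation is to rewrite every tenant-requested source so it can only see the tenant's own namespace. The sketch below is an assumption about how this might look, not how any existing operator does it: the function name and the allow-list are hypothetical, and it relies on the `extra_field_selector` option of Vector's `kubernetes_logs` source to restrict collection to pods in one namespace.

```python
# Source types a tenant may request; anything else is rejected (hypothetical policy).
ALLOWED_TENANT_SOURCES = {"kubernetes_logs"}


def scope_source(namespace, source):
    """Return a copy of a tenant-requested source, restricted to the tenant's namespace."""
    source_type = source.get("type")
    if source_type not in ALLOWED_TENANT_SOURCES:
        raise ValueError(f"source type {source_type!r} is not allowed for tenants")
    scoped = dict(source)
    # Discard any tenant-supplied selector and pin collection to the tenant's namespace.
    scoped["extra_field_selector"] = f"metadata.namespace={namespace}"
    return scoped


if __name__ == "__main__":
    requested = {
        "type": "kubernetes_logs",
        "extra_field_selector": "metadata.namespace=kube-system",
    }
    print(scope_source("team-a", requested))
```

The privileged agents stay under the infrastructure team's control, while tenants only submit resources that pass through a check like this one.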

spencergilbert commented 3 years ago

@aelbarkani - would you be interested in having a call with us to discuss your plans and thoughts around the operator more fully? If you do, you can email me at spencer.gilbert (at) datadoghq.com

tlvenn commented 3 years ago

Hi @aelbarkani, please take a moment to read the material and articles linked above; the value props of the Vector operator should be pretty apparent.

aelbarkani commented 3 years ago

Hi @tlvenn. I must say that the value proposition of an operator is not always straightforward. In this case it is, and I can say that since we've been using Banzaicloud's operator in dozens of clusters in production and have developed many operators for our clients. Now, my comment was about one use case (the one that I'm interested in), but of course there are many others. Don't hesitate to explain your use cases a little more.

aelbarkani commented 3 years ago

@spencergilbert yep, just sent you an email !

tlvenn commented 3 years ago

D'oh, my comment was for @spencergilbert; I made a mistake mentioning you :facepalm:

aelbarkani commented 3 years ago

@tlvenn no worries !

tmckeon commented 2 years ago

Hi All

I would just like to add my support for this issue. It would be great if dev teams could configure vector transforms by themselves.

xinbinhuang commented 1 year ago

Hi all,

I want to point out that there is another use case: better integrating with, or replacing, Prometheus for metrics collection. Currently, a lot of Helm charts ship with a PodMonitor/ServiceMonitor to allow scraping by Prometheus. However, Vector can't use these directly. It would be awesome if the Vector operator could watch existing PodMonitor/ServiceMonitor resources and scrape metrics when configured to do so.
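A rough Python sketch of what such a translation could look like follows. It is a simplification under stated assumptions: the function names are hypothetical, and only `selector.matchLabels` and the `port`, `path` and `scheme` fields of `podMetricsEndpoints` are handled (no `matchExpressions`, `namespaceSelector`, `interval` or relabelings). Pods are assumed to be plain dicts as returned by the Kubernetes API, and the output is a Vector `prometheus_scrape` source.

```python
def matches(selector, labels):
    """True if every matchLabels entry is present in the pod's labels."""
    return all(labels.get(k) == v for k, v in selector.get("matchLabels", {}).items())


def port_number(pod, port_name):
    """Resolve a named container port to its number, or None if the pod has no such port."""
    for container in pod["spec"]["containers"]:
        for port in container.get("ports", []):
            if port.get("name") == port_name:
                return port["containerPort"]
    return None


def scrape_source(pod_monitor, pods):
    """Build a prometheus_scrape source from a PodMonitor and the current pod list."""
    spec = pod_monitor["spec"]
    selector = spec.get("selector", {})
    endpoints = []
    for pod in pods:
        if not matches(selector, pod["metadata"].get("labels", {})):
            continue
        ip = pod.get("status", {}).get("podIP")
        if ip is None:
            continue  # pod has not been assigned an IP yet
        for endpoint in spec.get("podMetricsEndpoints", []):
            port = port_number(pod, endpoint.get("port"))
            if port is None:
                continue
            scheme = endpoint.get("scheme", "http")
            path = endpoint.get("path", "/metrics")
            endpoints.append(f"{scheme}://{ip}:{port}{path}")
    return {"type": "prometheus_scrape", "endpoints": endpoints}
```

Because pod IPs change, an operator would have to regenerate this source whenever matching pods come and go, which is the part Prometheus handles natively through service discovery.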

zvlb commented 1 year ago

Hi. Please try this one - https://github.com/kaasops/vector-operator We released it for deploying and configuring Vector in Kubernetes. (Like how Logging Operator does it, but with some differences).

You can use these CRDs:
- Vector: deploys a Vector instance
- VectorPipeline: deploys sources/transforms/sinks in namespace scope
- ClusterVectorPipeline: deploys sources/transforms/sinks in cluster scope

nabokihms commented 1 year ago

Deckhouse Kubernetes Platform has a log-shipper module, which is basically an operator that is built around vector.

Simple configuration example:

apiVersion: deckhouse.io/v1alpha1
kind: ClusterLoggingConfig
metadata:
  name: system-logs
spec:
  type: KubernetesPods
  kubernetesPods:
    namespaceSelector:
      matchNames:
        - kube-system
  destinationRefs:
    - loki-storage
---
apiVersion: deckhouse.io/v1alpha1
kind: ClusterLogDestination
metadata:
  name: loki-storage
spec:
  type: Loki
  loki:
    endpoint: http://loki.loki:3100