elastic / cloud-on-k8s

Elastic Cloud on Kubernetes

Beat add spec for external service #3866

Open andelhie opened 3 years ago

andelhie commented 3 years ago

Proposal

Make Beat able to expose a Service that allows external collection of data, by adding a spec for a LoadBalancer service.

Use case. Why is this important?

Let's say I am using ECK as a platform to collect many telemetry points from my network. To ingest, say, NetFlow, we would need a Service with type LoadBalancer configured so external devices can feed NetFlow data to ECK.

You can do this manually, but it would be nice to have it as a spec.

david-kow commented 3 years ago

Hi @andelhie, thanks for your suggestion.

To make sure I understand correctly, you would like Beat to have something similar to what ES has, like below:

```yaml
apiVersion: beat.k8s.elastic.co/v1beta1
...
spec:
  service:
    spec:
      type: LoadBalancer
```

This would create a Service of the provided type with all Pods of that Beat as targets. You would then be able to use the IP of this Service from outside the k8s cluster. Is this description correct?

I see that, at least for Filebeat, there are a number of inputs that support pushing data to the Beat: HTTP Endpoint, HTTP JSON, NetFlow, Syslog, TCP, UDP.
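As an illustration, a Beat resource with a NetFlow input listening on a UDP port might look like the sketch below. This is an assumption about how such a setup would be written, not a tested configuration; the name, version, `elasticsearchRef`, and port are placeholders:

```yaml
apiVersion: beat.k8s.elastic.co/v1beta1
kind: Beat
metadata:
  name: netflow
spec:
  type: filebeat
  version: 8.14.0            # placeholder version
  elasticsearchRef:
    name: elasticsearch      # placeholder cluster name
  config:
    filebeat.inputs:
    - type: netflow
      host: "0.0.0.0:2055"   # listen on the standard NetFlow port
      protocols: [v5, v9, ipfix]
  deployment:
    replicas: 1
```

A Service (the subject of this issue) would then have to target the Pods of this Beat on port 2055/UDP.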

andelhie commented 3 years ago

That is exactly right. It would allow the collection of all these items. I am looking for Syslog and NetFlow, but the rest would be great too, so you can collect data from outside the k8s cluster, extending the use cases.

pebrc commented 3 years ago

> I see that, at least for Filebeat, there are a number of inputs that support pushing data to the Beat: HTTP Endpoint, HTTP JSON, NetFlow, Syslog, TCP, UDP.

HTTP JSON seems to pull from a given URL, but does not offer a listener.

mscbpi commented 3 years ago

I support this feature request. The Filebeat example is based on the assumption that ECK will only deal with k8s workloads/logs. While k8s, the operator, and CRDs can be used to deploy and host Elastic, the service it provides can be part of a global k8s/non-k8s infrastructure.

At the moment we expose Elasticsearch with a LoadBalancer (supported in the spec), and we deployed a Logstash instance, also with a LoadBalancer service, for ingesting data that cannot be sent directly to Elasticsearch.

mscbpi commented 3 years ago

@andelhie I managed to get Palo Alto logs with the corresponding module enabled in an ECK Beat. The Service has to be set up apart from ECK at the moment, but it would indeed be nice to have it integrated.

https://discuss.elastic.co/t/syslog-to-eck/270010

poochwashere commented 3 years ago

+1 for this feature. I have a use case to collect syslogs from on-prem network devices into my k8s ELK stack.

gittihub123 commented 1 year ago

Hi, I'm stuck with this use case: I would like to deploy an elastic-agent that is managed by Fleet, but without success.

Can I have some examples, documentation or anything else that can help me implement elastic-agent so we can receive syslog from external sources?

Thanks.

mscbpi commented 1 year ago

Run Logstash in addition to ECK, with a LoadBalancer service on the syslog port and a connection to your ECK instance. Then you can send whatever you want through Logstash to Elasticsearch running on k8s with ECK.

As for Beats, deploy them with the ECK operator as documented, and add a LoadBalancer service manually to expose them to external sources. Configure the LoadBalancer to match the port the Beat is listening on.
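Such a manual Service might be sketched as follows. The selector assumes the labels the ECK operator applies to a Beat's Pods, and the Beat name `netflow` and the port are placeholders, so verify them against your deployment:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: netflow-beat-lb
spec:
  type: LoadBalancer
  selector:
    # labels assumed to be applied by the ECK operator to the Beat's Pods
    common.k8s.elastic.co/type: beat
    beat.k8s.elastic.co/name: netflow
  ports:
  - name: netflow
    protocol: UDP
    port: 2055        # port external devices send to
    targetPort: 2055  # port the Beat input listens on
```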

The latter would be great if it were included in the Beat spec of the ECK operator. The LoadBalancer for Elasticsearch is created by the operator, so the Beat ones could be too.

gittihub123 commented 1 year ago

Hi @mscbpi, I want to be able to run elastic-agents that are connected to Fleet and managed centrally. I want to receive syslog on the elastic-agents, since they have all the integrations I need.

Do you have any example configurations for this setup? I always see the standard setups from the examples, but none that show how to expose elastic-agent and make it receive syslog from external sources.

Thank you.

mscbpi commented 1 year ago

I don't know Elastic Agent specifically, but I can see there is a doc to deploy it with ECK; it is part of the API: https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-elastic-agent-configuration.html

If the API does not include a LB/service spec to configure (just as with Beats), proceed with the same manual approach as for Beats.

I am answering from GitHub mobile so I can't detail the config, but I'll review elastic-agent and your setup tomorrow (I don't know the product so it will be interesting in any case).

mscbpi commented 1 year ago

@gittihub123 I will try to set it up, since I have a use case for this part of the stack as well. There is an Elastic Agent example on ECK targeting the system and k8s logging of the cluster itself: https://www.elastic.co/guide/en/cloud-on-k8s/1.4/k8s-elastic-agent-configuration-examples.html

However, same as with Beats, to achieve what we want, I would simply replace the proposed input config with the input I need, and that is what will be exposed.

Then, to expose it, we create a LoadBalancer Service matching the exposed port of the Pod, making it available from external sources. The point of this issue is that it would be nice to trigger this creation from the spec of the Elastic Agent kind: Agent; it is possible for kind: Elasticsearch itself, but not for Beats or Elastic Agents.
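For an Agent, the manually created Service might look like the sketch below. The selector labels, the Agent name `syslog-agent`, and the listener port are assumptions about what the ECK operator applies to Agent Pods and how the Agent policy is configured, so check them in your cluster:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: syslog-agent-lb
spec:
  type: LoadBalancer
  selector:
    # labels assumed to be applied by the ECK operator to Agent Pods
    common.k8s.elastic.co/type: agent
    agent.k8s.elastic.co/name: syslog-agent
  ports:
  - name: syslog
    protocol: UDP
    port: 514         # standard syslog port exposed externally
    targetPort: 9004  # assumed port the Agent's syslog integration listens on
```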

gittihub123 commented 11 months ago

Hi @mscbpi

I'm able to collect syslog from external sources (network devices etc.), but we have issues when we want to send TCP/HTTPS data to the elastic-agents that are deployed as Pods in the OpenShift 4 cluster.

I created a Service and tried to expose the Pods with an OpenShift Route, but we are not able to make it work.

  1. The first problem is the certificate chain verification, since we are using a self-signed cert from our CA. We want to verify certificates before allowing the traffic. The elastic-agents deployed in OpenShift use certs signed by internal resources, so I don't think this is possible? Fleet uses a self-signed cert from our CA, but is that forwarded to the elastic-agents as well, so they can collect data from external sources?
  2. We turned off the "verify certificate chain" option, but still no traffic is forwarded to the elastic-agent Pods. I created a Service that binds to the elastic-agent Pods on port 8989, to collect VMware vSphere data on TCP/8989. Then I created an OpenShift Route to establish the HTTPS connection, but still no data shows up in the elastic-agent or Elasticsearch.
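One thing worth noting here: OpenShift Routes only carry HTTP(S) traffic, or TLS traffic routed by SNI; raw TCP on an arbitrary port generally needs a LoadBalancer or NodePort Service instead. If the external client does speak TLS with SNI, a passthrough Route is one option that also sidesteps the certificate problem, since the Agent's own certificate is presented end to end rather than the router's. A hedged sketch, where the Service name `agent-vsphere-svc` is a placeholder for a Service targeting the Agent Pods on 8989:

```yaml
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: agent-vsphere
spec:
  to:
    kind: Service
    name: agent-vsphere-svc  # assumed Service selecting the Agent Pods
  port:
    targetPort: 8989
  tls:
    termination: passthrough # TLS is terminated by the Agent itself, not the router
```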

I think the only option to collect TCP/HTTPS data with Elastic Agent is to run the elastic-agents outside of the cluster, as VMs or on physical hardware.

Any input?

Thanks.