elastic / beats

:tropical_fish: Beats - Lightweight shippers for Elasticsearch & Logstash
https://www.elastic.co/products/beats

[Elastic Agent] Prototype a shipper gRPC input for Filebeat #35135

Closed cmacknz closed 1 year ago

cmacknz commented 1 year ago

Create a prototype input for Filebeat that implements the shipper API. The intent is to implement the data shipper as another instance of Filebeat so that we can completely reuse the existing Beat event pipeline and outputs.

image

The shipper input will be Elastic licensed and should be added to the https://github.com/elastic/beats/tree/main/x-pack/filebeat/input directory.

The shipper input should ideally start a single gRPC server and create a new beat.Client for each unique data stream. This will allow each data stream to have its own processor configuration and allows for shared processor pipelines to execute concurrently. This follows the pattern used in the AWS S3 input in https://github.com/elastic/beats/pull/33658.
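
As a rough illustration of the "one client per unique data stream" idea, here is a minimal Go sketch. It does not use the real Beats API: `Publisher` is a hypothetical stand-in for `beat.Client`, `ConnectFunc` is a hypothetical stand-in for connecting a client to the pipeline with that data stream's processors, and `ClientRegistry` is an invented name. It only shows lazy, concurrency-safe creation of one client per data stream key.

```go
package main

import (
	"fmt"
	"sync"
)

// Publisher is a hypothetical stand-in for beat.Client (not the real interface).
type Publisher interface {
	Publish(event string)
}

// ConnectFunc is a hypothetical stand-in for connecting a new client to the
// pipeline using the processor configuration of a single data stream.
type ConnectFunc func(dataStream string) (Publisher, error)

// ClientRegistry lazily creates and caches one Publisher per data stream.
type ClientRegistry struct {
	mu      sync.Mutex
	connect ConnectFunc
	clients map[string]Publisher
}

// NewClientRegistry returns an empty registry that uses connect to build clients.
func NewClientRegistry(connect ConnectFunc) *ClientRegistry {
	return &ClientRegistry{connect: connect, clients: make(map[string]Publisher)}
}

// ClientFor returns the cached client for dataStream, creating it on first use.
func (r *ClientRegistry) ClientFor(dataStream string) (Publisher, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if c, ok := r.clients[dataStream]; ok {
		return c, nil
	}
	c, err := r.connect(dataStream)
	if err != nil {
		return nil, fmt.Errorf("connecting client for %q: %w", dataStream, err)
	}
	r.clients[dataStream] = c
	return c, nil
}

// Len reports how many distinct clients have been created so far.
func (r *ClientRegistry) Len() int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return len(r.clients)
}

// printPublisher is a toy Publisher that prints events with their data stream.
type printPublisher struct{ stream string }

func (p printPublisher) Publish(event string) { fmt.Printf("[%s] %s\n", p.stream, event) }

func main() {
	reg := NewClientRegistry(func(ds string) (Publisher, error) {
		return printPublisher{stream: ds}, nil
	})
	events := []struct{ stream, msg string }{
		{"logs-nginx", "GET /"},
		{"logs-system", "sshd started"},
		{"logs-nginx", "GET /health"},
	}
	for _, ev := range events {
		c, err := reg.ClientFor(ev.stream)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		c.Publish(ev.msg)
	}
	fmt.Println("distinct clients:", reg.Len())
}
```

The registry holds its lock while connecting, which keeps the sketch simple but serializes client creation; a real input would probably also need to close clients when a data stream is removed.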

We can explore other approaches if the ideal approach is initially too complex. The path of least resistance will likely be to have one instance of the shipper input for each input unit received from the Elastic Agent, where there is an input unit per component connected to the shipper. This will result in multiple gRPC server instances, but given that the shipper gRPC communication uses Unix domain sockets / named pipes this may be an acceptable overhead.

Acceptance Criteria

elasticmachine commented 1 year ago

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

cmacknz commented 1 year ago

The ideal outcome here is that a system or integration test exists proving that a Filebeat instance running the shipper output can communicate with a shipper instance running the shipper input.

fearful-symmetry commented 1 year ago

I'm currently tinkering with Filebeat to see how we can do this. There are a few open questions:

cmacknz commented 1 year ago

This is where the agent creates the shipper input configurations now: https://github.com/elastic/elastic-agent/blob/eaa9d0ac42986259e153482170e077ea8e191c5a/pkg/component/runtime/manager_shipper.go#L37-L51

cmacknz commented 1 year ago

We spoke today about how the design should work, and I've updated the diagram in the description to match.

image

Summary of the implementation:

Alternatives considered:

blakerouse commented 1 year ago
  • The agent must set the type parameter in the input units it sends to the shipper to match the name of the shipper protocol input. This allows Filebeat to create and manage the lifecycle of the shipper inputs with no other modifications to the shipper. This is the code in the agent that needs to be modified.

This is actually fixed in this PR - https://github.com/elastic/elastic-agent/pull/2543

cmacknz commented 1 year ago

Great, then we just need to change the shipper spec file to have the name parameter match the input type we plan to run:

https://github.com/elastic/elastic-agent/blob/05b4c010ab20ef18a0f857a2de0eccb0bfac5d01/pkg/component/load.go#L174-L175

            shipperSpecs[shipper.Name] = ShipperRuntimeSpec{
                ShipperType: shipper.Name

https://github.com/elastic/elastic-agent-shipper/blob/70cd54ce1b7fa041d1fdec62335137ef08136d4b/elastic-agent-shipper.spec.yml#L3

shippers:
  - name: shipper
    description: "Elastic Agent Shipper"
    platforms: