elastic / elastic-agent

Elastic Agent - single, unified way to add monitoring for logs, metrics, and other types of data to a host.

Support `service.name` as first class citizen #2724

Open sorenlouv opened 1 year ago

sorenlouv commented 1 year ago

The concept of "services" was originally introduced in APM but should not be limited to this domain. The plan is to make logs "service-aware" (where applicable), making it easier to investigate logs per service and to correlate them with traces in APM.

To do this, logs must be annotated with service.name. Currently, logs are ingested without any reference to services, and users have to manually add the add_fields processor or similar to annotate their logs.
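For reference, the manual approach could look roughly like the following in a standalone policy. This is a sketch rather than a tested configuration: it assumes `processors` is accepted at the input level, and it uses the `add_fields` processor with an empty `target` so that the field lands at the root of the event instead of under `fields`. The input ID, dataset, and path are placeholders.

```yaml
inputs:
  - id: custom-logs-example        # placeholder ID
    type: logfile
    data_stream:
      namespace: default
    streams:
      - id: logs-onboarding-example
        data_stream:
          dataset: example
        paths:
          - /path/to/file
    processors:
      # Assumption: input-level processors apply to every stream of this input.
      - add_fields:
          target: ''               # write at the event root, not under "fields"
          fields:
            service:
              name: My Service
```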

The objective of this issue is to make it as easy as possible for customers to annotate their logs with service.name. Instead of requiring them to use processors, they should simply be able to specify the service.name in their logs configuration:

inputs:
  - id: custom-logs-1684919299044
+   service:
+     name: My Service
    type: logfile
    data_stream:
      namespace: default
    streams:
      - id: logs-onboarding-example
        data_stream:
          dataset: example
        paths:
          - /path/to/file

What would be the next steps for making the necessary changes to Elastic Agent?

felixbarny commented 1 year ago

Seems like there's an option that is at least a little bit more convenient compared to processors:

inputs:
  - id: custom-logs-1684919299044
+   fields_under_root: true
+   fields:
+     service:
+       name: My Service
    type: logfile
    data_stream:
      namespace: default
    streams:
      - id: logs-onboarding-example
        data_stream:
          dataset: example
        paths:
          - /path/to/file

However, I'm not entirely sure if that's even possible with Elastic Agent. I got that from the Filebeat docs.

cmacknz commented 1 year ago

Do we only want to make this change for the Filebeat inputs or log inputs specifically? If so, then this is really a change in Beats. Elastic Agent just passes the relevant input section of the policy to Filebeat for log input types without looking at any part of it other than the type field.

We could support something like service.name: "My Service" as proposed in the original description in the Filebeat input configuration as syntactic sugar for creating the needed add_fields processor. It would not be much more complicated than https://github.com/elastic/beats/pull/35287. The Beats already set service.name to the name of the Beat, but we can just have this override that. Here's an example log:

{"log.level":"error","@timestamp":"2023-05-22T00:30:04.222Z","message":"Error dialing x509: certificate signed by unknown authority","component":{"binary":"filebeat","dataset":"elastic_agent.filebeat","id":"filestream-default","type":"filestream"},"log":{"source":"filestream-default"},"network":"tcp","ecs.version":"1.6.0","log.logger":"esclientleg","log.origin":{"file.line":38,"file.name":"transport/logging.go"},"service.name":"filebeat","address":"metrics.es.our.ece.domain.edu.au:9243","ecs.version":"1.6.0"}

If we want to support this for all Beat input types, it is more work but probably still not unreasonable. If we want to support this for all Elastic Agent inputs beyond just Beats, then there is much more work to drive everything to a consistent implementation.

cmacknz commented 1 year ago

In the example above, Filebeat would be sent this entire section of the agent policy (plus the output section that applies) and Filebeat would generate a Beat configuration from it:

    id: custom-logs-1684919299044
    service:
      name: My Service
    type: logfile
    data_stream:
      namespace: default
    streams:
      - id: logs-onboarding-example
        data_stream:
          dataset: example
        paths:
          - /path/to/file

ruflin commented 1 year ago

The Beats already set service.name to the name of the Beat, but we can just have this override that. Here's an example log

Having service.name for its own Beats log seems correct. But it should not be set on the data sent. Is this the case?

Taking https://github.com/elastic/elastic-agent/issues/2416 into account, the final config would look as follows:

inputs:
  - type: logfile
    service.name: "foo"
    # Should this be set automatically to "foo" if `service.name` is "foo"?
    data_stream.dataset: "foo"
    paths:
      - /var/log/my-file/my.log*

Ideally, service.name would be supported on all inputs, including the ones in Metricbeat. But we can start with log. I need to check in detail whether the approach in https://github.com/elastic/beats/pull/35287 would also work here: in that PR the target index is not modified, whereas here we would have to modify the event data. Hopefully there is a place to hook into it.

cmacknz commented 1 year ago

Having service.name for its own Beats log seems correct. But it should not be set on the data sent. Is this the case?

Correct, service.name is not added automatically to the non-elastic_agent datastreams; for example, it is omitted on logs-system.syslog-*. I am too used to looking at agent logs all day, so that's the first example I looked at.

Ideally, service.name would be supported on all inputs, including the ones in Metricbeat.

All we should need to do is automatically create an add_fields processor for each input that specifies service.name in its input configuration. There are a few examples of how to create a processor like this if you need one.
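As a rough illustration of that idea, an input that sets service.name could be treated as if the user had written the processor by hand. The shorthand below follows the syntax proposed earlier in this thread and is hypothetical, not an existing option; the expanded form uses the existing add_fields processor with `target: service`, so that `name` ends up as service.name on the event.

```yaml
# Proposed shorthand (hypothetical, not an existing option):
#   - type: logfile
#     service.name: "foo"
#     paths:
#       - /var/log/my-file/my.log*
#
# Roughly equivalent input with the processor written out:
- type: logfile
  paths:
    - /var/log/my-file/my.log*
  processors:
    - add_fields:
        target: service   # sub-dictionary that receives the fields
        fields:
          name: foo
```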

ruflin commented 1 year ago

There are a few examples of how to create a processor like this if you need one.

If you have a link to one, that would be helpful.

sorenlouv commented 1 year ago

We could support something like service.name: "My Service" as proposed in the original description in the Filebeat input configuration as syntactic sugar for creating the needed add_fields processor.

That sounds great!

cmacknz commented 1 year ago

If you have a link to one, that would be helpful.

There is an existing add_data_stream processor for adding the datastream type, dataset, and namespace fields to an event you can use as a reference.

Here is one example usage of it: https://github.com/elastic/beats/blob/fb25982c80fb68745cff05a6a6a07a5c1e1ab4e7/x-pack/osquerybeat/internal/pub/publisher.go#L95-L119

The processor implementation itself is in https://github.com/elastic/beats/blob/fb25982c80fb68745cff05a6a6a07a5c1e1ab4e7/libbeat/processors/add_data_stream/add_data_stream.go#L68

nimarezainia commented 2 months ago

@cmacknz who should own this issue from an implementation perspective? We have all the processors required in this case.

cmacknz commented 2 months ago

This one is fairly straightforward; I don't think there's much specialized knowledge required.

Either of the Elastic Agent teams could certainly do it, but anyone capable of working in the Beats repository could take care of this.

sorenlouv commented 2 months ago

This one is fairly straight forward, I don't think there's much specialized knowledge required.

This is great to hear! The primary goal of this issue is to increase the number of clusters where logs are annotated with service.name. This means reducing the technical barriers to setting service.name and highlighting this capability in documentation, guides, and onboarding flows.