elastic / elastic-agent

Elastic Agent - single, unified way to add monitoring for logs, metrics, and other types of data to a host.

Add per-platform sample OTel configurations #4785

Closed (strawgate closed this 1 week ago)

strawgate commented 1 month ago

We should include a folder in the Agent distribution, called something like otel_samples or similar, that contains per-platform, use-case-based configurations.

Each config should have a descriptive name, and there should be a README that explains what each config is used for.

During onboarding we will provide installation instructions that copy the chosen use-case-specific config out of the samples folder, replace otel.yml with it, and populate it with keys/endpoints:

curl --proto '=https' --tlsv1.2 -fOL https://snapshots.elastic.co/8.14.0-6b2f3648/downloads/beats/elastic-agent/elastic-agent-8.14.0-SNAPSHOT-darwin-x86_64.tar.gz
tar -xvf elastic-agent-8.14.0-SNAPSHOT-darwin-x86_64.tar.gz
rm elastic-agent-8.14.0-SNAPSHOT-darwin-x86_64/otel.yml && cp elastic-agent-8.14.0-SNAPSHOT-darwin-x86_64/otel_samples/logs_hostmetrics.yaml elastic-agent-8.14.0-SNAPSHOT-darwin-x86_64/otel.yml
# Use | as the sed delimiter because the endpoint URL contains slashes
sed -i '' "s|<<ES_ENDPOINT>>|$elasticsearch_url|g" elastic-agent-8.14.0-SNAPSHOT-darwin-x86_64/otel.yml && sed -i '' "s|<<ES_API_KEY>>|$api_key|g" elastic-agent-8.14.0-SNAPSHOT-darwin-x86_64/otel.yml

elastic-agent-8.14.0-SNAPSHOT-darwin-x86_64/elastic-agent otel
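
The copy-and-substitute step above could also be wrapped in a small helper. This is only a minimal sketch, not a tested installer: the function names `escape_sed_replacement` and `render_otel_config` are hypothetical, and it assumes the `<<ES_ENDPOINT>>` and `<<ES_API_KEY>>` placeholders used in this issue. It writes to a new file instead of using `sed -i`, whose syntax differs between macOS and GNU sed, and it escapes characters that are special in a sed replacement.

```shell
#!/bin/sh
# Hypothetical helper (not part of Elastic Agent): fill the placeholders of a
# sample config and write the result to a new file.
# Usage: render_otel_config <sample> <output> <es_endpoint> <api_key>

# Escape the characters that are special on the replacement side of s|...|...|:
# backslash, ampersand, and the | delimiter itself.
escape_sed_replacement() {
  printf '%s' "$1" | sed -e 's/[\\&|]/\\&/g'
}

render_otel_config() {
  _sample=$1
  _output=$2
  _endpoint=$(escape_sed_replacement "$3")
  _key=$(escape_sed_replacement "$4")
  sed -e "s|<<ES_ENDPOINT>>|$_endpoint|g" \
      -e "s|<<ES_API_KEY>>|$_key|g" \
      "$_sample" > "$_output"
}
```

For example, `render_otel_config logs_hostmetrics.yaml otel.yml "$elasticsearch_url" "$api_key"` would produce an `otel.yml` with both values filled in, leaving the sample itself untouched.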

Here are two examples of macOS sample configurations. I have not tested these; we should assume they will need to be updated and that we will need to provide working platform-specific configurations.

Logs and Host Metrics

receivers:

  # Receiver for platform specific log files
  filelog/platformlogs:
    include: [ /var/log/*.log ]
#    start_at: beginning

  # Receiver for CPU, Disk, Memory, and Filesystem metrics
  hostmetrics/system:
    collection_interval: 30s
    scrapers:
      disk:
      filesystem:
      cpu:
      memory:

processors:
  elasticinframetrics:

exporters:

  elasticsearch/bulk:
    endpoints: [<<ES_ENDPOINT>>]
    api_key: <<ES_API_KEY>>
    logs_index: logs-generic-default
    metrics_index: metrics-generic-default

service:
  pipelines:
    metrics/hostmetrics:
      receivers: [hostmetrics/system]
      processors: [elasticinframetrics]
      exporters: [elasticsearch/bulk]

    logs/platformlogs:
      receivers: [filelog/platformlogs]
      processors: []
      exporters: [elasticsearch/bulk]

Logs, metrics, and traces:

receivers:

  # Receiver for platform specific log files
  filelog/platformlogs:
    include: [ /var/log/*.log ]
#    start_at: beginning

  # Receiver for CPU, Disk, Memory, and Filesystem metrics
  hostmetrics/system:
    collection_interval: 30s
    scrapers:
      disk:
      filesystem:
      cpu:
      memory:

  # Receiver for logs, traces, and metrics from SDKs
  otlp/fromsdk:
    protocols:
      grpc:
      http:

processors:
  elasticinframetrics:

exporters:

  otlp/apm:
    endpoint: <<APM_ENDPOINT>>
    headers:
      # Elastic APM Server secret token or API key
      Authorization: "Bearer <<APM_SECRET_KEY>>"

  elasticsearch/bulk:
    endpoints: [<<ES_ENDPOINT>>]
    api_key: <<ES_API_KEY>>
    logs_index: logs-generic-default
    metrics_index: metrics-generic-default

service:
  pipelines:
    traces/fromsdk:
      receivers: [otlp/fromsdk]
      processors: []
      exporters: [otlp/apm]

    metrics/fromsdk:
      receivers: [otlp/fromsdk]
      processors: []
      exporters: [otlp/apm]

    metrics/hostmetrics:
      receivers: [hostmetrics/system]
      processors: [elasticinframetrics]
      exporters: [elasticsearch/bulk]

    logs/fromsdk:
      receivers: [otlp/fromsdk]
      processors: []
      exporters: [otlp/apm]

    logs/platformlogs:
      receivers: [filelog/platformlogs]
      processors: []
      exporters: [elasticsearch/bulk]
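
Note that this second config uses two more placeholders, `<<APM_ENDPOINT>>` and `<<APM_SECRET_KEY>>`, which the install commands earlier in this comment do not fill in. A rough sketch of a check that could run before starting the agent is below. The function name `unfilled_placeholders` is hypothetical, and it assumes placeholders always look like `<<UPPER_CASE_NAME>>`, as they do in this issue.

```shell
#!/bin/sh
# Hypothetical helper: print each distinct <<NAME>> placeholder still present
# in a config file. Returns non-zero if any placeholder remains.
# Usage: unfilled_placeholders <config>
unfilled_placeholders() {
  # grep -o prints only the matching part of each line (GNU and BSD grep).
  _found=$(grep -o '<<[A-Z_]*>>' "$1" | sort -u)
  [ -z "$_found" ] && return 0
  printf '%s\n' "$_found"
  return 1
}
```

Onboarding instructions could call this after the substitution step and stop with a clear message instead of starting the collector with a literal `<<APM_ENDPOINT>>` in its config.
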
elasticmachine commented 1 month ago

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

cmacknz commented 1 month ago

> I have not tested these and we should assume these will need to be updated and that we will need to provide platform specific working configurations.

We should provide automated tests to ensure these default configurations keep working.
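
As a rough sketch of what such a test could look like, the loop below runs a validation command against every sample config and fails if any config is rejected. Everything here is an assumption for illustration: the function name `check_samples` is hypothetical, and the validation command is left as a parameter because this thread does not say which command should be used to check a config.

```shell
#!/bin/sh
# Hypothetical smoke test: run a validator against every sample config.
# Usage: check_samples <samples_dir> <validator command...>
# The validator is called with the config path appended as its last argument
# and must exit non-zero when the config is invalid.
check_samples() {
  _dir=$1
  shift
  _failed=0
  for _cfg in "$_dir"/*.yml "$_dir"/*.yaml; do
    [ -e "$_cfg" ] || continue   # skip a glob that matched nothing
    if "$@" "$_cfg" >/dev/null 2>&1; then
      echo "ok   $_cfg"
    else
      echo "FAIL $_cfg"
      _failed=1
    fi
  done
  return "$_failed"
}
```

In CI this could run once per platform, so that a sample that only parses on one OS is caught before release.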

strawgate commented 1 month ago

We will need this for 8.15.0, and we need to decide on the folder structure ASAP so we can start building onboarding instructions. @ycombinator @cmacknz, any thoughts?

cmacknz commented 1 month ago

otel_samples seems fine to me.

strawgate commented 1 month ago

Let me know when this is ready to start and I will provide the initial samples. The samples will need to be updated when the infra processor and initial metrics support get merged.

ycombinator commented 1 month ago

@strawgate It's ready to start, but it won't get worked on this week. Please provide the initial samples here in this issue whenever you get a chance. Thanks.

strawgate commented 1 month ago

Let's do this for x86_64 and aarch64 macOS and for all Linux distros.
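
If the samples end up grouped per platform, onboarding instructions would need to pick the right group on the target host. The sketch below maps `uname` output to a platform identifier. The function name `sample_platform` and the `<os>-<arch>` naming are assumptions that mirror the `darwin-x86_64` suffix in the download URL above; the actual folder structure has not been decided in this thread.

```shell
#!/bin/sh
# Hypothetical helper: map `uname -s` and `uname -m` output to a platform
# identifier such as darwin-aarch64. The naming scheme is an assumption.
# Usage: sample_platform "$(uname -s)" "$(uname -m)"
sample_platform() {
  case "$1" in
    Darwin) _os=darwin ;;
    Linux)  _os=linux ;;
    *) echo "unsupported OS: $1" >&2; return 1 ;;
  esac
  case "$2" in
    x86_64|amd64)  _arch=x86_64 ;;
    arm64|aarch64) _arch=aarch64 ;;
    *) echo "unsupported architecture: $2" >&2; return 1 ;;
  esac
  echo "$_os-$_arch"
}
```

Apple Silicon Macs report `arm64` from `uname -m`, while many Linux distros report `aarch64`, so both spellings are folded into one identifier here.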

platformlogs.yml

receivers:
  # Receiver for platform specific log files
  filelog/platformlogs:
    include: [ /var/log/*.log ]
#   start_at: beginning

exporters:

  # Exporter to print the first 5 logs/metrics and then every 1000th
  debug:
    verbosity: detailed
    sampling_initial: 5
    sampling_thereafter: 1000

  # Exporter to send logs and metrics to Elasticsearch
  elasticsearch/bulk:
    endpoints: [<<ES_ENDPOINT>>]
    api_key: <<ES_API_KEY>>

service:
  pipelines:
    logs/platformlogs:
      receivers: [filelog/platformlogs]
      processors: []
      exporters: [debug, elasticsearch/bulk]

platformlogs_hostmetrics.yml

receivers:
  # Receiver for platform specific log files
  filelog/platformlogs:
    include: [ /var/log/*.log ]
#   start_at: beginning

  # Receiver for CPU, Disk, Memory, and Filesystem metrics
  hostmetrics/system:
    collection_interval: 30s
    scrapers:
      disk:
      filesystem:
      cpu:
      memory:

exporters:
  # Exporter to print the first 5 logs/metrics and then every 1000th
  debug:
    verbosity: detailed
    sampling_initial: 5
    sampling_thereafter: 1000

  # Exporter to send logs and metrics to Elasticsearch
  elasticsearch/bulk:
    endpoints: [<<ES_ENDPOINT>>]
    api_key: <<ES_API_KEY>>

service:
  pipelines:
    metrics/hostmetrics:
      receivers: [hostmetrics/system]
      processors: []
      exporters: [debug, elasticsearch/bulk]
    logs/platformlogs:
      receivers: [filelog/platformlogs]
      processors: []
      exporters: [debug, elasticsearch/bulk]

Windows hostmetrics.yml

receivers:
  # Receiver for CPU, Disk, Memory, and Filesystem metrics
  hostmetrics/system:
    collection_interval: 30s
    scrapers:
      disk:
      filesystem:
      cpu:
      memory:

exporters:
  # Exporter to print the first 5 logs/metrics and then every 1000th
  debug:
    verbosity: detailed
    sampling_initial: 5
    sampling_thereafter: 1000

  # Exporter to send logs and metrics to Elasticsearch
  elasticsearch/bulk:
    endpoints: [<<ES_ENDPOINT>>]
    api_key: <<ES_API_KEY>>

service:
  pipelines:
    metrics/hostmetrics:
      receivers: [hostmetrics/system]
      processors: []
      exporters: [debug, elasticsearch/bulk]
strawgate commented 1 month ago

We should expect to update this template and add a second template before the 8.14.2 release.

jlind23 commented 3 weeks ago

@michalpristas I can see that this has been moved to the implementation state, any draft PR we can link here?

michalpristas commented 3 weeks ago

I haven't moved it. Nothing has been started.

ycombinator commented 3 weeks ago

Moved it back out of implementation. Sorry for the confusion, not sure why I thought this work was already started 🤦.

AlexanderWert commented 3 weeks ago

@ycombinator @michalpristas

Related: https://github.com/elastic/opentelemetry-dev/pull/260

We need to consolidate that work. Also, @ChrsMark is working on a K8s manifest (which is also closely related to this)

AlexanderWert commented 3 weeks ago

(Oops, sorry for accidentally closing it. I hit the wrong button.)

strawgate commented 3 weeks ago

> @ycombinator @michalpristas
>
> Related: elastic/opentelemetry-dev#260
>
> We need to consolidate that work. Also, @ChrsMark is working on a K8s manifest (which is also closely related to this)

Agreed. Originally I had wanted to get something (anything) in and then update it before the actual release.

> Let me know when this is ready to start and I will provide the initial samples. The samples will need to be updated when the infra processor and initial metrics support get merged.

I still think we should get the per-platform config work completed (in any form) as soon as possible, and then update the configs when they are actually ready, before 8.14.2.

strawgate commented 1 week ago

It looks like this got moved from 8.14.2? @ycombinator

ycombinator commented 1 week ago

> It looks like this got moved from 8.14.2? @ycombinator

Yeah, sorry, I meant to move a different issue to 8.15.0 and moved this one instead. Moved it back to 8.14.2.