aws-observability / aws-otel-collector

AWS Distro for OpenTelemetry Collector (see ADOT Roadmap at https://github.com/orgs/aws-observability/projects/4)
https://aws-otel.github.io/

Debug logs on OTel collector running inside EKS #668

Closed sudhirpandey closed 2 years ago

sudhirpandey commented 3 years ago

Is your feature request related to a problem? Please describe.
I am trying to get some debug logs to figure out why the metrics are not ending up in the managed Prometheus in AWS. So far I have enabled debug on the exporter side:

exporters:
  awsprometheusremotewrite:
    # replace this with your endpoint
    endpoint: "myendpoint"
    # replace this with your region
    aws_auth:
      region: "eu-west-1"
      service: "aps"
    namespace: "adot"
  logging:
    loglevel: debug

Now, realising that this might only show debug logs from the exporter side, I wanted to get debug logs from the collector itself. The upstream OTel Collector has a --log-level debug flag, which does not seem to be supported in aws-otel-collector yet: https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/troubleshooting.md

However, it seems we can enable debug logs via the file /opt/aws/aws-otel-collector/etc/extracfg.txt.

Hence I created a ConfigMap and mounted it in the DaemonSet manifest like this:

apiVersion: v1
data:
  extracfg.txt: |
    loggingLevel=DEBUG
kind: ConfigMap
metadata:
  name: extractconf
  namespace: adot-col

and updated the DaemonSet manifest to include this excerpt:

        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /conf
          name: adot-collector-config-vol
        - mountPath: /opt/aws/aws-otel-collector/etc/
          name: extractconf-vol
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: amp-iamproxy-ingest-service-account
      serviceAccountName: amp-iamproxy-ingest-service-account
      terminationGracePeriodSeconds: 30
      volumes:
      - configMap:
          defaultMode: 420
          items:
          - key: adot-collector-config
            path: adot-collector-config.yaml
          name: adot-collector-conf
        name: adot-collector-config-vol
      - configMap:
          defaultMode: 420
          name: extractconf
        name: extractconf-vol
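One caveat with a mount like the one above: mounting a ConfigMap volume at /opt/aws/aws-otel-collector/etc/ replaces the entire directory with the ConfigMap's contents, hiding any other files the collector keeps there. A subPath mount (a sketch reusing the volume name from the manifest above) projects just the single file instead:

```yaml
        volumeMounts:
        - mountPath: /opt/aws/aws-otel-collector/etc/extracfg.txt
          name: extractconf-vol
          subPath: extracfg.txt
```

Note that subPath mounts do not pick up ConfigMap updates automatically; the pod must be restarted to see a changed extracfg.txt.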

So far the file seems to be present, as I am no longer getting logs like this: 2021/10/06 17:33:26 find no extra config, skip it, err: open /opt/aws/aws-otel-collector/etc/extracfg.txt: no such file or directory

but the logs generated do not seem to be at debug level either:

2021/10/07 10:06:47 AWS OTel Collector version: v0.13.0
2021-10-07T10:06:47.231Z    info    service/collector.go:176    Applying configuration...
2021-10-07T10:06:47.232Z    info    builder/exporters_builder.go:265    Exporter was built. {"kind": "exporter", "name": "logging"}
2021-10-07T10:06:47.232Z    info    builder/exporters_builder.go:265    Exporter was built. {"kind": "exporter", "name": "awsprometheusremotewrite"}
2021-10-07T10:06:47.232Z    info    builder/pipelines_builder.go:214    Pipeline was built. {"pipeline_name": "metrics", "pipeline_datatype": "metrics"}
2021-10-07T10:06:47.232Z    info    builder/receivers_builder.go:228    Receiver was built. {"kind": "receiver", "name": "prometheus", "datatype": "metrics"}
2021-10-07T10:06:47.232Z    info    service/service.go:101  Starting extensions...
2021-10-07T10:06:47.232Z    info    builder/extensions_builder.go:53    Extension is starting...    {"kind": "extension", "name": "pprof"}
2021-10-07T10:06:47.232Z    info    pprofextension@v0.36.0/pprofextension.go:78 Starting net/http/pprof server  {"kind": "extension", "name": "pprof", "config": {"TCPAddr":{"Endpoint":":1888"},"BlockProfileFraction":0,"MutexProfileFraction":0,"SaveToFile":""}}
2021-10-07T10:06:47.232Z    info    builder/extensions_builder.go:59    Extension started.  {"kind": "extension", "name": "pprof"}
2021-10-07T10:06:47.232Z    info    builder/extensions_builder.go:53    Extension is starting...    {"kind": "extension", "name": "zpages"}
2021-10-07T10:06:47.232Z    info    zpagesextension/zpagesextension.go:40   Register Host's zPages  {"kind": "extension", "name": "zpages"}
2021-10-07T10:06:47.232Z    info    zpagesextension/zpagesextension.go:53   Starting zPages extension   {"kind": "extension", "name": "zpages", "config": {"TCPAddr":{"Endpoint":":55679"}}}
2021-10-07T10:06:47.232Z    info    builder/extensions_builder.go:59    Extension started.  {"kind": "extension", "name": "zpages"}
2021-10-07T10:06:47.232Z    info    builder/extensions_builder.go:53    Extension is starting...    {"kind": "extension", "name": "health_check"}
2021-10-07T10:06:47.232Z    info    healthcheckextension@v0.36.0/healthcheckextension.go:40 Starting health_check extension {"kind": "extension", "name": "health_check", "config": {"Port":0,"TCPAddr":{"Endpoint":"0.0.0.0:13133"}}}
2021-10-07T10:06:47.232Z    info    builder/extensions_builder.go:59    Extension started.  {"kind": "extension", "name": "health_check"}
2021-10-07T10:06:47.232Z    info    service/service.go:106  Starting exporters...
2021-10-07T10:06:47.232Z    info    builder/exporters_builder.go:92 Exporter is starting... {"kind": "exporter", "name": "logging"}
2021-10-07T10:06:47.232Z    info    builder/exporters_builder.go:97 Exporter started.   {"kind": "exporter", "name": "logging"}
2021-10-07T10:06:47.232Z    info    builder/exporters_builder.go:92 Exporter is starting... {"kind": "exporter", "name": "awsprometheusremotewrite"}
2021-10-07T10:06:47.233Z    info    builder/exporters_builder.go:97 Exporter started.   {"kind": "exporter", "name": "awsprometheusremotewrite"}
2021-10-07T10:06:47.233Z    info    service/service.go:111  Starting processors...
2021-10-07T10:06:47.233Z    info    builder/pipelines_builder.go:51 Pipeline is starting... {"pipeline_name": "metrics", "pipeline_datatype": "metrics"}
2021-10-07T10:06:47.233Z    info    builder/pipelines_builder.go:62 Pipeline is started.    {"pipeline_name": "metrics", "pipeline_datatype": "metrics"}
2021-10-07T10:06:47.233Z    info    service/service.go:116  Starting receivers...
2021-10-07T10:06:47.233Z    info    builder/receivers_builder.go:70 Receiver is starting... {"kind": "receiver", "name": "prometheus"}
2021-10-07T10:06:47.233Z    info    kubernetes/kubernetes.go:282    Using pod service account via in-cluster config {"kind": "receiver", "name": "prometheus", "level": "info", "discovery": "kubernetes"}
2021-10-07T10:06:47.235Z    info    discovery/manager.go:195    Starting provider   {"kind": "receiver", "name": "prometheus", "level": "debug", "provider": "kubernetes/0", "subs": "[kubernetes-service-endpoints]"}
2021-10-07T10:06:47.239Z    info    builder/receivers_builder.go:75 Receiver started.   {"kind": "receiver", "name": "prometheus"}
2021-10-07T10:06:47.239Z    info    healthcheck/handler.go:129  Health Check state change   {"kind": "extension", "name": "health_check", "status": "ready"}
2021-10-07T10:06:47.239Z    info    service/telemetry.go:65 Setting up own telemetry...
2021-10-07T10:06:47.241Z    info    service/telemetry.go:113    Serving Prometheus metrics  {"address": ":8888", "level": 0, "service.instance.id": "5a240d6c-015b-4fcf-be38-865ecaa7fb7f"}
2021-10-07T10:06:47.241Z    info    service/collector.go:230    Starting aws-otel-collector...  {"Version": "v0.13.0", "NumCPU": 2}
2021-10-07T10:06:47.241Z    info    service/collector.go:134    Everything is ready. Begin running and processing data.

Describe the solution you'd like
I wanted debug logs to figure out whether the OTel collector is having trouble scraping metrics from the application endpoints, or whether it is actually having trouble pushing them to the managed Prometheus. From the logs generated now, it just seems to be running and processing data, but in reality I am not seeing any metrics data ending up in our managed Prometheus.

Describe alternatives you've considered
Is there a possibility of supporting the --log-level debug flag, as in https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/troubleshooting.md, in future versions? The current way of only supporting debug via a config file is a bit too much work when we just want to quickly turn up log verbosity for debugging purposes.
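For context, the upstream collector also exposes the log level through its service telemetry configuration (the same service.telemetry.logs.level path referenced later in this thread), so a sketch of the config-file form would be the following; whether this version of the ADOT build honors it is exactly what this issue is probing:

```yaml
service:
  telemetry:
    logs:
      level: debug
```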

khanhntd commented 2 years ago

Hey @sudhirpandey, thanks for reporting the issue. Would you kindly answer the following questions:

  1. What post are you following to set up the ADOT Collector with EKS and AMP?
  2. Can you also show us the configuration file? If not, can you help us check whether the logging exporter is defined and wired into the pipelines in the ADOT Collector config? I have attached the following sample.

    service:
      pipelines:
        metrics:
          receivers: [prometheus]
          exporters: [logging, awsprometheusremotewrite]
        metrics/ecs:
          receivers: [awsecscontainermetrics]
          processors: [filter]
          exporters: [logging, awsprometheusremotewrite]

For the --log-level debug flag, I will notify my team about this option. Thanks for your suggestion!

Aneurysm9 commented 2 years ago

The --log-level CLI flag has been deprecated and removed upstream as part of an effort to move all configuration into the configuration system. There is an alternative, which is ultimately what the extracfg.txt mechanism uses under the hood: the --set CLI flag:

--set=service.telemetry.logs.level=DEBUG

That said, we're currently tracking a report upstream that setting that configuration value is ineffective. You can follow this issue for more visibility there.
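Wiring that flag into the DaemonSet from the earlier excerpt could look like the following sketch (the container name and config path here are assumptions for illustration, not taken from the actual manifest):

```yaml
      containers:
      - name: adot-collector  # assumed container name
        args:
        - --config=/conf/adot-collector-config.yaml
        - --set=service.telemetry.logs.level=DEBUG
```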

github-actions[bot] commented 2 years ago

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days.

github-actions[bot] commented 2 years ago

This issue was closed because it has been marked as stale for 30 days with no activity.