open-telemetry / opentelemetry-collector


Exporting failed when exporting data #10718

Open CzyerChen opened 1 month ago

CzyerChen commented 1 month ago

Exporting fails when exporting data, and the hostname is missing:

```
\otelcol-contrib_0.105.0> .\otelcol-contrib.exe --config=.\config.yaml
2024-07-24T14:36:46.550+0800    info    service@v0.105.0/service.go:116 Setting up own telemetry...
2024-07-24T14:36:46.550+0800    info    service@v0.105.0/service.go:119 OpenCensus bridge is disabled for Collector telemetry and will be removed in a future version, use --feature-gates=-service.disableOpenCensusBridge to re-enable
2024-07-24T14:36:46.550+0800    info    service@v0.105.0/telemetry.go:96        Serving metrics {"address": ":8888", "metrics level": "Normal"}
2024-07-24T14:36:46.551+0800    info    service@v0.105.0/service.go:198 Starting otelcol-contrib...     {"Version": "0.105.0", "NumCPU": 12}
2024-07-24T14:36:46.551+0800    info    extensions/extensions.go:34     Starting extensions...
2024-07-24T14:36:46.552+0800    info    prometheusreceiver@v0.105.0/metrics_receiver.go:307     Starting discovery manager      {"kind": "receiver", "name": "prometheus", "data_type": "metrics"}
2024-07-24T14:36:58.475+0800    info    prometheusreceiver@v0.105.0/metrics_receiver.go:285     Scrape job added        {"kind": "receiver", "name": "prometheus", "data_type": "metrics", "jobName": "win-monitoring"}
2024-07-24T14:36:58.476+0800    info    service@v0.105.0/service.go:224 Everything is ready. Begin running and processing data.
2024-07-24T14:36:58.476+0800    info    prometheusreceiver@v0.105.0/metrics_receiver.go:376     Starting scrape manager {"kind": "receiver", "name": "prometheus", "data_type": "metrics"}
2024-07-24T14:36:58.476+0800    info    localhostgate/featuregate.go:63 The default endpoints for all servers in components have changed to use localhost instead of 0.0.0.0. Disable the feature gate to temporarily revert to the previous default.    {"feature gate ID": "component.UseLocalHostAsDefaultHost"}
2024-07-24T14:37:22.033+0800    info    exporterhelper/retry_sender.go:118      Exporting failed. Will retry the request after interval.        {"kind": "exporter", "data_type": "metrics", "name": "otlp", "error": "rpc error: code = DeadlineExceeded desc = context deadline exceeded", "interval": "5.622010313s"}
2024-07-24T14:37:26.976+0800    info    otelcol@v0.105.0/collector.go:318       Received signal from OS {"signal": "interrupt"}
2024-07-24T14:37:26.976+0800    info    service@v0.105.0/service.go:261 Starting shutdown...
2024-07-24T14:37:26.977+0800    error   exporterhelper/queue_sender.go:90       Exporting failed. Dropping data.        {"kind": "exporter", "data_type": "metrics", "name": "otlp", "error": "interrupted due to shutdown: rpc error: code = DeadlineExceeded desc = context deadline exceeded", "dropped_items": 8076}
```

Steps to reproduce

  1. download the Windows package from the GitHub releases page
  2. unzip the package
  3. design the config.yaml
  4. start a gRPC server listening on 127.0.0.1:11800, with the following settings:
    
    Running Mode                               |   null
    TTL.metrics                                |   7
    TTL.record                                 |   3
    Version                                    |   10.0.1-6a9d727
    module.agent-analyzer.provider             |   default
    module.ai-pipeline.provider                |   default
    module.alarm.provider                      |   default
    module.aws-firehose.provider               |   default
    module.cluster.provider                    |   standalone
    module.configuration-discovery.provider    |   default
    module.configuration.provider              |   none
    module.core.provider                       |   default
    module.debugging-query.provider            |   default
    module.envoy-metric.provider               |   default
    module.event-analyzer.provider             |   default
    module.log-analyzer.provider               |   default
    module.logql.provider                      |   default
    module.promql.provider                     |   default
    module.query.provider                      |   graphql
    module.receiver-browser.provider           |   default
    module.receiver-clr.provider               |   default
    module.receiver-ebpf.provider              |   default
    module.receiver-event.provider             |   default
    module.receiver-jvm.provider               |   default
    module.receiver-log.provider               |   default
    module.receiver-meter.provider             |   default
    module.receiver-otel.provider              |   default
    module.receiver-profile.provider           |   default
    module.receiver-register.provider          |   default
    module.receiver-sharing-server.provider    |   default
    module.receiver-telegraf.provider          |   default
    module.receiver-trace.provider             |   default
    module.service-mesh.provider               |   default
    module.storage.provider                    |   h2
    module.telemetry.provider                  |   none
    oap.external.grpc.host                     |   127.0.0.1
    oap.external.grpc.port                     |   11800
    oap.external.http.host                     |   127.0.0.1
    oap.external.http.port                     |   12800
    oap.internal.comm.host                     |   127.0.0.1
    oap.internal.comm.port                     |   11800
5. start the collector with `.\otelcol-contrib.exe --config=.\config.yaml`

The config (shown in full under "What config did you use?" below) defines a prometheus receiver with one scrape job and an otlp exporter pointing at 127.0.0.1:11800 with insecure TLS. The collector scrapes metrics such as the following from 127.0.0.1:9182:

```
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 0
go_gc_duration_seconds{quantile="0.25"} 0
go_gc_duration_seconds{quantile="0.5"} 0
go_gc_duration_seconds{quantile="0.75"} 0
go_gc_duration_seconds{quantile="1"} 0.0085149
go_gc_duration_seconds_sum 0.0797549
go_gc_duration_seconds_count 1137
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 13
# HELP go_info Information about the Go environment.
# TYPE go_info gauge
go_info{version="go1.21.9"} 1
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge
go_memstats_alloc_bytes 5.747496e+06
# HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
# TYPE go_memstats_alloc_bytes_total counter
go_memstats_alloc_bytes_total 9.08954304e+08
# HELP go_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table.
# TYPE go_memstats_buck_hash_sys_bytes gauge
go_memstats_buck_hash_sys_bytes 1.496429e+06
```

The collector then exports the scraped metrics to 127.0.0.1:11800.

What did you expect to see?

Export to the gRPC port succeeds and the hostname is shown.

What did you see instead?

Many "Exporting failed" logs are printed, and no instance or hostname is found.

What version did you use?

0.105.0

What config did you use?

```yaml
receivers:
  prometheus:
    config:
     scrape_configs:
       - job_name: 'win-monitoring'
         scrape_interval: 30s
         static_configs:
           - targets: ['127.0.0.1:9182']
             labels:
               host_name: prometheus-os-win
processors:
  batch:

exporters:
  otlp:
    endpoint: 127.0.0.1:11800
    tls:
      insecure: true
      insecure_skip_verify: true
service:
  pipelines:
    metrics:
      receivers:
      - prometheus
      processors:
      - batch
      exporters:
      - otlp
```
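The DeadlineExceeded errors mean the OTLP/gRPC export did not get a response from the backend within the exporter timeout. As a hedged sketch (not verified against this backend), the same otlp exporter can be given a longer timeout and explicit retry/queue settings via the standard exporterhelper options; the values below are placeholders:

```yaml
exporters:
  otlp:
    endpoint: 127.0.0.1:11800
    tls:
      insecure: true
      insecure_skip_verify: true
    timeout: 30s                 # placeholder; the default is 5s
    retry_on_failure:
      enabled: true
      initial_interval: 5s
      max_interval: 30s
      max_elapsed_time: 300s
    sending_queue:
      enabled: true
      queue_size: 1000
```

This only helps if the backend eventually acknowledges the export; if 127.0.0.1:11800 never answers the gRPC call, the retries will keep timing out.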

Environment

Additional context

Maybe some security restriction in Windows 11 causes the hostname to go missing? If I add the label node_identifier_host_name: localhost manually, it works fine (a sketch of that workaround follows below).
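For completeness, a minimal sketch of that manual workaround in the scrape config; the label name is the one from the workaround above, and localhost is just the value I used:

```yaml
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: 'win-monitoring'
          scrape_interval: 30s
          static_configs:
            - targets: ['127.0.0.1:9182']
              labels:
                host_name: prometheus-os-win
                node_identifier_host_name: localhost   # manual workaround described above
```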

CzyerChen commented 1 month ago

No net_host_name or node_identifier_host_name label is added to the labels automatically:

```
metric :Gauge(super=Metric(name=windows_service_status, labels={net_host_port=9182, job_name=windows-monitoring, name=wpnuserservice_152313, server_port=9182, http_scheme=http, service_instance_id=127.0.0.1:9182, url_scheme=http, status=starting}, timestamp=1721900262446), value=0.0)
```
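If it is useful, an alternative I have not verified on this setup would be to let the collector attach the hostname as a resource attribute, assuming the resourcedetection processor from opentelemetry-collector-contrib is included in this build; a rough sketch:

```yaml
processors:
  resourcedetection:
    detectors: [system]   # the system detector sets host.name from the local OS
  batch:

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [resourcedetection, batch]
      exporters: [otlp]
```

Whether the backend then maps host.name to its node_identifier_host_name is a separate question.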