Closed: Cluas closed this issue 6 months ago.

@Cluas:
In the updated version, the `spanmetrics` configuration no longer uses `latency_histogram_buckets`. The correct configuration is:
```yaml
spanmetrics:
  histogram:
    explicit:
      buckets: [100us, 1ms, 2ms, 6ms, 10ms, 100ms, 250ms]
  dimensions_cache_size: 1500
```
@Cluas: If the service is configured as described in the README, the collector panics:

```
panic: invalid access to shared data
```

The fixed `service` section should look like this:
```yaml
service:
  extensions: [pprof, zpages, health_check]
  pipelines:
    logs:
      receivers: [fluentforward, otlp]
      processors: [memory_limiter, resourcedetection/system, resource, batch]
      exporters: [qryn]
    traces:
      receivers: [otlp, jaeger, zipkin]
      processors: [memory_limiter, resourcedetection/system, resource, batch]
      exporters: [qryn]
    traces/spanmetrics:
      receivers: [otlp, jaeger, zipkin]
      exporters: [spanmetrics]
    traces/servicegraph:
      receivers: [otlp, jaeger, zipkin]
      exporters: [servicegraph]
    metrics:
      receivers: [prometheus]
      processors: [memory_limiter, resourcedetection/system, resource, batch]
      exporters: [qryn]
    metrics/spanmetrics:
      receivers: [spanmetrics]
      processors: [memory_limiter, resourcedetection/system, resource, batch]
      exporters: [qryn]
    metrics/servicegraph:
      receivers: [servicegraph]
      processors: [memory_limiter, resourcedetection/system, resource, batch]
      exporters: [qryn]
```
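Whatever the root cause of the panic, a cheap pre-flight check is to verify that every name a pipeline references is declared in the matching top-level section, remembering that a connector may legitimately appear as an exporter in one pipeline and a receiver in another. A minimal stand-alone sketch of that check (the `undeclared_refs` helper and the trimmed-down inline dicts are illustrative, not part of the collector):

```python
# Sketch: cross-check pipeline references against declared components.
# A connector satisfies both "receivers" and "exporters" references.

declared = {
    "receivers": {"otlp", "jaeger", "zipkin", "fluentforward", "prometheus"},
    "processors": {"memory_limiter", "resourcedetection/system", "resource", "batch"},
    "exporters": {"qryn", "otlp/spanmetrics", "prometheus/servicegraph"},
    "connectors": {"spanmetrics", "servicegraph"},
}

# A trimmed-down version of the service wiring above.
pipelines = {
    "traces/spanmetrics": {"receivers": ["otlp"], "exporters": ["spanmetrics"]},
    "metrics/spanmetrics": {"receivers": ["spanmetrics"], "exporters": ["qryn"]},
}

def undeclared_refs(pipelines, declared):
    """Return (pipeline, role, name) triples that reference nothing declared."""
    problems = []
    for pname, roles in pipelines.items():
        for role, names in roles.items():
            # Connectors count on both the receiver and the exporter side.
            valid = declared[role] | declared["connectors"]
            problems += [(pname, role, n) for n in names if n not in valid]
    return problems

print(undeclared_refs(pipelines, declared))  # [] -> wiring is consistent
```

Running this against a pipeline that references an undeclared component returns the offending triple instead of an empty list, which is far easier to act on than a runtime panic.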
@Cluas: The full configuration file is:
```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
  jaeger:
    protocols:
      grpc:
        endpoint: 0.0.0.0:14250
      thrift_http:
        endpoint: 0.0.0.0:14268
  zipkin:
    endpoint: 0.0.0.0:9411
  fluentforward:
    endpoint: 0.0.0.0:24224
  prometheus:
    config:
      scrape_configs:
        - job_name: 'otel-collector'
          scrape_interval: 5s
          static_configs:
            - targets: ['exporter:8080']
processors:
  batch:
    send_batch_size: 10000
    timeout: 5s
  memory_limiter:
    check_interval: 2s
    limit_mib: 1800
    spike_limit_mib: 500
  resourcedetection/system:
    detectors: ['system']
    system:
      hostname_sources: ['os']
  resource:
    attributes:
      - key: service.name
        value: "serviceName"
        action: upsert
  metricstransform:
    transforms:
      - include: calls_total
        action: update
        new_name: traces_spanmetrics_calls_total
      - include: latency
        action: update
        new_name: traces_spanmetrics_latency
connectors:
  spanmetrics:
    histogram:
      explicit:
        buckets: [100us, 1ms, 2ms, 6ms, 10ms, 100ms, 250ms]
    dimensions_cache_size: 1500
  servicegraph:
    metrics_exporter: otlp/servicegraph
    latency_histogram_buckets: [100us, 1ms, 2ms, 6ms, 10ms, 100ms, 250ms]
    dimensions:
      - randomContainer
    store:
      ttl: 2s
      max_items: 200
exporters:
  qryn:
    dsn: clickhouse://default:************@nl.ch.hepic.tel:******/cloki
    timeout: 10s
    sending_queue:
      queue_size: 100
    retry_on_failure:
      enabled: true
      initial_interval: 5s
      max_interval: 30s
      max_elapsed_time: 300s
    logs:
      format: json
  otlp/spanmetrics:
    endpoint: localhost:4317
    tls:
      insecure: true
  prometheus/servicegraph:
    endpoint: localhost:9090
    namespace: servicegraph
extensions:
  health_check:
  pprof:
  zpages:
service:
  extensions: [pprof, zpages, health_check]
  pipelines:
    logs:
      receivers: [fluentforward, otlp]
      processors: [memory_limiter, resourcedetection/system, resource, batch]
      exporters: [qryn]
    traces:
      receivers: [otlp, jaeger, zipkin]
      processors: [memory_limiter, resourcedetection/system, resource, batch]
      exporters: [qryn, spanmetrics, servicegraph]
    metrics:
      receivers: [prometheus, spanmetrics, servicegraph]
      processors: [memory_limiter, resourcedetection/system, resource, batch]
      exporters: [qryn]
```
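For reference, the pattern both comments rely on is that a connector is declared once under `connectors:` and then wired by name on both sides of the pipeline graph. A minimal sketch of just that wiring (component names taken from the config above):

```yaml
connectors:
  spanmetrics:
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [spanmetrics]   # connector consumes spans here...
    metrics:
      receivers: [spanmetrics]   # ...and emits the derived metrics here
      exporters: [qryn]
```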