Open KokoTa opened 1 year ago
Can you share the config you used for binary? Is it just failing to scrape one metric? or are all metrics not working?
@srikanthccv I used curl -sL https://github.com/SigNoz/benchmark/raw/main/dashboards/hostmetrics/hostmetrics-import.sh | bash
from the chapter above.
The machine running SigNoz shows the dashboard metrics, but all other machines fail.
I tested a VM server and a cloud server; both fail to report metrics.
Maybe the problem comes from here: https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/18232 ?
I also tried a newer version of opentelemetry-collector-contrib
and got the same result.
While the issue is related, it wouldn't make the other scrapers not work. You should still be able to see metrics for the remaining scrapers, except for the process scraper.
@srikanthccv I upgraded SigNoz and opentelemetry-collector-contrib, and now I see this:
(the machine is using the plain binary: wget https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.79.0/otelcol-contrib_0.79.0_linux_amd64.tar.gz
)
2023-08-08T18:34:53.464-0700 info service/telemetry.go:104 Setting up own telemetry...
2023-08-08T18:34:53.467-0700 info service/telemetry.go:127 Serving Prometheus metrics {"address": "0.0.0.0:8888", "level": "Basic"}
2023-08-08T18:34:53.472-0700 info service/service.go:131 Starting otelcol-contrib... {"Version": "0.79.0", "NumCPU": 8}
2023-08-08T18:34:53.472-0700 info extensions/extensions.go:30 Starting extensions...
2023-08-08T18:34:53.472-0700 info extensions/extensions.go:33 Extension is starting... {"kind": "extension", "name": "health_check"}
2023-08-08T18:34:53.472-0700 info healthcheckextension@v0.79.0/healthcheckextension.go:34 Starting health_check extension {"kind": "extension", "name": "health_check", "config": {"Endpoint":"0.0.0.0:13133","TLSSetting":null,"CORS":null,"Auth":null,"MaxRequestBodySize":0,"IncludeMetadata":false,"Path":"/","ResponseBody":null,"CheckCollectorPipeline":{"Enabled":false,"Interval":"5m","ExporterFailureThreshold":5}}}
2023-08-08T18:34:53.472-0700 warn internal/warning.go:40 Using the 0.0.0.0 address exposes this server to every network interface, which may facilitate Denial of Service attacks {"kind": "extension", "name": "health_check", "documentation": "https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks"}
2023-08-08T18:34:53.472-0700 info extensions/extensions.go:37 Extension started. {"kind": "extension", "name": "health_check"}
2023-08-08T18:34:53.472-0700 info extensions/extensions.go:33 Extension is starting... {"kind": "extension", "name": "zpages"}
2023-08-08T18:34:53.472-0700 info zpagesextension@v0.79.0/zpagesextension.go:53 Registered zPages span processor on tracer provider {"kind": "extension", "name": "zpages"}
2023-08-08T18:34:53.472-0700 info zpagesextension@v0.79.0/zpagesextension.go:63 Registered Host's zPages {"kind": "extension", "name": "zpages"}
2023-08-08T18:34:53.473-0700 info zpagesextension@v0.79.0/zpagesextension.go:75 Starting zPages extension {"kind": "extension", "name": "zpages", "config": {"TCPAddr":{"Endpoint":"localhost:55679"}}}
2023-08-08T18:34:53.473-0700 info extensions/extensions.go:37 Extension started. {"kind": "extension", "name": "zpages"}
2023-08-08T18:34:53.475-0700 warn internal/warning.go:40 Using the 0.0.0.0 address exposes this server to every network interface, which may facilitate Denial of Service attacks {"kind": "receiver", "name": "otlp", "data_type": "traces", "documentation": "https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks"}
2023-08-08T18:34:53.475-0700 info otlpreceiver@v0.79.0/otlp.go:83 Starting GRPC server {"kind": "receiver", "name": "otlp", "data_type": "traces", "endpoint": "0.0.0.0:4317"}
2023-08-08T18:34:53.475-0700 warn internal/warning.go:40 Using the 0.0.0.0 address exposes this server to every network interface, which may facilitate Denial of Service attacks {"kind": "receiver", "name": "otlp", "data_type": "traces", "documentation": "https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks"}
2023-08-08T18:34:53.475-0700 info otlpreceiver@v0.79.0/otlp.go:101 Starting HTTP server {"kind": "receiver", "name": "otlp", "data_type": "traces", "endpoint": "0.0.0.0:4318"}
2023-08-08T18:34:53.475-0700 info internal/resourcedetection.go:125 began detecting resource information {"kind": "processor", "name": "resourcedetection", "pipeline": "metrics/internal"}
2023-08-08T18:34:53.475-0700 info internal/resourcedetection.go:139 detected resource information {"kind": "processor", "name": "resourcedetection", "pipeline": "metrics/internal", "resource": {"host.id":"9a32b987173a410b9dfbed9aa2746f2a","host.name":"192.168.2.102","os.type":"linux"}}
2023-08-08T18:34:53.476-0700 info prometheusreceiver@v0.79.0/metrics_receiver.go:242 Starting discovery manager {"kind": "receiver", "name": "prometheus", "data_type": "metrics"}
2023-08-08T18:34:53.841-0700 info prometheusreceiver@v0.79.0/metrics_receiver.go:233 Scrape job added {"kind": "receiver", "name": "prometheus", "data_type": "metrics", "jobName": "otel-collector-binary"}
2023-08-08T18:34:53.842-0700 info prometheusreceiver@v0.79.0/metrics_receiver.go:281 Starting scrape manager {"kind": "receiver", "name": "prometheus", "data_type": "metrics"}
2023-08-08T18:34:53.843-0700 info healthcheck/handler.go:129 Health Check state change {"kind": "extension", "name": "health_check", "status": "ready"}
2023-08-08T18:34:53.843-0700 info service/service.go:148 Everything is ready. Begin running and processing data.
2023-08-08T18:34:55.135-0700 error scraperhelper/scrapercontroller.go:213 Error scraping metrics {"kind": "receiver", "name": "hostmetrics", "data_type": "metrics", "error": "error reading parent pid for process \"systemd\" (pid 1): invalid pid 0; error reading process executable for pid 2: readlink /proc/2/exe: no such file or directory; error reading parent pid for process \"kthreadd\" (pid 2): invalid pid 0; error reading process executable for pid 4: readlink /proc/4/exe: no such file or directory; error reading process executable for pid 6: readlink /proc/6/exe: no such file or directory; error reading process executable for pid 7: readlink /proc/7/exe: no such file or directory; error reading process executable for pid 8: readlink /proc/8/exe: no such file or directory; error reading process executable for pid 9: readlink /proc/9/exe: no such file or directory; error reading process executable for pid 10: readlink /proc/10/exe: no such file or directory; error reading process executable for pid 11: readlink /proc/11/exe: no such file or directory; error reading process executable for pid 12: readlink /proc/12/exe: no such file or directory; error reading process executable for pid 13: readlink /proc/13/exe: no such file or directory; error reading process executable for pid 14: readlink /proc/14/exe: no such file or directory; error reading process executable for pid 16: readlink /proc/16/exe: no such file or directory; error reading process executable for pid 17: readlink /proc/17/exe: no such file or directory; error reading process executable for pid 18: readlink /proc/18/exe: no such file or directory; error reading process executable for pid 19: readlink /proc/19/exe: no such file or directory; error reading process executable for pid 21: readlink /proc/21/exe: no such file or directory; error reading process executable for pid 22: readlink /proc/22/exe: no such file or directory; error reading process executable for pid 23: readlink /proc/23/exe: no 
such file or directory; error reading process executable for pid 24: readlink /proc/24/exe: no such file or directory; error reading process executable for pid 26: readlink /proc/26/exe: no such file or directory; error reading process executable for pid 27: readlink /proc/27/exe: no such file or directory; error reading process executable for pid 28: readlink /proc/28/exe: no such file or directory; error reading process executable for pid 29: readlink /proc/29/exe: no such file or directory; error reading process executable for pid 31: readlink /proc/31/exe: no such file or directory; error reading process executable for pid 32: readlink /proc/32/exe: no such file or directory; error reading process executable for pid 33: readlink /proc/33/exe: no such file or directory; error reading process executable for pid 34: readlink /proc/34/exe: no such file or directory; error reading process executable for pid 36: readlink /proc/36/exe: no such file or directory; error reading process executable for pid 37: readlink /proc/37/exe: no such file or directory; error reading process executable for pid 38: readlink /proc/38/exe: no such file or directory; error reading process executable for pid 39: readlink /proc/39/exe: no such file or directory; error reading process executable for pid 41: readlink /proc/41/exe: no such file or directory; error reading process executable for pid 42: readlink /proc/42/exe: no such file or directory; error reading process executable for pid 43: readlink /proc/43/exe: no such file or directory; error reading process executable for pid 44: readlink /proc/44/exe: no such file or directory; error reading process executable for pid 46: readlink /proc/46/exe: no such file or directory; error reading process executable for pid 48: readlink /proc/48/exe: no such file or directory; error reading process executable for pid 49: readlink /proc/49/exe: no such file or directory; error reading process executable for pid 50: readlink /proc/50/exe: no such 
file or directory; error reading process executable for pid 51: readlink /proc/51/exe: no such file or directory; error reading process executable for pid 52: readlink /proc/52/exe: no such file or directory; error reading process executable for pid 53: readlink /proc/53/exe: no such file or directory; error reading process executable for pid 54: readlink /proc/54/exe: no such file or directory; error reading process executable for pid 55: readlink /proc/55/exe: no such file or directory; error reading process executable for pid 56: readlink /proc/56/exe: no such file or directory; error reading process executable for pid 57: readlink /proc/57/exe: no such file or directory; error reading process executable for pid 58: readlink /proc/58/exe: no such file or directory; error reading process executable for pid 59: readlink /proc/59/exe: no such file or directory; error reading process executable for pid 65: readlink /proc/65/exe: no such file or directory; error reading process executable for pid 66: readlink /proc/66/exe: no such file or directory; error reading process executable for pid 67: readlink /proc/67/exe: no such file or directory; error reading process executable for pid 68: readlink /proc/68/exe: no such file or directory; error reading process executable for pid 76: readlink /proc/76/exe: no such file or directory; error reading process executable for pid 78: readlink /proc/78/exe: no such file or directory; error reading process executable for pid 79: readlink /proc/79/exe: no such file or directory; error reading process executable for pid 81: readlink /proc/81/exe: no such file or directory; error reading process executable for pid 83: readlink /proc/83/exe: no such file or directory; error reading process executable for pid 97: readlink /proc/97/exe: no such file or directory; error reading process executable for pid 133: readlink /proc/133/exe: no such file or directory; error reading process executable for pid 295: readlink /proc/295/exe: no such 
file or directory; error reading process executable for pid 296: readlink /proc/296/exe: no such file or directory; error reading process executable for pid 297: readlink /proc/297/exe: no such file or directory; error reading process executable for pid 300: readlink /proc/300/exe: no such file or directory; error reading process executable for pid 306: readlink /proc/306/exe: no such file or directory; error reading process executable for pid 307: readlink /proc/307/exe: no such file or directory; error reading process executable for pid 310: readlink /proc/310/exe: no such file or directory; error reading process executable for pid 311: readlink /proc/311/exe: no such file or directory; error reading process executable for pid 312: readlink /proc/312/exe: no such file or directory; error reading process executable for pid 313: readlink /proc/313/exe: no such file or directory; error reading process executable for pid 316: readlink /proc/316/exe: no such file or directory; error reading process executable for pid 317: readlink /proc/317/exe: no such file or directory; error reading process executable for pid 331: readlink /proc/331/exe: no such file or directory; error reading process executable for pid 344: readlink /proc/344/exe: no such file or directory; error reading process executable for pid 345: readlink /proc/345/exe: no such file or directory; error reading process executable for pid 346: readlink /proc/346/exe: no such file or directory; error reading process executable for pid 347: readlink /proc/347/exe: no such file or directory; error reading process executable for pid 348: readlink /proc/348/exe: no such file or directory; error reading process executable for pid 349: readlink /proc/349/exe: no such file or directory; error reading process executable for pid 350: readlink /proc/350/exe: no such file or directory; error reading process executable for pid 351: readlink /proc/351/exe: no such file or directory; error reading process executable for pid 
352: readlink /proc/352/exe: no such file or directory; error reading process executable for pid 353: readlink /proc/353/exe: no such file or directory; error reading process executable for pid 354: readlink /proc/354/exe: no such file or directory; error reading process executable for pid 355: readlink /proc/355/exe: no such file or directory; error reading process executable for pid 439: readlink /proc/439/exe: no such file or directory; error reading process executable for pid 563: readlink /proc/563/exe: no such file or directory; error reading process executable for pid 594: readlink /proc/594/exe: no such file or directory; error reading process executable for pid 595: readlink /proc/595/exe: no such file or directory; error reading process executable for pid 596: readlink /proc/596/exe: no such file or directory; error reading process executable for pid 598: readlink /proc/598/exe: no such file or directory; error reading process executable for pid 599: readlink /proc/599/exe: no such file or directory; error reading process executable for pid 600: readlink /proc/600/exe: no such file or directory; error reading process executable for pid 601: readlink /proc/601/exe: no such file or directory; error reading process executable for pid 603: readlink /proc/603/exe: no such file or directory; error reading process executable for pid 648: readlink /proc/648/exe: no such file or directory; error reading process executable for pid 649: readlink /proc/649/exe: no such file or directory; error reading process executable for pid 812: readlink /proc/812/exe: no such file or directory; error reading process executable for pid 871: readlink /proc/871/exe: no such file or directory; error reading process executable for pid 1087: readlink /proc/1087/exe: no such file or directory; error reading process executable for pid 1950: readlink /proc/1950/exe: no such file or directory; error reading process executable for pid 18015: readlink /proc/18015/exe: no such file or 
directory; error reading process executable for pid 20425: readlink /proc/20425/exe: no such file or directory; error reading process executable for pid 20591: readlink /proc/20591/exe: no such file or directory; error reading process executable for pid 27585: readlink /proc/27585/exe: no such file or directory; error reading process executable for pid 27626: readlink /proc/27626/exe: no such file or directory; error reading process executable for pid 27743: readlink /proc/27743/exe: no such file or directory; error reading process executable for pid 28064: readlink /proc/28064/exe: no such file or directory; error reading process executable for pid 58704: readlink /proc/58704/exe: no such file or directory; error reading process executable for pid 99701: readlink /proc/99701/exe: no such file or directory; error reading process executable for pid 101271: readlink /proc/101271/exe: no such file or directory; error reading process executable for pid 103449: readlink /proc/103449/exe: no such file or directory; error reading process executable for pid 113258: readlink /proc/113258/exe: no such file or directory; error reading process executable for pid 117018: readlink /proc/117018/exe: no such file or directory; error reading process executable for pid 124484: readlink /proc/124484/exe: no such file or directory; error reading process executable for pid 124653: readlink /proc/124653/exe: no such file or directory; error reading process executable for pid 127679: readlink /proc/127679/exe: no such file or directory; error reading process executable for pid 128054: readlink /proc/128054/exe: no such file or directory; error reading process executable for pid 128305: readlink /proc/128305/exe: no such file or directory; error reading process executable for pid 128631: readlink /proc/128631/exe: no such file or directory; error reading process executable for pid 129132: readlink /proc/129132/exe: no such file or directory; error reading process executable for pid 
129290: readlink /proc/129290/exe: no such file or directory", "scraper": "process"}
go.opentelemetry.io/collector/receiver/scraperhelper.(*controller).scrapeMetricsAndReport
go.opentelemetry.io/collector/receiver@v0.79.0/scraperhelper/scrapercontroller.go:213
go.opentelemetry.io/collector/receiver/scraperhelper.(*controller).startScraping.func1
go.opentelemetry.io/collector/receiver@v0.79.0/scraperhelper/scrapercontroller.go:188
There should be "No Data" if the host didn't send any data. Are these two machines of the same kind?
@srikanthccv
The two machines are the same. This is the machine info:
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
I am not sure what the issue is. Just to rephrase your setup based on my understanding: you have two machines, and one of them runs the SigNoz deployment. The machine that runs SigNoz has its host metrics working. The other machine uses the binary and has a pipeline that exports data to SigNoz using the OTLP exporter. You expect the other machine's host metrics to work, but they are not working. There are two broad possibilities: 1. Your second machine is not sending any data at all. 2. It sends data, but the dashboard is not working (which is less likely). Can you confirm whether the other machine is sending data? Can you check whether you see any errors in the console?
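One way to check whether the second machine is sending anything is to read the collector's own telemetry, which the startup log above shows being served on port 8888. The following is a minimal sketch, not a verified procedure: it assumes the collector exposes counters named otelcol_exporter_sent_metric_points and otelcol_exporter_send_failed_metric_points (the names, and an optional _total suffix, can differ between collector versions). The helper names exporter_counters and fetch_counters are hypothetical.

```python
import urllib.request

# Assumed counter names; verify them against your collector's /metrics output.
WANTED = (
    "otelcol_exporter_sent_metric_points",
    "otelcol_exporter_send_failed_metric_points",
)


def exporter_counters(prom_text):
    """Sum the sent/failed metric-point counters found in Prometheus text output."""
    totals = {name: 0.0 for name in WANTED}
    for line in prom_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # A sample looks like: metric_name{label="x"} 123
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name.endswith("_total"):
            name = name[: -len("_total")]
        if name in totals:
            totals[name] += float(line.rsplit(" ", 1)[1])
    return totals


def fetch_counters(url="http://localhost:8888/metrics"):
    """Fetch the collector's self-metrics and return the summed counters."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return exporter_counters(resp.read().decode("utf-8"))
```

Running fetch_counters() on the second machine a few times, some seconds apart, should show the sent counter growing if data is leaving the collector; a growing failed counter would instead point at the export path.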
@srikanthccv Yes, after my test I think the second machine is not sending data. In the shell log above, the collector reports the error: error reading parent pid for process \"systemd\" (pid 1): invalid pid 0, and SigNoz does not receive any data. Maybe the error causes the data send to fail?
To my knowledge, that shouldn't be the case because the issue is coming from the process scraper, but the rest of the scrapers, such as cpu, memory, etc., should all work. You could also be getting the same error in the SigNoz deployment machine.
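To see which scraper a given "Error scraping metrics" entry comes from, the structured part of the log line can be parsed. This is a small sketch that assumes the whole log entry has been joined onto a single line and that it ends with a JSON object, as in the log pasted earlier in this thread. The function name failing_scraper is hypothetical.

```python
import json


def failing_scraper(log_line):
    """Return (scraper_name, error_count) for one collector error log line.

    Returns None if the line carries no JSON payload.
    """
    start = log_line.find("{")
    if start == -1:
        return None
    fields = json.loads(log_line[start:])
    # The scraper joins individual errors with "; " inside one string.
    errors = [e for e in fields.get("error", "").split("; ") if e]
    return fields.get("scraper"), len(errors)
```

If every such entry reports "process" as the scraper, that matches the point that the other scrapers (cpu, memory, and so on) are not the ones failing.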
Please share the collector contrib config on the second machine.
@srikanthccv
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
  hostmetrics:
    collection_interval: 10s
    scrapers:
      cpu: {}
      disk: {}
      load: {}
      filesystem: {}
      memory: {}
      network: {}
      paging: {}
      process:
        mute_process_name_error: true
      processes: {}
  prometheus:
    config:
      global:
        scrape_interval: 10s
      scrape_configs:
        - job_name: otel-collector-binary
          static_configs:
            - targets: ['localhost:8888']
processors:
  batch:
    send_batch_size: 1000
    timeout: 10s
  # Ref: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/resourcedetectionprocessor/README.md
  resourcedetection:
    detectors: [env, system] # include ec2 for AWS, gcp for GCP and azure for Azure.
    # Using OTEL_RESOURCE_ATTRIBUTES envvar, env detector adds custom labels.
    timeout: 2s
    system:
      hostname_sources: [os] # alternatively, use [dns,os] for setting FQDN as host.name and os as fallback
extensions:
  health_check: {}
  zpages: {}
exporters:
  otlp:
    endpoint: 192.168.2.101:4317
    tls:
      insecure: true
  logging:
    # verbosity of the logging export: detailed, normal, basic
    verbosity: normal
service:
  telemetry:
    metrics:
      address: 0.0.0.0:8888
  extensions: [health_check, zpages]
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
    metrics/internal:
      receivers: [prometheus, hostmetrics]
      processors: [resourcedetection, batch]
      exporters: [otlp]
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
I only changed the endpoint attribute.
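Since the only change is the exporter endpoint, a quick sanity check is whether that address is reachable from the second machine at all. The sketch below only tests that a TCP connection can be opened; it does not prove that OTLP export works end to end. The function name can_connect is hypothetical, and the host and port in the usage note are taken from the config above.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling can_connect("192.168.2.101", 4317) on 192.168.2.102 and getting False would suggest a firewall or routing problem between the two machines rather than a scraper problem.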
I will have to test this on a real machine with the same config to say anything more about it.
@srikanthccv Thanks for your help^ ^
Bug description
I have two machines, with IPs 192.168.2.102 and 192.168.2.101.
SigNoz is installed on 192.168.2.101 and its dashboard shows metrics successfully:
But following the "OpenTelemetry Binary Usage in Virtual Machine" chapter to collect metrics from 192.168.2.102 fails.
It shows the following:
Expected behavior
Fetch host metrics from a different machine
How to reproduce
https://signoz.io/docs/tutorial/opentelemetry-binary-usage-in-virtual-machine/#plain-binary
Version information
Additional context