ccolvin-ebay opened this issue 6 months ago
:wave: Hello! We are still actively working on the remotecfg block and have plans to wire up the UI!
From the logs it appears as though the config is being loaded. If you add a prometheus.scrape and a prometheus.remote_write component to write the metrics, are you able to see metrics flowing through?
Should be something as simple as:
---Config Supplied by API---
prometheus.exporter.windows "os_metrics" {
  enabled_collectors = ["cpu","cs","logical_disk","net","os","system"]
}

prometheus.scrape "os_metrics" {
  targets    = prometheus.exporter.windows.os_metrics.targets
  forward_to = [prometheus.remote_write.default.receiver]
}

prometheus.remote_write "default" {
  endpoint {
    url = <PROMETHEUS REMOTE_WRITE URL>
  }
}
Hi @spartan0x117,
Thanks for getting back to me. Unfortunately, that does not seem to be working either. I do not get any metrics from this Alloy instance hitting my TSDB, but I am seeing this in the output log:
ts=2024-05-02T17:51:00.9059893Z level=debug msg="Scrape failed" component_path=/remotecfg component_id=prometheus.scrape.os_metrics scrape_pool=remotecfg/prometheus.scrape.os_metrics target=http://alloy.internal:12345/api/v0/component/remotecfg/prometheus.exporter.windows.os_metrics/metrics err="server returned HTTP status 400 Bad Request\ngithub.com/prometheus/prometheus/scrape.(*targetScraper).readResponse\n\t/go/pkg/mod/github.com/grafana/prometheus@v1.8.2-0.20240130142130-51b39f24d406/scrape/scrape.go:848\ngithub.com/prometheus/prometheus/scrape.(*scrapeLoop).scrapeAndReport\n\t/go/pkg/mod/github.com/grafana/prometheus@v1.8.2-0.20240130142130-51b39f24d406/scrape/scrape.go:1386\ngithub.com/prometheus/prometheus/scrape.(*scrapeLoop).run\n\t/go/pkg/mod/github.com/grafana/prometheus@v1.8.2-0.20240130142130-51b39f24d406/scrape/scrape.go:1306\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1695"
At first I thought this was because I had set a non-default listen address for Alloy. But after changing the listen address back to the default, I still get this issue.
However, as with the UI and scraping generally, this issue clears up once I move the config to the local file, rather than pulling it from the remote API. I also then start to see the metrics hitting my downstream TSDB.
What's more, when the config for the exporter is set locally, and not pulled remotely, I actually get these messages in the log:
ts=2024-05-02T17:59:25.6120043Z level=debug msg="collector system succeeded after 0.000000s." component_path=/ component_id=prometheus.exporter.windows.os_metrics
ts=2024-05-02T17:59:25.6120043Z level=debug msg="collector net succeeded after 0.000000s." component_path=/ component_id=prometheus.exporter.windows.os_metrics
ts=2024-05-02T17:59:25.6120043Z level=debug msg="collector logical_disk succeeded after 0.000000s." component_path=/ component_id=prometheus.exporter.windows.os_metrics
ts=2024-05-02T17:59:25.6120043Z level=debug msg="collector cs succeeded after 0.000000s." component_path=/ component_id=prometheus.exporter.windows.os_metrics
ts=2024-05-02T17:59:25.6130027Z level=debug msg="collector cpu succeeded after 0.000998s." component_path=/ component_id=prometheus.exporter.windows.os_metrics
ts=2024-05-02T17:59:25.6170077Z level=debug msg="collector os succeeded after 0.005003s." component_path=/ component_id=prometheus.exporter.windows.os_metrics
While I do get a log entry confirming the windows exporter configuration was read and accepted from the API, I do not get any indication that the exporter is actually running. Given that I get clear indication of the exporter running when the config is supplied locally, I believe this suggests that the exporter does not actually start when provided via remote config. It is not just the UI that is affected.
If you look in the data-alloy directory (I forget exactly which subdirectory it's under) there should be a file whose name is a short hash. What are the contents there? This is a cached version of what the collector received from the remote, so I'm just trying to see if there's anything different there.
And could you try making the contents of the pipeline that you're syncing a module which then gets invoked?
It would look something like:
declare "example" {
prometheus.exporter.windows "os_metrics" {
enabled_collectors = ["cpu","cs","logical_disk","net","os","system"]
}
prometheus.scrape "os_metrics" {
targets = prometheus.exporter.windows.os_metrics.targets
forward_to = [prometheus.remote_write.default.receiver]
}
prometheus.remote_write "default" {
endpoint {
url = <PROMETHEUS REMOTE_WRITE URL
}
}
}
example "default" { }
Hi @spartan0x117,
Yes, that local cache of the remote config is under data-alloy/remotecfg, with a file name that appears to be a hash. I found this early on in troubleshooting, and it updates accordingly when the config is updated.
I also tried that custom component idea you suggested, and the behavior is the same, unfortunately. I also set up a blackbox exporter to try something besides the windows exporter, but that also does not work when provided via remote config. However, like the windows exporter, it works fine when supplied via the local config. No config parsing/loading errors in either case.
I've looked through the source, and it seems that loading a remote config and loading a local config both leverage the same methods. I can't see anything that suggests the issue is with the config itself, or how it is loaded. I suspect the issue lies somewhere further downstream, either before or at the point where the components parsed from the remote config would be run.
I can see where the remoteCfgService itself is initialized, and further in where it is run as a RunnableNode (I think), but I don't see the point at which the components the remoteCfgService has loaded get handled/run. Maybe I just haven't found that part yet. I'm very unfamiliar with this code base, but I'm digging.
@spartan0x117,
I've confirmed the same behavior on Ubuntu 22.04 (installed from .deb). Log output confirms that remote configuration has been read, but no indication of the configured collectors actually running afterward.
prometheus.exporter.unix "default" {
  include_exporter_metrics = true
  disable_collectors = ["mdadm"]
}
journalctl -u=alloy\.service output when RemoteCfg is used:

May 07 12:47:15 <hostname> systemd[1]: Started Vendor-agnostic OpenTelemetry Collector distribution with programmable pipelines.
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.230064428Z level=info "boringcrypto enabled"=false
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.230095828Z level=info msg="running usage stats reporter"
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.230101828Z level=info msg="starting complete graph evaluation" controller_path=/ controller_id="" trace_id=dc4b0e467bcebb2416a6410c27c572bc
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.230110428Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=dc4b0e467bcebb2416a6410c27c572bc node_id=logging duration=61.7µs
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.230256728Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=dc4b0e467bcebb2416a6410c27c572bc node_id=remotecfg duration=121.3µs
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.230272928Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=dc4b0e467bcebb2416a6410c27c572bc node_id=labelstore duration=3.2µs
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.230280328Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=dc4b0e467bcebb2416a6410c27c572bc node_id=otel duration=400ns
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.230314728Z level=info msg="applying non-TLS config to HTTP server" service=http
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.230324128Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=dc4b0e467bcebb2416a6410c27c572bc node_id=http duration=14µs
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.230334128Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=dc4b0e467bcebb2416a6410c27c572bc node_id=cluster duration=600ns
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.230340528Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=dc4b0e467bcebb2416a6410c27c572bc node_id=ui duration=500ns
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.230348928Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=dc4b0e467bcebb2416a6410c27c572bc node_id=tracing duration=2.8µs
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.230353828Z level=info msg="finished complete graph evaluation" controller_path=/ controller_id="" trace_id=dc4b0e467bcebb2416a6410c27c572bc duration=346.7µs
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.230362428Z level=debug msg="changing node state" from=viewer to=participant
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.230369328Z level=debug msg="<hostname> @1: participant"
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.230485928Z level=info msg="scheduling loaded components and services"
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.230584928Z level=info msg="starting cluster node" peers="" advertise_addr=127.0.0.1:12345
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.230639928Z level=debug msg="<hostname> @3: participant"
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.230820628Z level=info msg="peers changed" new_peers=<hostname>
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.231043328Z level=info msg="now listening for http traffic" service=http addr=127.0.0.1:12345
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236195128Z level=info msg="starting complete graph evaluation" controller_path=/ controller_id=remotecfg trace_id=922b70d769dc990093ab9f72620120a2
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236386728Z level=info msg="Parsed flag --collector.filesystem.mount-points-exclude" component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=filesystem flag=^/(dev|proc|run/credentials/.+|sys|var/lib/docker/.+)($|/)
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236412028Z level=info msg="Parsed flag --collector.filesystem.fs-types-exclude" component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=filesystem flag=^(autofs|binfmt_misc|bpf|cgroup2?|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|iso9660|mqueue|nsfs|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|selinuxfs|squashfs|sysfs|tracefs)$
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236698028Z level=debug msg="Platform does not support Desktop Management Interface (DMI) information" component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=dmi err="failed to read directory \"/sys/class/dmi/id\": open /sys/class/dmi/id: no such file or directory"
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236735028Z level=info msg="Parsed flag --collector.diskstats.device-exclude" component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=diskstats flag=^(ram|loop|fd|(h|s|v|xv)d[a-z]|nvme\d+n\d+p)\d+$
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236879328Z level=info msg="Enabled node_exporter collectors" component_path=/remotecfg component_id=prometheus.exporter.unix.default
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236908228Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=arp
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236915528Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=bcache
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236918628Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=bonding
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236921128Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=btrfs
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236923428Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=conntrack
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236926128Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=cpu
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236928628Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=cpufreq
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236931228Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=diskstats
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236933728Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=dmi
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236936128Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=edac
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236938628Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=entropy
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236941028Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=fibrechannel
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236943428Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=filefd
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236945828Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=filesystem
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236948228Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=hwmon
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236950728Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=infiniband
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236953428Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=ipvs
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236956028Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=loadavg
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236958628Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=meminfo
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236960928Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=netclass
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236963328Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=netdev
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236965828Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=netstat
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236968228Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=nfs
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236971628Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=nfsd
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236974128Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=nvme
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236976628Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=os
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236979428Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=powersupplyclass
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236981728Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=pressure
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236984128Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=processes
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236986528Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=rapl
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236989528Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=schedstat
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236992028Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=selinux
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236994528Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=sockstat
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236996928Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=softnet
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.236999328Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=stat
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.237001628Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=tapestats
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.237004028Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=textfile
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.237006428Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=thermal_zone
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.237009228Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=time
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.237011728Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=timex
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.237014228Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=udp_queues
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.237016628Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=uname
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.237019028Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=vmstat
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.237022028Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=xfs
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.237024328Z level=info component_path=/remotecfg component_id=prometheus.exporter.unix.default collector=zfs
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.237033428Z level=info msg="finished node evaluation" controller_path=/ controller_id=remotecfg trace_id=922b70d769dc990093ab9f72620120a2 node_id=prometheus.exporter.unix.default duration=749.9µs
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.237052428Z level=info msg="finished complete graph evaluation" controller_path=/ controller_id=remotecfg trace_id=922b70d769dc990093ab9f72620120a2 duration=886.1µs
May 07 12:47:15 <hostname> alloy[213571]: ts=2024-05-07T18:47:15.237150328Z level=info msg="scheduling loaded components and services"
May 07 12:48:15 <hostname> alloy[213571]: ts=2024-05-07T18:48:15.231040991Z level=debug msg="skipping over API response since it contained the same hash" service=remotecfg
curl localhost:12345/api/v0/component/prometheus.exporter.unix.default/metrics output when RemoteCfg is used:

failed to parse URL path "/api/v0/component/prometheus.exporter.unix.default/metrics": invalid path
journalctl -u=alloy\.service output when using the local config:

May 07 13:00:18 <hostname> systemd[1]: Started Vendor-agnostic OpenTelemetry Collector distribution with programmable pipelines.
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103513674Z level=info "boringcrypto enabled"=false
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103568774Z level=info msg="running usage stats reporter"
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103579274Z level=info msg="starting complete graph evaluation" controller_path=/ controller_id="" trace_id=9ae051de75252814392baf0afc8e247a
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103588974Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=9ae051de75252814392baf0afc8e247a node_id=tracing duration=4.1µs
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103593474Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=9ae051de75252814392baf0afc8e247a node_id=otel duration=700ns
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103597274Z level=info msg="applying non-TLS config to HTTP server" service=http
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103599774Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=9ae051de75252814392baf0afc8e247a node_id=http duration=7.7µs
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103602974Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=9ae051de75252814392baf0afc8e247a node_id=cluster duration=300ns
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103606274Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=9ae051de75252814392baf0afc8e247a node_id=ui duration=200ns
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103609574Z level=debug msg="Platform does not support Desktop Management Interface (DMI) information" component_path=/ component_id=prometheus.exporter.unix.default collector=dmi err="failed to read directory \"/sys/class/dmi/id\": open /sys/class/dmi/id: no such file or directory"
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103615774Z level=info msg="Parsed flag --collector.filesystem.mount-points-exclude" component_path=/ component_id=prometheus.exporter.unix.default collector=filesystem flag=^/(dev|proc|run/credentials/.+|sys|var/lib/docker/.+)($|/)
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103619074Z level=info msg="Parsed flag --collector.filesystem.fs-types-exclude" component_path=/ component_id=prometheus.exporter.unix.default collector=filesystem flag=^(autofs|binfmt_misc|bpf|cgroup2?|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|iso9660|mqueue|nsfs|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|selinuxfs|squashfs|sysfs|tracefs)$
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103622974Z level=info msg="Parsed flag --collector.diskstats.device-exclude" component_path=/ component_id=prometheus.exporter.unix.default collector=diskstats flag=^(ram|loop|fd|(h|s|v|xv)d[a-z]|nvme\d+n\d+p)\d+$
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103625874Z level=info msg="Enabled node_exporter collectors" component_path=/ component_id=prometheus.exporter.unix.default
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103628274Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=arp
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103630474Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=bcache
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103634374Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=bonding
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103636574Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=btrfs
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103638474Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=conntrack
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103640474Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=cpu
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103642574Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=cpufreq
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103644974Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=diskstats
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103647074Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=dmi
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103649074Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=edac
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103651374Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=entropy
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103653574Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=fibrechannel
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103655674Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=filefd
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103657774Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=filesystem
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103659674Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=hwmon
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103661674Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=infiniband
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103663674Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=ipvs
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103665774Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=loadavg
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103667874Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=meminfo
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103669874Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=netclass
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103671774Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=netdev
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103673874Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=netstat
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103675774Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=nfs
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103677874Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=nfsd
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103679974Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=nvme
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103681974Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=os
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103683974Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=powersupplyclass
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103685974Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=pressure
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103688674Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=processes
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103690874Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=rapl
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103692774Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=schedstat
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103694774Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=selinux
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103697274Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=sockstat
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103699574Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=softnet
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103701574Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=stat
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103703574Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=tapestats
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103705574Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=textfile
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103707574Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=thermal_zone
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103709974Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=time
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103711974Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=timex
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103714074Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=udp_queues
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103716574Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=uname
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103718574Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=vmstat
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103720674Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=xfs
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103722674Z level=info component_path=/ component_id=prometheus.exporter.unix.default collector=zfs
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103724774Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=9ae051de75252814392baf0afc8e247a node_id=prometheus.exporter.unix.default duration=529.7µs
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103730474Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=9ae051de75252814392baf0afc8e247a node_id=logging duration=293.3µs
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103761974Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=9ae051de75252814392baf0afc8e247a node_id=remotecfg duration=24.7µs
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103772274Z level=info msg="finished node evaluation" controller_path=/ controller_id="" trace_id=9ae051de75252814392baf0afc8e247a node_id=labelstore duration=3.3µs
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103777274Z level=info msg="finished complete graph evaluation" controller_path=/ controller_id="" trace_id=9ae051de75252814392baf0afc8e247a duration=951.3µs
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103804174Z level=debug msg="changing node state" from=viewer to=participant
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103815774Z level=debug msg="<hostname> @1: participant"
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.103949074Z level=info msg="scheduling loaded components and services"
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.104091874Z level=info msg="starting cluster node" peers="" advertise_addr=127.0.0.1:12345
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.104161274Z level=debug msg="<hostname> @3: participant"
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.104310174Z level=info msg="now listening for http traffic" service=http addr=127.0.0.1:12345
May 07 13:00:18 <hostname> alloy[213686]: ts=2024-05-07T19:00:18.104308574Z level=info msg="peers changed" new_peers=<hostname>
curl localhost:12345/api/v0/component/prometheus.exporter.unix.default/metrics output when using the local config:

# HELP node_arp_entries ARP entries by device
# TYPE node_arp_entries gauge
node_arp_entries{device="eth0"} 1
# HELP node_boot_time_seconds Node boot time, in unixtime.
# TYPE node_boot_time_seconds gauge
node_boot_time_seconds 1.714411763e+09
# HELP node_context_switches_total Total number of context switches.
# TYPE node_context_switches_total counter
node_context_switches_total 8.9743154e+07
# HELP node_cooling_device_cur_state Current throttle state of the cooling device
# TYPE node_cooling_device_cur_state gauge
node_cooling_device_cur_state{name="0",type="Processor"} 0
node_cooling_device_cur_state{name="1",type="Processor"} 0
node_cooling_device_cur_state{name="10",type="Processor"} 0
node_cooling_device_cur_state{name="11",type="Processor"} 0
node_cooling_device_cur_state{name="12",type="Processor"} 0
node_cooling_device_cur_state{name="13",type="Processor"} 0
node_cooling_device_cur_state{name="14",type="Processor"} 0
node_cooling_device_cur_state{name="15",type="Processor"} 0
node_cooling_device_cur_state{name="2",type="Processor"} 0
node_cooling_device_cur_state{name="3",type="Processor"} 0
node_cooling_device_cur_state{name="4",type="Processor"} 0
node_cooling_device_cur_state{name="5",type="Processor"} 0
node_cooling_device_cur_state{name="6",type="Processor"} 0
node_cooling_device_cur_state{name="7",type="Processor"} 0
node_cooling_device_cur_state{name="8",type="Processor"} 0
node_cooling_device_cur_state{name="9",type="Processor"} 0
# HELP node_cooling_device_max_state Maximum throttle state of the cooling device
# TYPE node_cooling_device_max_state gauge
node_cooling_device_max_state{name="0",type="Processor"} 0
node_cooling_device_max_state{name="1",type="Processor"} 0
node_cooling_device_max_state{name="10",type="Processor"} 0
node_cooling_device_max_state{name="11",type="Processor"} 0
node_cooling_device_max_state{name="12",type="Processor"} 0
node_cooling_device_max_state{name="13",type="Processor"} 0
node_cooling_device_max_state{name="14",type="Processor"} 0
node_cooling_device_max_state{name="15",type="Processor"} 0
node_cooling_device_max_state{name="2",type="Processor"} 0
node_cooling_device_max_state{name="3",type="Processor"} 0
node_cooling_device_max_state{name="4",type="Processor"} 0
node_cooling_device_max_state{name="5",type="Processor"} 0
node_cooling_device_max_state{name="6",type="Processor"} 0
node_cooling_device_max_state{name="7",type="Processor"} 0
node_cooling_device_max_state{name="8",type="Processor"} 0
node_cooling_device_max_state{name="9",type="Processor"} 0
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 695633.92
node_cpu_seconds_total{cpu="0",mode="iowait"} 3.53
node_cpu_seconds_total{cpu="0",mode="irq"} 0
node_cpu_seconds_total{cpu="0",mode="nice"} 0
node_cpu_seconds_total{cpu="0",mode="softirq"} 78.77
node_cpu_seconds_total{cpu="0",mode="steal"} 0
node_cpu_seconds_total{cpu="0",mode="system"} 201.96
node_cpu_seconds_total{cpu="0",mode="user"} 90.4
node_cpu_seconds_total{cpu="1",mode="idle"} 695454.25
node_cpu_seconds_total{cpu="1",mode="iowait"} 2.85
node_cpu_seconds_total{cpu="1",mode="irq"} 0
node_cpu_seconds_total{cpu="1",mode="nice"} 1.26
node_cpu_seconds_total{cpu="1",mode="softirq"} 12.27
node_cpu_seconds_total{cpu="1",mode="steal"} 0
node_cpu_seconds_total{cpu="1",mode="system"} 310.31
node_cpu_seconds_total{cpu="1",mode="user"} 199.49
node_cpu_seconds_total{cpu="10",mode="idle"} 695544.2
node_cpu_seconds_total{cpu="10",mode="iowait"} 3.35
node_cpu_seconds_total{cpu="10",mode="irq"} 0
node_cpu_seconds_total{cpu="10",mode="nice"} 0.65
node_cpu_seconds_total{cpu="10",mode="softirq"} 0.3
...
As this seems to be behavior beyond that of just the Windows build, I've changed the name of the issue accordingly. Please advise on next steps when able. I'm still digging through the source myself to see if I can identify the issue. However, I may be stuck needing to write a wrapper around Alloy to handle remote configuration for my use case, as that may be faster, frankly.
Is there a practical example of this working somewhere out there?
Have you looked at import.git or import.http?
And thank you for the highly detailed outputs, that's extremely helpful! :smile:
Hope it's helpful. Though I'd feel more helpful if I could point to where it is broken, not just that it is broken.
As for import.git and import.http, I have looked at them. import.git is a non-starter for us, as the agents will be spread across many zones, both trusted and untrusted. import.http would work if it had the ability to secure that connection with TLS, which does not appear to be an option for that block.
Alloy is pretty much Grafana Agent, rebranded and with Static mode stripped out, right? If that is correct, then maybe I should try testing with Flow. If it works with Flow, then perhaps the issue arose during some of the renaming work prior to Alloy's release. I'm grasping at straws here, admittedly.
Given time constraints I'm under, I've written a wrapper for Alloy that starts Alloy proper as a child process, handles communication with the remote config API, then writes that config to Alloy's main config file. Around that, it restarts the Alloy process as needed. This solves my immediate problem, and also provides some unique benefits above what remote config offers.
I may still discard this wrapper whenever remote config is fully operational again, but until then I need something that can move me forward. Regardless, thank you all for what you do.
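For anyone curious what that wrapper approach looks like in outline, here is a minimal sketch rather than my actual implementation; the endpoint URL, config path, and poll interval are placeholders, and a real version would handle errors and stop the child gracefully instead of killing it:

package main

import (
    "bytes"
    "io"
    "net/http"
    "os"
    "os/exec"
    "time"
)

const (
    configURL  = "https://config.example.internal/alloy/config" // placeholder endpoint
    configPath = "/etc/alloy/config.alloy"                      // Alloy's main config file
)

func fetchConfig() ([]byte, error) {
    resp, err := http.Get(configURL)
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()
    return io.ReadAll(resp.Body)
}

func startAlloy() *exec.Cmd {
    cmd := exec.Command("alloy", "run", configPath)
    cmd.Stdout = os.Stdout
    cmd.Stderr = os.Stderr
    _ = cmd.Start()
    return cmd
}

func main() {
    current, _ := fetchConfig()
    _ = os.WriteFile(configPath, current, 0o644)
    child := startAlloy()

    // Poll the remote API; rewrite the config file and restart Alloy only
    // when the returned config actually changes.
    for range time.Tick(time.Minute) {
        next, err := fetchConfig()
        if err != nil || bytes.Equal(next, current) {
            continue
        }
        _ = os.WriteFile(configPath, next, 0o644)
        _ = child.Process.Kill() // a real wrapper would signal and wait for a clean shutdown
        _ = child.Wait()
        child = startAlloy()
        current = next
    }
}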
Sorry for the delay, but I was surprised that import.http didn't support configuring TLS, so I did a bit more digging and found that it looks to be missing from the documentation! I've opened a PR to fix it. In the code, however, the Client is present in import.http's Arguments struct and is set in the call to remote_http.New(...).
For the time being, import.http embeds a remote.http component, so the configuration of the client is the same. You can refer to the remote.http component docs to see how it's configured, but it should look something like:
import.http "os_metrics" {
url = REMOTE_CONFIG_URL
client {
tls_config {
ca_file = CA_FILE // or ca_pem if you want to inline it
cert_file = CERT_FILE // or cert_pem if you want to inline it
key_file = KEY_FILE // or key_pem if you want to inline it
}
}
}
That is good to know. I'll take a look to see if that will suffice for my use case. Thanks for following up!
Hi, can you give me more details how did you set up the remote config server? Did you try to use alloy remote config server code on Github?
@corgilovetea: I did use the https://github.com/grafana/alloy-remote-config/ project. But I guess I don't understand the question, in this context. The API server works fine, clients using the same generated protobuf methods and types interact with it just fine. The server does not seem to be the issue.
Is this question related to my issue, or are you looking for assistance in getting the remote config server working?
Hi @ccolvin-ebay, thanks for your reply. I need assistance in getting the remote config server working. Can you describe a little how you made the changes to the relevant /api//.proto files to map to your local Alloy config file?
@corgilovetea The ConnectRPC docs on creating a handler in Go using generated connect service definitions may be helpful :smiley:
@corgilovetea, no worries. It is confusing for those not familiar with protobuf and ConnectRPC (I was not familiar with it either when I started looking into it.)
You don't need to modify the generated files at all. In fact, at least as of my last usage of them, you don't even need to generate them yourself. Simply include these three lines in your import block:
"connectrpc.com/connect"
agentv1 "github.com/grafana/agent-remote-config/api/gen/proto/go/agent/v1"
agentv1connect "github.com/grafana/agent-remote-config/api/gen/proto/go/agent/v1/agentv1connect"
Make sure to run a go mod tidy, naturally. There are already generated files committed to that repo, so all we're doing here is importing them directly from GitHub as modules and using the types and methods from them.
Then, to set up the basic server, you need to define a type that conforms to the AgentServiceHandler interface from agentv1connect, which is just a struct that implements the methods the interface defines. That type is then passed to agentv1connect.NewAgentServiceHandler() to generate an http.Handler you can use with whatever HTTP server you choose.
Here is the main.go file from the minimal example I set up to explore this use case. Feel free to use it as a reference. It is not something that is set up for production use cases, but is a functional base implementation of a server that provides a config when requested from an Alloy instance.
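Stripped to its essentials, that wiring looks roughly like the sketch below. Treat the field names on GetConfigRequest/GetConfigResponse and the generated UnimplementedAgentServiceHandler stub as assumptions from memory; check them against the generated code in that module before relying on them.

package main

import (
    "context"
    "log"
    "net/http"
    "os"

    "connectrpc.com/connect"
    agentv1 "github.com/grafana/agent-remote-config/api/gen/proto/go/agent/v1"
    agentv1connect "github.com/grafana/agent-remote-config/api/gen/proto/go/agent/v1/agentv1connect"
)

// configServer satisfies agentv1connect.AgentServiceHandler. Embedding the
// generated Unimplemented stub means every method we don't override simply
// returns CodeUnimplemented.
type configServer struct {
    agentv1connect.UnimplementedAgentServiceHandler
}

// GetConfig serves the same file-backed config to every caller. The Content
// field name is an assumption; verify it against the generated GetConfigResponse.
func (s *configServer) GetConfig(
    ctx context.Context,
    req *connect.Request[agentv1.GetConfigRequest],
) (*connect.Response[agentv1.GetConfigResponse], error) {
    cfg, err := os.ReadFile("remote.alloy")
    if err != nil {
        return nil, connect.NewError(connect.CodeInternal, err)
    }
    return connect.NewResponse(&agentv1.GetConfigResponse{Content: string(cfg)}), nil
}

func main() {
    // NewAgentServiceHandler returns the route prefix and an http.Handler that
    // speaks the Connect protocol; unary calls work over plain HTTP/1.1.
    path, handler := agentv1connect.NewAgentServiceHandler(&configServer{})
    mux := http.NewServeMux()
    mux.Handle(path, handler)
    log.Fatal(http.ListenAndServe(":8080", mux))
}

Point the url in the remotecfg block at wherever this server listens.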
However, as my issue states, remote config is not entirely functional currently. As @spartan0x117 mentions before, it would likely be better to use import.git or import.http in the short term. These should also be much simpler to set up, if your use case allows for them.
This issue has not had any activity in the past 30 days, so the needs-attention label has been added to it.
If the opened issue is a bug, check to see if a newer release fixed your issue. If it is no longer relevant, please feel free to close this issue.
The needs-attention label signals to maintainers that something has fallen through the cracks. No action is needed by you; your issue will be kept open and you do not have to respond to this comment. The label will be removed the next time this job runs if there is new activity.
Thank you for your contributions!
Did you make any progress on this? It looks like the Windows exporter is not working with remote config.
What's wrong?
I'm trying to implement a small end-to-end POC for using Alloy with the remotecfg block. To this end, I've set up a minimal example of an Alloy Remote Config API that, for the moment, just pulls a config from the file system and serves it to any agent that requests a configuration. The API part works fine, but for reference here is the GetConfig implementation:
GetConfig implementation
I can see in the log output that the config is read correctly and seems to have taken, but then the agent does not behave as if that new config is present. This is best shown in these snips from the UI:
If I manually place the same config block into the local starting config file for Alloy, it works just fine:
Reloading Alloy has no effect on the outcome. It appears that there must be some issue preventing the remote config received from applying correctly. I have not yet tested this on Linux, but will today. Will update if the issue is present there as well.
Steps to reproduce
System information
Windows Server 2016 Standalone x86_64
Software version
Grafana Alloy v1.0.0
Configuration
Logs