aws-samples / aws-parallelcluster-monitoring

Monitoring Dashboard for AWS ParallelCluster
MIT No Attribution

Prometheus fails to start #25

Closed sean-smith closed 1 year ago

sean-smith commented 1 year ago

I noticed the prometheus container is stuck in a restart loop:

[ec2-user@ip-172-31-26-12 ~]$ docker ps
CONTAINER ID   IMAGE                              COMMAND                  CREATED          STATUS                          PORTS     NAMES
0cc5f7d27c63   prom/prometheus                    "/bin/prometheus --c…"   12 minutes ago   Restarting (2) 31 seconds ago             prometheus
8b26a21de381   grafana/grafana                    "/run.sh"                12 minutes ago   Up 12 minutes                             grafana
7740aa93faf7   quay.io/prometheus/node-exporter   "/bin/node_exporter …"   12 minutes ago   Up 12 minutes                             node-exporter
19248efd1648   nginx                              "/docker-entrypoint.…"   12 minutes ago   Up 12 minutes                             nginx
f3d3c110005d   prom/pushgateway                   "/bin/pushgateway"       12 minutes ago   Up 12 minutes                             pushgateway
You have new mail in /var/spool/mail/ec2-user

The container logs show:

[ec2-user@ip-172-31-26-12 ~]$ docker logs 0cc5f7d27c63
ts=2023-04-12T06:34:40.244Z caller=main.go:468 level=error msg="Error loading config (--config.file=/etc/prometheus/prometheus.yml)" file=/etc/prometheus/prometheus.yml err="parsing YAML file /etc/prometheus/prometheus.yml: EC2 SD configuration requires a region"

Looks like prometheus.yml needs the region set in its EC2 service discovery config.
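
For illustration, a minimal sketch of what an EC2 service discovery entry with an explicit region might look like. The job name, region value, and port here are placeholders I'm assuming for the example, not values copied from this repository's config; only the `ec2_sd_configs` and `region` keys come from Prometheus itself.

```yaml
# Sketch only: the job name, region, and port below are placeholders,
# not values taken from this repository's prometheus.yml.
scrape_configs:
  - job_name: ec2_instances        # hypothetical job name
    ec2_sd_configs:
      - region: us-east-1          # assumption: replace with the cluster's region
        port: 9100                 # node_exporter's default port
```

After changing the file, something like `docker restart prometheus` followed by `docker logs prometheus` should show whether the config now loads.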

sean-smith commented 1 year ago

See https://github.com/aws-samples/aws-parallelcluster-monitoring/pull/26