As a DevOps engineer, I want to set up a new Prometheus and Grafana stack on several Raspberry Pi nodes using Docker containers, so that we can monitor the performance and health of our infrastructure.
I will start by researching the best practices for setting up Prometheus and Grafana on a Raspberry Pi using Docker containers, including hardware requirements and recommended configurations. I will then install Docker on the Raspberry Pi and pull the latest Prometheus and Grafana Docker images. Prometheus and Grafana will each run on a dedicated Raspberry Pi node.
Next, I will configure the Docker containers for Prometheus and Grafana, ensuring that the appropriate settings are in place to scrape metrics from our infrastructure and services. I will also set up persistent storage for the metrics, so that we can maintain a historical view of the performance and health of our devices. I will provision the Raspberry Pi nodes with Ansible.
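As a rough illustration of the scrape settings mentioned above, the following sketch prints a minimal `prometheus.yml` for a list of targets. The function name `render_prometheus_config`, the job name `node`, the 15s interval, and the use of node_exporter on port 9100 are assumptions made for this example, not settings taken from the actual configs repo.

```shell
#!/usr/bin/env bash
# Sketch: print a minimal prometheus.yml for a list of scrape targets.
# Job name, interval and port 9100 (node_exporter default) are assumptions.

render_prometheus_config() {
  local interval="$1"; shift
  printf 'global:\n'
  printf '  scrape_interval: %s\n' "$interval"
  printf 'scrape_configs:\n'
  printf '  - job_name: node\n'
  printf '    static_configs:\n'
  printf '      - targets:\n'
  local target
  for target in "$@"; do
    # One YAML list entry per host:port pair.
    printf "          - '%s'\n" "$target"
  done
}

# Example: scrape the monitoring node plus the local host.
render_prometheus_config 15s monitoring.fritz.box:9100 localhost:9100
```

Redirecting the output into a file that is mounted at `/etc/prometheus/prometheus.yml` in the container would be one way to use it; keeping the target list as arguments makes it easy to generate a different file per stack.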
Once the containers are up and running, I will set up alerts and notifications for critical issues, and create dashboards that visualize the most important metrics.
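A minimal sketch of what such an alert could look like, assuming a single rule that fires when a scrape target has been down for five minutes. The helper name `write_alert_rules`, the group name, the rule name, and the thresholds are hypothetical; only the rule file layout (`groups`, `rules`, `alert`, `expr`, `for`) follows the Prometheus rule file format.

```shell
#!/usr/bin/env bash
# Sketch: write a Prometheus alerting rule file with one example rule.
# Rule name, duration and severity label are illustrative assumptions.

write_alert_rules() {
  local rules_file="$1"
  # The quoted heredoc delimiter keeps {{ $labels.instance }} literal,
  # so Prometheus (not the shell) expands it when the alert fires.
  cat > "$rules_file" <<'EOF'
groups:
  - name: node-health
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is unreachable"
EOF
}
```

Calling `write_alert_rules ./alert.rules.yml` and listing that file under `rule_files:` in `prometheus.yml` would load the rule. Delivering the notification itself would need an Alertmanager or a comparable receiver, which this sketch does not cover.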
After the setup is complete, I will perform thorough testing and validation to make sure the stack is working as expected. I will also document the setup process, including all configurations and scripts, so that future DevOps engineers can easily repeat the process if necessary.
With a new Prometheus and Grafana stack in place on our Raspberry Pi, we will have a centralized, real-time view of the performance and health of our infrastructure, enabling us to quickly identify and resolve any issues before they become a problem.
[x] Change from ubuntu to hostname of choice manually after OS is installed (do not setup ssh keys yet!)
[x] Create user sebastian
[x] Setup ssh key auth from caprica via jarvis
[x] Setup ssh key auth from kobol via jarvis
[x] Update jarvis
[x] Document manual setup steps (if any)
[x] sebastian-sommerfeld-io/configs#12
[x] Run docker-compose stack on RasPi nodes (connect via ssh)
[x] First merge all changes from branch into main
[x] Make sure configs repo is cloned to RasPi node
[x] Setup from scratch as supervisor.fritz.box
[x] create branch "supervisor.fritz.box"
[x] Change from ubuntu to hostname of choice manually after OS is installed (do not setup ssh keys yet!)
[x] Create user sebastian
[x] Setup ssh key auth from caprica via jarvis
[x] Setup ssh key auth from kobol via jarvis
[x] Document manual setup steps (if any)
[x] sebastian-sommerfeld-io/configs#12
[x] Setup Config for Docker Compose, Prometheus, Grafana
[x] Setup supervisor-stack to monitor monitoring.fritz.box
[x] Supervisor collects its own OS-Metrics ... Do not connect to the main monitoring stack. That way I can tear down the
[x] Run docker-compose stack on RasPi nodes (connect via ssh)
[x] First merge all changes from branch into main
[x] Make sure configs repo is cloned to RasPi node supervisor
[x] sebastian-sommerfeld-io/configs#102
[x] Allow execution on RasPi (ARM) and my local dev workstation (x86) without code duplication
[x] No custom Docker image because of ARM.
[x] Use docker compose
[x] Tear down old Prometheus RasPi.
[x] Turn docs/modules/ROOT/partials/homelab/workstations/common/raspi-rack.adoc into diagram (ditaa :grey_question:)
[x] Create new Tag + Release: v0.3.0
[x] sebastian-sommerfeld-io/configs#100 and supervisor.fritz.box
[ ] Observe stability ... Best guess: something within the network does not work (DNS??) ... RasPi is not reachable. Flush the DNS cache of the FritzBox? Or test a completely new hostname (one that never existed on my network)? Or assign a fixed IP on the FritzBox?
[x] sebastian-sommerfeld-io/configs#89
[ ] Push nginx error logs to Loki
[ ] Make Prometheus data persistent? To ensure that data is still visible after (container) restart! both stacks!
[ ] Run Prometheus as non-root user (current user from host)
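For the two open Prometheus items above (persistent data and running as the current host user), a minimal sketch could look like this. It uses plain `docker run` arguments for brevity; in a Compose file the same idea would map to the `user:` and `volumes:` keys. The function name `prometheus_run_args` and the host directory are hypothetical, and the sketch assumes that the official `prom/prometheus` image keeps its TSDB under `/prometheus`, which should be verified against the image in use.

```shell
#!/usr/bin/env bash
# Sketch: build docker run arguments for a Prometheus container that
# runs as a given uid:gid and stores its data in a host directory.
# Assumption: the image writes its TSDB to /prometheus.

prometheus_run_args() {
  local data_dir="$1" uid="$2" gid="$3"
  # One argument per line so callers can read them into an array.
  printf '%s\n' \
    --detach \
    --name prometheus \
    --user "${uid}:${gid}" \
    --publish 9090:9090 \
    --volume "${data_dir}:/prometheus" \
    prom/prometheus:latest
}

# Usage (not executed here; the data directory must be writable by the user):
#   mkdir -p "$HOME/prometheus-data"
#   mapfile -t args < <(prometheus_run_args "$HOME/prometheus-data" "$(id -u)" "$(id -g)")
#   docker run "${args[@]}"
```

Because the data directory lives on the host, it survives container restarts and re-creation. Creating it as the same user before the first start avoids permission errors when the container is no longer running as the image's default user.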
Enhancements
[x] Display PlantUML diagram in generated docs file (src/main/homelab/services-cli-monitoring.sh)
[x] Dependabot for docker-compose - :zap: NOT SUPPORTED
[ ] Send Messages to Google Chat (new Channel) for monitoring and logging events (whatever the thresholds might be)
[ ] Display Github Actions metrics somehow
[ ] Create new Tag + Release: v0.3.1
[ ] Replace docker run --rm mwendler/figlet:latest 'Ansible CLI' and all other occurrences with another image that provides the same functionality and also supports ARM (don't forget Ansible bashrc config)
[x] Channel requests through an nginx (for both RasPi nodes)
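Several items above depend on the same scripts running on both the RasPi nodes (ARM) and an x86 workstation. One possible way to avoid duplicating code is to derive the Docker platform string from `uname -m` at runtime. The function name `docker_platform_for` is hypothetical, and the mapping only covers the machine types assumed to be relevant here.

```shell
#!/usr/bin/env bash
# Sketch: map the machine type reported by `uname -m` to a Docker
# platform string, so one script can serve ARM and x86 hosts.

docker_platform_for() {
  case "$1" in
    x86_64|amd64)  echo "linux/amd64" ;;
    aarch64|arm64) echo "linux/arm64" ;;
    armv7l)        echo "linux/arm/v7" ;;
    *)
      echo "unsupported machine type: $1" >&2
      return 1
      ;;
  esac
}

# Detect the platform of the current host.
if platform="$(docker_platform_for "$(uname -m)")"; then
  echo "Detected platform: ${platform}"
fi
```

The result can be passed to `docker run --platform "$platform" ...`, which fails early when an image does not publish a variant for that platform. That makes it a quick check for candidate replacement images.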