berthubert / simplomon

Very simple monitoring system with a single configuration file
MIT License
97 stars 12 forks source link

simplomon

Very simple availability monitoring system with a single configuration file and a single binary. Also comes as a Docker image.

Key differences compared to existing systems:

You'd use this if you think "I need to slap some monitoring on this pronto and I can't be bothered to setup something difficult that will require maintenance or get hacked". Another usecase if you want to monitor from a vantage point where you can't install large-scale software.

You'd also use this if you appreciate some of the 'smarter' checks described above.

If you want a full featured complicated monitoring system, there is lots of choice already, and this isn't it. Also, as it stands simplomon won't scale to tens of thousands of checks.

If you miss features that just make sense, do let me know! Open an issue here please.

Important news

If you upgraded on the 12th of April 2024 and were using the SQLite logger, you need to change Logger("db.sqlite") to Logger{filename="db.sqlite"}.

Mission statement

Right now Simplomon is simple and robust. History shows it is a struggle to keep a project lean, but we're going to give it a try. Simple to use however does not imply a lack of powerful features. But it does mean:

As an example, more generic monitoring/logging solutions can also monitor DNS records through regular expressions that have to be just so. In simplomon, we'd rather have a test that out of the box checks common things really well. In other words, we want to break out custom and dynamic rules only for very rare or niche tests.

Thanks to

Many thanks are due to early adopters & contributors:

Sample configuration (without Docker)

Note that the configuration below is completely functional, you need nothing else, except a working Pushover, ntfy or Telegram account, or a mailbox. If you need another notifier, do let me know.

addPushoverNotifier{user="copy this in from pushover config",
        apikey="copy this in from pushover config"}

-- or ntfy.sh:
-- addNtfyNotifier{topic="your_secret_topic"}

-- or point to your own instance, or set the Authorization header through 'auth'
-- addNtfyNotifier{topic="your_topic", url="https://ntfy.example.net", auth="Basic dGVzdHVzZXI6ZmFrZXBhc3N3b3Jk"}

-- or create a Telegram bot, and direct messages to a chat both you and the bot have access to.
-- addTelegramNotifier{bot_id="your bot id",
                       apikey="your api key",
                       chat_id="the chat id"}

-- or email, in "CEO mode": only gets alerts that have been there for an hour
-- addEmailNotifier{from="bert@example.com", to="bert@example.com",
-- server="10.0.0.2", minMinutes = 60}

Pushover appears to work really well, and I'd prefer it to ntfy. Email meanwhile is a bit scary, since it might need the very infrastructure it monitors to send out notifications. You might never get that email.

Here are some sample checkers:

dailyChime{utcHour=10} -- 10AM UTC chime confirms monitoring works

ping{servers={"9.9.9.9", "8.8.8.8"}} -- does our network even work

-- the following checks certificates, and whines if any expire within
-- two weeks. If we have IPv6, and there are AAAA records, check that too
https{url="https://berthub.eu"}

-- save bandwidth, don't fetch the body
https{url="https://galmon.eu/", method="HEAD"}

-- This complains if that URL is older than 20 minutes
https{url="https://berthub.eu/nlelec/dutch-stack.svg", maxAgeMinutes=20}

-- check if the response body contains a regex match
https{url="https://example.org", regex="[Ee]xample [Dd]omain"}

-- check if a specific server IP is serving correctly
https{url="https://berthub.eu", serverIP="86.82.68.237"}
https{url="https://berthub.eu", serverIP="2001:41f0:782d::2"}

-- check if a URL works correctly if resolved via specific nameserver
https{url="https://berthub.eu", dns={"8.8.8.8"}}
-- or from a specific source IP
https{url="https://berthub.eu", dns={"9.9.9.9"], localIP="10.0.0.9"}

-- Check if SOA records are identical
nameservers={"100.25.31.6", "86.82.68.237", "217.100.190.174"}
dnssoa{domain="berthub.eu", servers= nameservers}
dnssoa{domain="hubertnet.nl", servers= nameservers}

-- DNSSEC, check if signatures are fresh enough
rrsig{server="45.55.10.200", name="powerdns.com"}
rrsig{server="188.166.104.87", name="powerdns.com"}
rrsig{server="149.20.2.26", name="isc.org", minDays=10}
rrsig{server="100.25.31.6", name="berthub.eu"}

-- Check if the following ports are closed
scaryports={25, 80, 110, 443, 3000, 3306, 5000, 5432, 8000, 8080, 8888}
tcpportclosed{servers={"100.25.31.6"}, ports=scaryports}

-- Check if DNS is serving what it should be
dns{server="100.25.31.6", name="berthub.eu", type="A",
    acceptable={"86.82.68.237", "217.100.190.174"}}
dns{server="100.25.31.6", name="berthub.eu", type="AAAA",
    acceptable={"2001:41f0:782d::2"}}

-- Does the http redirect work?
httpredir{fromUrl="http://berthub.eu", toUrl="https://berthub.eu/"}

-- And the www redirects?
httpredir{fromUrl="http://www.berthub.eu", toUrl="https://berthub.eu/"}
httpredir{fromUrl="https://www.berthub.eu", toUrl="https://berthub.eu/"}

-- Use the prometheus node exporter:
prometheusExp{url="http://10.0.0.1:9100/metrics", 
checks={{kind="DiskFree", mountpoint="/", gbMin=10},
        {kind="AptPending"},  -- pending security updates
        {kind="Bandwidth", device="enp2s0.9", minMbit=0.1, maxMbit=50},
        {kind="Bandwidth", device="ppp0", minMbit=0.4}
}}

-- mail delivery tests
smtp{server="10.0.0.2", servername="ziggo.berthub.eu", from="bert@hubertnet.nl", 
    to="bert@hubertnet.nl", minCertDays=7}

-- will complain if no 'simplomon test message' messages arrive:
imap{server="10.0.0.2", servername="ziggo.berthub.eu", user="username", 
    password="secret"}

Save this as 'simplomon.conf' and start './simplomon' and you should be in business.

Webserver

If you run Webserver{address="127.0.0.1:8080"}, simplomon will launch a webserver. If you run Simplomon inside a container, you'll probably have to use 0.0.0.0:8080 for things to work.

You can also add user="something", password=... (password in quotes).

This server supports the following three endpoints. The first, '/health' is always enabled. The dashboard and other JSON endpoints only function if you supplied a user name and a password, which will be checked using basic auth.

If you load / in a webserver you get a somewhat nice dashboard with metrics.

Datalogger

If you add Logger{filename="stats.sqlite3"}, simplomon will populate a SQLite database (in the file './stats.sqlite3') with lots of interesting metadata, one table per checker kind.

In addition, all alert reports will be logged in the 'reports' table, even if they did not lead to notifications.

Todo

Docker

There is an image on the Docker hub which you can pull (berthubert/simplomon).

The image will read its configuration file from the HTTPS URL supplied in the SIMPLOMON_CONFIG_URL environment variable.

Do make sure to set Webserver{address="0.0.0.0:8080"} in that configuration file, as otherwise your Docker environment may not recognize that simplomon is running.

To build the image yourself, do:

sudo docker build -f Dockerfile.full-build .

This implements a two-stage builder to create an image.

You can host the image for example on Scaleway's container service. It makes perfect sense to host your monitoring somewhere outside of your own network. Note that Scaleway sadly has no support for outgoing IPv6.

Running Docker on a server

To run the simplomon service on a Linux server it is easy to wrap a little systemd service definition around it.

[Service] TimeoutStartSec=0 Restart=always ExecStartPre=-/usr/bin/docker exec %n stop ExecStartPre=-/usr/bin/docker rm %n ExecStartPre=/usr/bin/docker pull berthubert/simplomon ExecStart=docker run \ -v /etc/simplomon.conf:/simplomon.conf \ --name %n \ berthubert/simplomon

[Install] WantedBy=default.target

* let systemd reload config with `systemctl daemon-reload`
* enable the service with `systemctl enable simplomon.service`
* start the service with `systemctl start simplomon.service`

checking if it has started correctly:
```# systemctl status simplomon.service
● simplomon.service - Simplomon Monitoring Service
     Loaded: loaded (/etc/systemd/system/simplomon.service; enabled; vendor preset: enabled)
     Active: failed (Result: timeout) since Wed 2024-03-13 21:07:11 UTC; 10min ago
    Process: 368586 ExecStartPre=/usr/bin/docker exec simplomon.service stop (code=exited, status=1/FAILURE)
    Process: 368595 ExecStartPre=/usr/bin/docker rm simplomon.service (code=exited, status=0/SUCCESS)
    Process: 368624 ExecStartPre=/usr/bin/docker pull berthubert/simplomon (code=exited, status=0/SUCCESS)
    Process: 368647 ExecStart=/usr/bin/docker run -v /etc/simplomon.conf:/simplomon.conf --name simplomon.service berthubert/simplomon (code=killed, signal=KILL)
   Main PID: 368647 (code=killed, signal=KILL)

check the log with journalctl -u simplomon.service

notes

Compiling natively

On Debian derived systems the following works:

apt install python3-pip pkg-config libnghttp2-dev libssl-dev liblua5.3-dev xxd

In addition, the project requires a recent version of meson, which you can get with 'pip3 install meson ninja' or perhaps 'pip install meson ninja' and only if that doesn't work 'apt install meson'.

The meson in Debian bullseye is very old, and will give you a confusing error message about 'git' if you try it. If you enable bullseye-backports you can do apt install -t bullseye-backports meson and get a working one. Or use the pip version, which is also great.

Then run:

meson setup build
meson compile -C build

Distributing binaries, docker etc

To make a more portable binary, try:

LDFLAGS="-static-libstdc++ -static-libgcc" meson setup build --prefer-static
meson compile -C build/

Or even a fully static one:

LDFLAGS=-static meson setup build --prefer-static -Dbuildtype=release -Dcpp-httplib:cpp-httplib_openssl=disabled -Dcpp-httplib:cpp-httplib_brotli=disabled

meson compile -C build/

Inspiration