fort-nix / nix-bitcoin

A collection of Nix packages and NixOS modules for easily installing full-featured Bitcoin nodes with an emphasis on security.
https://nixbitcoin.org
MIT License

Add default monitoring configuration #186

Open reardencode opened 4 years ago

reardencode commented 4 years ago

I don't think that nix-bitcoin should go so far as to install a complete monitoring and visualization stack, but some basic service status alerting could be really useful.

One example of how this can be achieved is:

https://gitlab.com/simple-nixos-mailserver/nixos-mailserver/-/blob/master/default.nix#L576

mmilata commented 4 years ago

It would be nice to provide prometheus exporters. Personally, I'd run prometheus + grafana on a separate machine.

jonasnick commented 4 years ago

As far as I know monit can only send alerts via email. That's easy to do from a mail server, but not for us really.

I have a prometheus + grafana setup with lnd. You can change the package used by the lnd module with the package option in order to use an lnd build with the compilation flag enabled. However, the default metrics it exports are not useful; you really want https://github.com/lightninglabs/lndmon too. I'm relatively happy with that setup (though lndmon should export more things). Prometheus queries and alerting through grafana are quite powerful.

Perhaps we should 1) make it easy to enable prometheus exporters and 2) add an option that enables a prometheus/grafana setup that "just works".

mmilata commented 4 years ago

Opened a nixpkgs PR for the bitcoind exporter: https://github.com/NixOS/nixpkgs/pull/89267. It's probably more convenient to have it in nixpkgs because there's already a bunch of infrastructure for exporters that I'm not sure can be used from the outside. It's not hard to make it into a standalone module though, so let me know if there's interest.

nixbitcoin commented 4 years ago

ACK standalone module and "just works" setup

mmilata commented 4 years ago

lndmon package + module: NixOS/nixpkgs#89449. Let me know if you'd like to become comaintainer.

nixbitcoin commented 2 years ago

IMO we should limit the scope and just make it a matrix or mail notification whenever a systemd service goes down.
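
A minimal sketch of what such a limited-scope mail notification could look like, using systemd's `OnFailure=` hook from NixOS. This is only an illustration, not something nix-bitcoin ships: the `notify-failure@` unit name and the recipient address are hypothetical, and it assumes a working `sendmail` wrapper at `/run/wrappers/bin/sendmail` (for example from msmtp or postfix).

```nix
{ ... }:
{
  # Hypothetical template unit: started by systemd with the failed
  # unit's name as the instance (%i), e.g. "bitcoind.service".
  systemd.services."notify-failure@" = {
    description = "Send a mail when %i fails";
    serviceConfig.Type = "oneshot";
    scriptArgs = "%i";
    script = ''
      unit="$1"
      {
        # Placeholder recipient; replace with a real address.
        echo "To: admin@example.com"
        echo "Subject: $unit failed"
        echo
        systemctl status --no-pager --full "$unit" || true
      } | /run/wrappers/bin/sendmail -t  # assumes a configured sendmail wrapper
    '';
  };

  # Trigger the notification whenever bitcoind enters the failed state.
  systemd.services.bitcoind.onFailure = [ "notify-failure@%n.service" ];
}
```

The same `onFailure` line would have to be repeated (or generated) for every service that should be watched.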

joaothallis commented 4 months ago

> IMO we should limit the scope and just make it a matrix or mail notification whenever a systemd service goes down.

Using Prometheus Server + systemd exporter + Alertmanager (Prometheus), we can achieve mail notification with a standard monitoring stack. What do you think?

This stack is already present in NixOS:

With this stack, the project or the user can incrementally enable other exporters, such as the ones mentioned by @mmilata and @jonasnick.
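
A rough sketch of how this stack might be wired together with the NixOS `services.prometheus` options, everything running on one machine. The addresses, the SMTP host, and the `UnitFailed` alert rule are placeholders chosen for illustration, not a tested nix-bitcoin configuration. Real SMTP setups usually also need authentication.

```nix
{ config, ... }:
let
  cfg = config.services.prometheus;
in
{
  # Export systemd unit states as Prometheus metrics.
  services.prometheus.exporters.systemd.enable = true;

  services.prometheus = {
    enable = true;

    # Scrape the local systemd exporter.
    scrapeConfigs = [
      {
        job_name = "systemd";
        static_configs = [
          { targets = [ "127.0.0.1:${toString cfg.exporters.systemd.port}" ]; }
        ];
      }
    ];

    # Send firing alerts to the local Alertmanager.
    alertmanagers = [
      {
        static_configs = [
          { targets = [ "127.0.0.1:${toString cfg.alertmanager.port}" ]; }
        ];
      }
    ];

    # Illustrative rule: fire when any unit stays failed for one minute.
    rules = [
      ''
        groups:
          - name: systemd
            rules:
              - alert: UnitFailed
                expr: systemd_unit_state{state="failed"} == 1
                for: 1m
                annotations:
                  summary: "{{ $labels.name }} has failed"
      ''
    ];

    alertmanager = {
      enable = true;
      configuration = {
        route.receiver = "mail";
        receivers = [
          {
            name = "mail";
            email_configs = [
              {
                # Placeholder addresses and SMTP relay.
                to = "admin@example.com";
                from = "alertmanager@example.com";
                smarthost = "smtp.example.com:587";
              }
            ];
          }
        ];
      };
    };
  };
}
```

Further exporters can then be added by enabling them and appending another entry to `scrapeConfigs`.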

> I have a prometheus + grafana setup with lnd.

@jonasnick do you have Alertmanager configured to send mail using Nix?

jonasnick commented 4 months ago

@joaothallis That sounds like a reasonable stack. I don't have my setup anymore but I used the alerting system built into grafana.

Would you suggest documenting this stack or adding a dedicated module? The advantage of #472 is that it already adds some basic alerting rules.

joaothallis commented 4 months ago

> Would you suggest documenting this stack or adding a dedicated module? The advantage of #472 is that it already adds some basic alerting rules.

I suggest we start by documenting how to use this stack; later we can plan a dedicated module with alerting rules. I will start on the documentation and test it.

> I don't have my setup anymore but I used the alerting system built into grafana.

I have working experience with monitoring and creating alerts using Prometheus with Alertmanager, but I will look into whether we should use Grafana Alerting or Alertmanager.