jamesoff / simplemonitor

A Python-based network and host monitor
https://simplemonitor.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Potentially Slow Startup? #347

Closed Rustymage closed 4 years ago

Rustymage commented 4 years ago

Hi,

Me again, big fan of this software but a little limited on talent when it comes to some issues.

When running simplemonitor as a service, it works as expected but takes a good few minutes to send alerts on monitors I know are offline.

Where can I start with diagnosing any issues? The log doesn't appear to have any errors present. I will upload once I'm home for exact timings.

Pointers are appreciated. I can upload my installed packages later if that's of any help (they shouldn't have changed since my last issue, the https warning).

Cheers

jamesoff commented 4 years ago

Hi!

This is probably a factor of a couple of things:

If you're happy sharing your config (redact anything sensitive!) and the output from running it in debug mode (-d or --log-level=DEBUG), I can look to see why it may seem slow.

I also had some thoughts around alerting for things which are broken when SimpleMonitor starts up in my comment in #345. (I can also report that I have a branch which supports reloading the config on SIGHUP, but I'm not sure it's ready for primetime yet :)

Rustymage commented 4 years ago

Debug code below:

I think it's running as expected and the Slack notifications are slow...

When starting it as a service it does appear slow, but not in the debug run below...

~/jamesoff-simplemonitor-abbfb03 $ python monitor.py -d
SimpleMonitor v1.7
--> Loading main config from monitor.ini
--> Loading monitor config from monitors.ini
Adding host monitor RouterPi: checking host 192.168.1.130 is pingable
Adding http monitor Pihole_page: Checking that accessing http://192.168.1.130/admin/index.php returns HTTP/200 OK
Adding http monitor WakeonLan_page: Checking that accessing https://URL.co.uk returns HTTP/200 OK
Adding http monitor Ombi: Checking that accessing https://URL.co.uk returns HTTP/200 OK
Adding host monitor ZeroPi: checking host 192.168.1.3 is pingable
Adding host monitor VPNPi: checking host 192.168.1.116 is pingable
Adding host monitor ABServer: checking host 192.168.1.125 is pingable
Adding host monitor SabreServer: checking host 192.168.1.115 is pingable
Adding host monitor IntelNuc: checking host 192.168.1.102 is pingable
Adding host monitor TorrentPi: checking host 192.168.1.23 is pingable
--> Loaded 10 monitors.

Adding slack alerter slack
()
--> Starting... (loop runs every 900s) Hit ^C to stop
('\nStarting loop:', ['WakeonLan_page', 'Pihole_page', 'RouterPi', 'Ombi', 'IntelNuc', 'ABServer', 'ZeroPi', 'TorrentPi', 'SabreServer', 'VPNPi'])
Trying: WakeonLan_page
Fail: WakeonLan_page (Got status '502 Bad Gateway' instead of [200])
Trying: Pihole_page
Passed: Pihole_page
Trying: RouterPi
Passed: RouterPi
Trying: Ombi
Passed: Ombi
Trying: IntelNuc
Passed: IntelNuc
Trying: ABServer
Fail: ABServer (Command '['ping', '-c1', '-W5', '192.168.1.125']' returned non-zero exit status 1)
Trying: ZeroPi
Passed: ZeroPi
Trying: TorrentPi
Passed: TorrentPi
Trying: SabreServer
Passed: SabreServer
Trying: VPNPi
Passed: VPNPi
()
WakeonLan_page(default) -> slack(['default'])
  - Notifying alerter: slack
Pihole_page(default) -> slack(['default'])
  - Notifying alerter: slack
RouterPi(default) -> slack(['default'])
  - Notifying alerter: slack
Ombi(default) -> slack(['default'])
  - Notifying alerter: slack
IntelNuc(default) -> slack(['default'])
  - Notifying alerter: slack
ABServer(default) -> slack(['default'])
  - Notifying alerter: slack
ZeroPi(default) -> slack(['default'])
  - Notifying alerter: slack
TorrentPi(default) -> slack(['default'])
  - Notifying alerter: slack
SabreServer(default) -> slack(['default'])
  - Notifying alerter: slack
VPNPi(default) -> slack(['default'])
  - Notifying alerter: slack
~/jamesoff-simplemonitor-abbfb03 $ cat monitor.ini 
[monitor]
interval=900
monitors=monitors.ini

[reporting]
#loggers=logfile
alerters=slack

[logfile]
type=logfile
filename=monitor.log
buffered=1
only_failures=1

[slack]
type=slack
url=https://hooks.slack.com/services/XXXXX/XXXXXXX/XXXXXX
limit=2

#[db]
#type=db
#db_path=/var/www/html/database/monitor.db

#[dbstatus]
#type=dbstatus
#db_path=/var/www/html/database/monitorstatus.db

#[db1]
#type=db
#db_path=/var/www/html/database/Monitor1.db
~/jamesoff-simplemonitor-abbfb03 $ cat monitors.ini 
[RouterPi]
type=host
host=192.168.1.130

[Pihole_page]
type=http
url=http://192.168.1.130/admin/index.php

[WakeonLan_page]
type=http
url=https://URL.co.uk
#verify_hostname=false

[Ombi]
type=http
url=https://URL.co.uk
#verify_hostname=false

[ZeroPi]
type=host
host=192.168.1.3

[VPNPi]
type=host
host=192.168.1.116

[ABServer]
type=host
host=192.168.1.125

[SabreServer]
type=host
host=192.168.1.115

[IntelNuc]
type=host
host=192.168.1.102

#[Test]
#type=command
#command=/opt/vc/bin/vcgencmd measure_temp

[TorrentPi]
type=host
host=192.168.1.23

#[Jackett Indexer]
#type=host
#host=http://192.168.1.23:9117/UI/Dashboard
/etc/systemd/system $ cat simplemonitor.service 
[Service]
Type=simple
ExecStart=/usr/bin/python monitor.py
WorkingDirectory=/home/pi/jamesoff-simplemonitor-abbfb03
Restart=always
StandardOutput=syslog
StandardError=syslog
User=pi

[Install]
WantedBy=multi-user.target

Cheers

jamesoff commented 4 years ago

> I think it's running as expected and slack notifications are slow...

> --> Starting... (loop runs every 900s) Hit ^C to stop

> [slack]
> type=slack
> url=https://hooks.slack.com/services/XXXXX/XXXXXXX/XXXXXX
> limit=2

Yeah, your Slack alerter needs two failures before it will trigger, and you're only running one loop every 15 mins, so that's 30 mins from start before the alert :)
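
For anyone wanting to check the arithmetic with their own settings, here is a rough sketch. The function name `alert_delay_bounds` is made up for illustration and is not part of SimpleMonitor. It assumes one check per loop and that the alerter fires on the loop where the consecutive-failure count reaches `limit`. Whether the first check counts from the moment of startup or from one full interval later is also an assumption, so the sketch returns both bounds.

```python
from typing import Tuple


def alert_delay_bounds(interval_s: int, limit: int) -> Tuple[int, int]:
    """Return (earliest, latest) seconds from startup until an alert could fire.

    Hypothetical helper, not SimpleMonitor code. Assumes the monitor is already
    failing at startup and keeps failing on every loop.
    """
    if interval_s <= 0 or limit < 1:
        raise ValueError("interval_s must be positive and limit must be at least 1")
    # Earliest case: the first check runs immediately at startup, so only
    # (limit - 1) further intervals are needed to reach the failure limit.
    earliest = (limit - 1) * interval_s
    # Latest case: every failure, including the first, is counted one full
    # interval apart, measured from startup.
    latest = limit * interval_s
    return earliest, latest


# Values from the config in this thread: interval=900, limit=2
earliest, latest = alert_delay_bounds(interval_s=900, limit=2)
print(f"{earliest // 60} to {latest // 60} minutes")  # prints "15 to 30 minutes"
```

Under these assumptions, lowering `interval` in monitor.ini or setting `limit=1` on the alerter would shorten the wait, at the cost of more frequent checks or noisier alerts.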

Rustymage commented 4 years ago

Thanks for the clarification!