jamesoff / simplemonitor

A Python-based network and host monitor
https://simplemonitor.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

running simplemonitor on Ubuntu Xenial #635

Closed 3c2b2ff5 closed 3 years ago

3c2b2ff5 commented 3 years ago

Hi,

I am trying to run simplemonitor on Ubuntu Xenial. I installed Python 3.6 from the deadsnakes PPA, and simplemonitor installed successfully. The ini files currently look like this:

monitors.ini

[host1.local]
type=ping
host=host1.local.domain
tolerance=2

monitor.ini

[monitor]
interval=60

[reporting]
loggers=logfile
alerters=email

[email]
type=email
host=localhost
from=root
to=me@local.domain

[logfile]
type=logfile
filename=monitor.log
only_failures=0
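
A quick way to sanity-check that a file like monitor.ini is well-formed INI is to load it with Python's standard configparser. This is only a sketch: it is not how simplemonitor itself validates its configuration (simplemonitor -t does that), and the helper name parse_ini is made up for illustration.

```python
import configparser

# The monitor.ini content shown above, inlined so the sketch is self-contained.
MONITOR_INI = """\
[monitor]
interval=60

[reporting]
loggers=logfile
alerters=email

[email]
type=email
host=localhost
from=root
to=me@local.domain

[logfile]
type=logfile
filename=monitor.log
only_failures=0
"""


def parse_ini(text: str) -> configparser.ConfigParser:
    """Parse INI text; raises configparser.Error if the text is malformed."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser


config = parse_ini(MONITOR_INI)
print(config.sections())
print(config.getint("monitor", "interval"))
```

A parse error here would point at a syntax problem in the file, but a clean parse says nothing about whether simplemonitor accepts the option names.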

python3

# which python3
/usr/local/bin/python3
# ll /usr/local/bin/python3
lrwxrwxrwx 1 root root 18 Nov 19 09:37 /usr/local/bin/python3 -> /usr/bin/python3.6*

I am now trying to write the systemd unit:

[Unit]
Description=monitoring service
After=network.target

[Service]
WorkingDirectory=/usr/local/lib/python3.6/dist-packages/simplemonitor
ExecStart=/usr/local/bin/simplemonitor
Restart=on-failure
ExecReload=/bin/kill -HUP $MAINPID
User=root
Group=root

[Install]
WantedBy=multi-user.target

I tried ExecStart=/usr/bin/python3.6 /usr/local/lib/python3.6/dist-packages/simplemonitor/monitor.py and ExecStart=/usr/local/bin/python3 /usr/local/lib/python3.6/dist-packages/simplemonitor/monitor.py; neither works, and the service exits with a failure.

With ExecStart=/usr/local/bin/python3 /usr/local/bin/simplemonitor or ExecStart=/usr/local/bin/simplemonitor, the service runs without problems, but it doesn't log anything, so I don't know whether it performs the ping check.

When I run simplemonitor from the CLI, it writes to monitor.log only when I hit Ctrl+C.

Any idea how to fix this?

Thanks

jamesoff commented 3 years ago

Could you share the error you get when running it with systemd?

If you installed it with pip install simplemonitor (which it looks like from the path you're specifying in your unit file), then you should just be able to point ExecStart at the path to the simplemonitor executable which gets installed. I'm not sure where that would be on Xenial though. I'll see about getting a VM up and try it out for you.

The default for the file logger is to be buffered, which is why you don't see the content until the program exits. You can add buffered=0 to the [logfile] entry to make it write unbuffered, and you'll see the output in the file immediately.
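
The general effect can be reproduced with plain Python file handles. The sketch below is not simplemonitor's logger code, just an illustration of why buffered output stays invisible until the file is flushed or closed; the function name visible_bytes_after_write is made up for the example.

```python
import os
import tempfile


def visible_bytes_after_write(buffering: int) -> int:
    """Write one short line and return the file size on disk before close()."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # newline="" keeps the byte count identical on every platform.
        handle = open(path, "w", buffering=buffering, newline="")
        handle.write("host1.local: ok\n")
        size_before_close = os.path.getsize(path)
        handle.close()
        return size_before_close
    finally:
        os.remove(path)


# buffering=-1 is Python's default (buffered); buffering=1 is line-buffered.
print("default buffering:", visible_bytes_after_write(-1))
print("line buffering:   ", visible_bytes_after_write(1))
```

With the default setting the short line is still sitting in memory when the size is checked, while the line-buffered handle has already pushed it to the file.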

3c2b2ff5 commented 3 years ago

Hi James,

Indeed, I set the buffered parameter and it looks great! Thanks. I did install simplemonitor with pip install simplemonitor, so everything is at the defaults on Ubuntu Xenial. With ExecStart=/usr/local/bin/simplemonitor it looks very good.

One more question please: how can I move it to a different location, e.g. /opt/simplemonitor? I get a problem with the relative import statements when I run simplemonitor from the CLI:

:/opt/simplemonitor# simplemonitor 
Traceback (most recent call last):
  File "/usr/local/bin/simplemonitor", line 5, in <module>
    from simplemonitor.monitor import main
  File "/opt/simplemonitor/simplemonitor.py", line 15, in <module>
    from .Alerters.alerter import Alerter
ImportError: attempted relative import with no known parent package

systemd unit

[Unit]
Description=monitoring service
After=network.target

[Service]
WorkingDirectory=/opt/simplemonitor
#WorkingDirectory=/usr/local/lib/python3.6/dist-packages/simplemonitor
ExecStart=/usr/local/bin/simplemonitor
Restart=on-failure
ExecReload=/bin/kill -HUP $MAINPID
User=root
Group=root

[Install]
WantedBy=multi-user.target

systemd failure

:/opt/simplemonitor#  systemctl status simplemonitor.service
● simplemonitor.service - monitoring service
   Loaded: loaded (/etc/systemd/system/simplemonitor.service; enabled; vendor preset: enabled)
   Active: failed (Result: start-limit-hit) since Thu 2020-11-19 11:54:43 CET; 4s ago
  Process: 20934 ExecStart=/usr/local/bin/simplemonitor (code=exited, status=1/FAILURE)
 Main PID: 20934 (code=exited, status=1/FAILURE)

Nov 19 11:54:43 oob01 systemd[1]: simplemonitor.service: Unit entered failed state.
Nov 19 11:54:43 oob01 systemd[1]: simplemonitor.service: Failed with result 'exit-code'.
Nov 19 11:54:43 oob01 systemd[1]: simplemonitor.service: Service hold-off time over, scheduling restart.
Nov 19 11:54:43 oob01 systemd[1]: Stopped monitoring service.
Nov 19 11:54:43 oob01 systemd[1]: simplemonitor.service: Start request repeated too quickly.
Nov 19 11:54:43 oob01 systemd[1]: Failed to start monitoring service.
Nov 19 11:54:43 oob01 systemd[1]: simplemonitor.service: Unit entered failed state.
Nov 19 11:54:43 oob01 systemd[1]: simplemonitor.service: Failed with result 'start-limit-hit'.

Thanks

jamesoff commented 3 years ago

Glad you got it working.

For the move to /opt/simplemonitor, you mean you just want the config etc to be in there, and still run /usr/local/bin/simplemonitor to run the service? That should work, I have similar setups on some of my hosts.

It looks like the output from simplemonitor has scrolled out of systemctl's short history in the status command; could you look in journalctl -u simplemonitor to see if you can see an error from simplemonitor itself about why it's exiting? You can also add -d to the command in the unit file to have simplemonitor produce debug output.

(N.B. not a systemd expert :)

jamesoff commented 3 years ago

I have it working with the setup above (config etc in /opt/simplemonitor) on a Xenial VM. Here's what I did:

  1. Installed Python 3.6 from the deadsnakes PPA. (I installed python3.6-dev.)
  2. Installed pip using https://bootstrap.pypa.io/get-pip.py
  3. Verified pip was the right Python version (3.6), which it was (pip -V)
  4. pip install simplemonitor
  5. mkdir -p /opt/simplemonitor
  6. Copied your config files from above into that dir (I changed the target of the ping to localhost)
  7. Verified the config by (while cd'd into /opt/simplemonitor) running simplemonitor -t -d
  8. Copied your systemd unit file into /opt/simplemonitor/simplemonitor.service
  9. Ran systemctl link /opt/simplemonitor/simplemonitor.service
  10. Ran systemctl start simplemonitor
  11. Checked it with systemctl status simplemonitor and it seemed happy

The monitor.log file in /opt/simplemonitor has the status of the ping monitor in it too.

root@ubuntu:/opt/simplemonitor# ll
total 24
drwxr-xr-x 2 root root 4096 Nov 19 03:26 ./
drwxr-xr-x 3 root root 4096 Nov 19 03:17 ../
-rw-r--r-- 1 root root  202 Nov 19 03:26 monitor.ini
-rw-r--r-- 1 root root  106 Nov 19 03:27 monitor.log
-rw-r--r-- 1 root root   51 Nov 19 03:17 monitors.ini
-rw-r--r-- 1 root root  261 Nov 19 03:25 simplemonitor.service

root@ubuntu:/opt/simplemonitor# cat simplemonitor.service
[Unit]
Description=monitoring service
After=network.target

[Service]
WorkingDirectory=/opt/simplemonitor
ExecStart=/usr/local/bin/simplemonitor -d
Restart=on-failure
ExecReload=/bin/kill -HUP $MAINPID
User=root
Group=root

[Install]
WantedBy=multi-user.target

root@ubuntu:/opt/simplemonitor# systemctl status simplemonitor
● simplemonitor.service - monitoring service
   Loaded: loaded (/opt/simplemonitor/simplemonitor.service; linked; vendor preset: enabled)
   Active: active (running) since Thu 2020-11-19 03:26:51 PST; 1min 47s ago
 Main PID: 10623 (simplemonitor)
   CGroup: /system.slice/simplemonitor.service
           └─10623 /usr/bin/python3.6 /usr/local/bin/simplemonitor -d

Nov 19 03:26:52 ubuntu simplemonitor[10623]: 2020-11-19 03:26:52    DEBUG (simplemonitor) Loop complete
Nov 19 03:27:52 ubuntu simplemonitor[10623]: 2020-11-19 03:27:52    DEBUG (simplemonitor) Running tests
Nov 19 03:27:52 ubuntu simplemonitor[10623]: 2020-11-19 03:27:52    DEBUG (simplemonitor) Starting loop with joblist ['host1.local']
Nov 19 03:27:52 ubuntu simplemonitor[10623]: 2020-11-19 03:27:52    DEBUG (simplemonitor) Trying monitor: host1.local
Nov 19 03:27:52 ubuntu simplemonitor[10623]: 2020-11-19 03:27:52     INFO (simplemonitor) monitor passed: host1.local
Nov 19 03:27:52 ubuntu simplemonitor[10623]: 2020-11-19 03:27:52    DEBUG (simplemonitor) Running recovery
Nov 19 03:27:52 ubuntu simplemonitor[10623]: 2020-11-19 03:27:52    DEBUG (simplemonitor) Running alerts
Nov 19 03:27:52 ubuntu simplemonitor[10623]: 2020-11-19 03:27:52    DEBUG (simplemonitor) notifying alerter email
Nov 19 03:27:52 ubuntu simplemonitor[10623]: 2020-11-19 03:27:52    DEBUG (simplemonitor) Running logs
Nov 19 03:27:52 ubuntu simplemonitor[10623]: 2020-11-19 03:27:52    DEBUG (simplemonitor) Loop complete

Does that help?

3c2b2ff5 commented 3 years ago

The error is not from simplemonitor; I believe it has something to do with the path. I added export PYTHONPATH=$PYTHONPATH:/opt to .bashrc but still get the error:

# simplemonitor 
Traceback (most recent call last):
  File "/usr/local/bin/simplemonitor", line 5, in <module>
    from simplemonitor.monitor import main
  File "/opt/simplemonitor/simplemonitor.py", line 15, in <module>
    from .Alerters.alerter import Alerter
ImportError: attempted relative import with no known parent package

Of course, if the path is removed, the error looks different:

# simplemonitor 
Traceback (most recent call last):
  File "/usr/local/bin/simplemonitor", line 5, in <module>
    from simplemonitor.monitor import main
ModuleNotFoundError: No module named 'simplemonitor'
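
Both tracebacks hinge on which file Python picks for the name simplemonitor: the first shows /opt/simplemonitor/simplemonitor.py being loaded rather than the installed package. A minimal sketch for checking this with the standard library follows; the helper name module_origin is made up for illustration, and it should be run in the same environment (same PYTHONPATH) as the failing command.

```python
import importlib.util
from typing import Optional


def module_origin(name: str) -> Optional[str]:
    """Return the file Python would load for `name`, or None if it cannot be found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Shows the path the import system resolves; None means "No module named ...".
print(module_origin("simplemonitor"))
```

If the printed path points at a stray simplemonitor.py instead of a package directory under dist-packages, that file is shadowing the installed package.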

Python path

# python3.6
Python 3.6.12 (default, Aug 18 2020, 02:08:22) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> for path in sys.path:
...     print(path)
... 

/opt/simplemonitor
/opt
/usr/lib/python36.zip
/usr/lib/python3.6
/usr/lib/python3.6/lib-dynload
/usr/local/lib/python3.6/dist-packages
/usr/lib/python3/dist-packages
>>>

3c2b2ff5 commented 3 years ago

That looks great, but my idea was to have the whole simplemonitor package live in /opt/, not only the config files, like a different installation directory. I don't know if that is possible.

jamesoff commented 3 years ago

Ah I wondered if that was what you wanted :) I think the way to do that is, in /opt/simplemonitor, run pip install -t . simplemonitor, which will install it (and all its dependencies) in there. You could install it in a subdir instead with pip install -t subdir simplemonitor, which will at least stop all the dependencies spamming up the main directory.

You can then run it as /opt/simplemonitor/bin/simplemonitor in the first case, or /opt/simplemonitor/subdir/bin/simplemonitor in the second.

Config files can go in /opt/simplemonitor; set WorkingDirectory=/opt/simplemonitor in your unit file.

jamesoff commented 3 years ago

Example using the "subdir" version, using sm as the subdir:

root@ubuntu:/opt/simplemonitor# pip install -t sm simplemonitor
Collecting simplemonitor
  Using cached simplemonitor-1.10.0-py3-none-any.whl (119 kB)
[...]

root@ubuntu:/opt/simplemonitor# ll
total 20
drwxr-xr-x  3 root root 4096 Nov 19 03:40 ./
drwxr-xr-x  4 root root 4096 Nov 19 03:40 ../
-rw-r--r--  1 root root  202 Nov 19 03:40 monitor.ini
-rw-r--r--  1 root root   51 Nov 19 03:40 monitors.ini
drwxr-xr-x 48 root root 4096 Nov 19 03:40 sm/

root@ubuntu:/opt/simplemonitor# ./sm/bin/simplemonitor -t -d
2020-11-19 03:40:56     INFO (simplemonitor) === SimpleMonitor v1.10.0
2020-11-19 03:40:56     INFO (simplemonitor) Loading main config from monitor.ini
2020-11-19 03:40:56     INFO (simplemonitor) Loading monitor config from monitors.ini
2020-11-19 03:40:56     INFO (simplemonitor) === Loading monitors
2020-11-19 03:40:56     INFO (simplemonitor) Adding ping monitor host1.local: Checking localhost pings within 5 seconds
2020-11-19 03:40:56     INFO (simplemonitor) --- Loaded 1 monitors
2020-11-19 03:40:56     INFO (simplemonitor) === Loading loggers
2020-11-19 03:40:56     INFO (simplemonitor) Adding logfile logger logfile: Writing log file to monitor.log
2020-11-19 03:40:56     INFO (simplemonitor) --- Loaded 1 loggers
2020-11-19 03:40:56     INFO (simplemonitor) === Loading alerters
2020-11-19 03:40:56     INFO (simplemonitor) Adding email alerter email
2020-11-19 03:40:56     INFO (simplemonitor) --- Loaded 1 alerters
2020-11-19 03:40:56  WARNING (simplemonitor) Config test complete. Exiting.

3c2b2ff5 commented 3 years ago

Great! Exactly what I've been looking for. Great tool and great support. Thank you very much.