mclarkson / nagrestconf

Nagios REST Interface

Installed synagios on Synology 1513+ DSM 5.0 - nothing shows #19

Closed philhu closed 10 years ago

philhu commented 10 years ago

It says it completed, and I see the icon. When I click it, the window is blank, then after 45 secs I get a 'not responding' message.

I do not see any running daemon when I ssh in.

I do not see a log file to see why it failed. Everything in pkg says installed correctly.

The directory /var/packages/Synagios has an etc link to /usr/syno/etc/packages/Synagios. The directory there is empty.

How can I help debug this?

Some things I looked at:

root@DS1513 /var/packages/Synagios/scripts# ./start-stop-status status
./start-stop-status: cd: line 130: can't cd to /nagios-chroot
chroot: can't execute '/etc/init.d/npcd': No such file or directory
chroot: can't execute '/etc/init.d/apache2': No such file or directory
chroot: can't execute '/etc/init.d/cron': No such file or directory
chroot: can't execute '/etc/init.d/rsyslog': No such file or directory
./start-stop-status: line 130: can't create : nonexistent directory

root@DS1513 /var/packages/Synagios/scripts# cd /usr/syno/etc/packages/Synagios/
root@DS1513 /usr/syno/etc/packages/Synagios# ls
root@DS1513 /usr/syno/etc/packages/Synagios#

mclarkson commented 10 years ago

It sounds like it didn't manage to unpack the files. The files should be in '/volume1/@appstore/Synagios/', with the following files/directories inside:

Synagios@ application.cfg config images/ nagios-chroot/ redirect.cgi*

If the files are there then you should be able to 'chroot /volume1/@appstore/Synagios/nagios-chroot'

Check the disk space with 'df -h /volume1', how much is free? However the installer should tell you if there is not enough space. Fully unpacked it takes up 412MB. Check the size of the directories, mine looks like this:

DiskStation> du -hsc /volume1/@appstore/Synagios/*
0 /volume1/@appstore/Synagios/Synagios
4.0K /volume1/@appstore/Synagios/application.cfg
4.0K /volume1/@appstore/Synagios/config
116.0K /volume1/@appstore/Synagios/images
625.1M /volume1/@appstore/Synagios/nagios-chroot
4.0K /volume1/@appstore/Synagios/redirect.cgi
625.2M total
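As a rough sketch of the check described above, the following POSIX shell function reports which of the expected entries are missing from the install directory. The function name `check_synagios_install` is hypothetical; the default path and the entry names are taken from the listing above.

```shell
# Hypothetical helper: report expected Synagios entries that are missing.
# $1 (optional): install directory; defaults to the path quoted in this thread.
check_synagios_install() {
    base=${1:-/volume1/@appstore/Synagios}
    missing=0
    for entry in application.cfg config images nagios-chroot redirect.cgi; do
        if [ ! -e "$base/$entry" ]; then
            echo "missing: $entry"
            missing=1
        fi
    done
    # Exit status 0 means every expected entry was found.
    return "$missing"
}
```

Running `check_synagios_install` with no arguments on the DiskStation prints nothing if the unpack looks complete; any `missing:` line suggests the unpack did not finish.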

Ensure you have the latest Synagios package, 'synagios_0.13_x86.spk'.

Try reinstalling. Maybe the unpack failed silently somehow, but stop other services before installing then restart them afterwards.

philhu commented 10 years ago

Can I uninstall and try again? How would I uninstall?

I think the problem was that I tried to do the manual install remotely, uploading the spk file from my local browser to the machine in my house through the DS1513 GUI.

There was no way it sent 143M to the machine in my house in 35 seconds, so I will try tonight from scratch with local lan machines.

mclarkson commented 10 years ago

If it shows in the package manager then you should be able to uninstall it using the package manager.

It's a standard Synology package so follows the same rules.

pekholm commented 10 years ago

I have the same exact problem and symptom as philhu.

I'm running a DS411+ on DSM 5.0-4482 with 800GB available. The package installed was 'synagios_0.13_x86.spk'.

The total install size I have is smaller than what mclarkson had. See below:

du -hsc /volume1/@appstore/Synagios/*
4.0K /volume1/@appstore/Synagios/application.cfg
4.0K /volume1/@appstore/Synagios/config
116.0K /volume1/@appstore/Synagios/images
455.7M /volume1/@appstore/Synagios/nagios-chroot
4.0K /volume1/@appstore/Synagios/redirect.cgi
455.8M total

mclarkson commented 10 years ago

Maybe there's been a change in a recent DSM update. I'll uninstall mine later and try to reinstall. I've found my diskstation is becoming more sensitive under load recently. Whereas I could run symform a few months ago, now I can't without it consuming all my resources and affecting cloudstation etc, and after a couple weeks of trying I've had to uninstall it. Anyway, that's maybe related, maybe not but I'll give it a try tonight.

Thanks very much for your reports.

mclarkson commented 10 years ago

I just tried installing on a DS112 and a DS412+ and I had no problems, also using DSM 5.0-4482. I have no idea what is wrong.

Did you try the 'chroot /volume1/@appstore/Synagios/nagios-chroot' command? If it works do a 'ps axf' to make sure - 'ps axf' won't work outside of the chroot.

If that worked then your box is probably overloaded. Stop all other services, then stop and restart synagios, then click the Synagios icon.

Note that you can't tell if the box is overloaded by looking at the CPU and memory usage in the DSM GUI. Most likely the load and wait times are high, use 'top'.
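Besides reading it off 'top', the I/O wait can be estimated from two samples of the aggregate `cpu` line in /proc/stat, where the fifth value after the `cpu` label is the iowait counter. The sketch below assumes a Linux /proc/stat layout; the function name `iowait_pct` is hypothetical and is not part of Synagios.

```shell
# Hypothetical helper: percentage of CPU time spent in iowait between two
# samples of the aggregate "cpu" line from /proc/stat.
# $1: first sample, $2: second sample (taken a few seconds later).
iowait_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            # Sum user..steal (fields 2 to 9) where present; field 6 is iowait.
            total = 0
            for (i = 2; i <= NF && i <= 9; i++) total += $i
            t[NR] = total
            w[NR] = $6
        }
        END {
            d = t[2] - t[1]
            if (d > 0) printf "%.1f\n", 100 * (w[2] - w[1]) / d
            else print "0.0"
        }'
}

# Example use on the DiskStation (5 second window):
#   a=$(grep '^cpu ' /proc/stat); sleep 5; b=$(grep '^cpu ' /proc/stat)
#   iowait_pct "$a" "$b"
```

A persistently high figure here would support the "overloaded" theory even when the DSM GUI shows modest CPU and memory usage.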

pekholm commented 10 years ago

as root on the box:

BusyBox v1.16.1 (2014-04-18 02:33:51 CST) built-in shell (ash)
Enter 'help' for a list of built-in commands.

Loke> chroot /volume1/@appstore/Synagios/nagios-chroot
ash: warning: setlocale: LC_ALL: cannot change locale (enUS.utf8)
root@Loke:/# ps axf
PID TTY STAT TIME COMMAND
2 ? S 0:00 [kthreadd]
3 ? S 0:11 [ksoftirqd/0]
5 ? S 0:00 [kworker/u:0]
6 ? S 0:10 [migration/0]
7 ? S 0:03 [migration/1]
9 ? S 0:10 [ksoftirqd/1]
11 ? S 0:10 [migration/2]
13 ? S 0:10 [ksoftirqd/2]
14 ? S 0:11 [migration/3]
16 ? S 0:10 [ksoftirqd/3]
17 ? S< 0:00 [khelper]
18 ? S 0:00 [kworker/u:1]
142 ? S 0:00 _ [syncsupers]
144 ? S 0:00 [bdi-default]
145 ? S< 0:00 [kintegrityd]
146 ? S< 0:00 [kblockd]
251 ? S< 0:00 _ [atasff]
259 ? S< 0:00 [md]
367 ? S< 0:00 [rpciod]
524 ? S< 0:46 [kswapd0]
528 ? S 0:00 _ [fsnotifymark]
534 ? S< 0:00 [nfsiod]
2874 ? S< 0:00 _ [iscsieh]
2957 ? S 0:00 [scsi_eh0]
2968 ? S 0:00 [scsi_eh1]
2977 ? S 0:00 [scsi_eh2]
2988 ? S 0:00 [scsi_eh3]
2998 ? S 0:00 [scsi_eh4]
3005 ? S 0:00 [scsi_eh5]
3414 ? S 0:00 [scsi_eh6]
3604 ? S 0:16 [md0raid1]
3627 ? S 0:04 [md1raid1]
3681 ? S 0:00 [flush-9:0]
3717 ? D 0:07 [jbd2/md0-8]
3718 ? S< 0:00 [ext4-dio-unwrit]
5097 ? S 0:00 [khubd]
5107 ? S 0:00 [kethubd]
5272 ? S< 0:00 [crypto]
5640 ? S 0:00 [ecryptfs-kthrea]
6148 ? S 0:43 _ [md2raid5]
6226 ? S 0:31 [md3raid5]
6378 ? S< 0:00 [kdmflush]
6877 ? S 0:16 [jbd2/dm-0-8]
6879 ? S< 0:00 [ext4-dio-unwrit]
6895 ? S 0:12 [flush-253:0]
8747 ? S 0:00 [scsi_eh7]
8779 ? S 0:00 [usb-storage]
8813 ? S 0:00 _ [scsi_eh8]
8826 ? S 0:16 [usb-storage]
24240 ? S 0:00 [kworker/0:1]
31334 ? S 0:00 [kworker/2:0]
32269 ? S 0:00 [kworker/1:2]
1888 ? S 0:00 [kworker/2:2]
1921 ? S 0:00 [kworker/3:2]
3112 ? S 0:00 [kworker/0:2]
3179 ? S 0:00 [kworker/1:1]
7300 ? S 0:00 [kworker/3:0]
10022 ? S 0:00 [kworker/1:0]
1 ? Ss 0:05 /sbin/init
3738 ? Ssl 0:00 /usr/sbin/syslog-ng -F
5000 ? SNs 0:00 /usr/syno/bin/synologrotated
5005 ? Ss 0:00 /usr/sbin/crond
5046 ? Ss 0:04 /usr/syno/sbin/dbus-daemon --session --fork --print-address
5053 ? Ss 0:00 /usr/syno/sbin/dbus-daemon --system --nopidfile
5124 ? Ss 0:05 /usr/syno/sbin/synologarchd -f
5131 ? Ss 0:00 /usr/syno/sbin/sshd
2354 ? Ss 0:00 sshd: admin [priv]
2378 ? S 0:00 | sshd: admin@pts/3
2379 ? Ss+ 0:00 | -sh
10035 ? Ss 0:00 sshd: root@pts/1
10041 ? Ss 0:00 -ash
10106 ? S 0:00 /bin/ash -i
10301 ? R+ 0:00 ps -axf
6004 ? Ss 0:04 /usr/sbin/ntpd -p /var/run/ntpd.pid -g
6769 ? S<s 0:00 /usr/syno/bin/findhostd
7102 ? Ssl 5:51 scemd
10758 ttyS0 Ss+ 0:00 /sbin/getty 115200 console
11413 ? S 0:00 /usr/syno/sbin/cnid_metad -l log_error
11431 ? SNs 0:00 /usr/syno/sbin/synomkflvd
11433 ? SNs 0:00 /usr/syno/sbin/synomkthumbd
11473 ? S 0:00 avahi-daemon: running [Loke.local]
11474 ? Ss 0:16 /usr/syno/sbin/hotplugd
11477 ? S 0:04 /usr/syno/sbin/afpd -g guest -c 256 -n Loke AFPServer -l default log_erro
11512 ? Ss 0:03 /usr/syno/bin/synobackupd
11515 ? Ss 0:00 /usr/syno/bin/imgbackupd
11527 ? Ss 0:04 /usr/syno/sbin/nmbd -D
11539 ? Ss 0:00 /usr/sbin/inetd
11565 ? Ss 0:03 /usr/bin/httpd
11580 ? S 0:00 /usr/bin/httpd
11581 ? Sl 0:00 /usr/bin/httpd
11677 ? S 0:00 /usr/bin/postgres -D /var/services/pgsql
11857 ? Ss 0:00 postgres: checkpointer process
11858 ? Ss 0:00 postgres: writer process
11859 ? Ss 0:00 postgres: wal writer process
12210 ? Ss 0:00 postgres: postgres synolog [local] idle
14797 ? Ss 0:00 postgres: postgres download [local] idle
15022 ? Ss 0:00 postgres: postgres download [local] idle
11679 ? S 2:21 /usr/syno/sbin/snmpd -Ln -c /usr/syno/etc/snmpd.conf -p /var/run/snmpd.pi
11866 ? Ss 0:00 /usr/syno/sbin/smbd -F
12160 ? S 0:00 /usr/syno/sbin/smbd -F
25916 ? S 0:00 /usr/syno/sbin/smbd -F
11871 ? Ss 0:00 /usr/syno/sbin/cupsd -C /usr/local/cups/cupsd.conf
12046 ? SNs 0:00 /usr/syno/sbin/fileindexd
12156 ? SNs 0:00 /usr/syno/sbin/synoindexd
15287 ? SN 0:00 /usr/syno/sbin/synoindexscand
15288 ? SN 0:00 /usr/syno/sbin/synoindexworkerd
15289 ? SN 0:00 /usr/syno/sbin/synoindexplugind
15290 ? SN 0:00 /usr/syno/sbin/synomediaparserd
12209 ? Ss 0:00 /usr/syno/sbin/synologd
12223 ? SNl 0:00 /usr/syno/bin/isccore
12443 ? SN 0:00 /usr/syno/bin/iss
12482 ? Ss 0:17 /usr/syno/sbin/synosnmpcd
12555 ? S<s 0:03 /usr/bin/httpd -DSSL -DSPDY -f /etc/httpd/conf/httpd.conf-sys
12558 ? S 0:46 /usr/bin/httpd -DSSL -DSPDY -f /etc/httpd/conf/httpd.conf-sys
12609 ? S<l 4:27 _ /usr/bin/httpd -DSSL -DSPDY -f /etc/httpd/conf/httpd.conf-sys
12587 ? Ss 0:01 /usr/syno/sbin/minissdpd -i eth0
13857 ? SNs 139:51 /bin/ntfs-3g -o uid=1024,gid=100,bigwrites /dev/sdq1 /volumeUSB1/usbshar
14791 ? Ss 0:00 /var/packages/DownloadStation/target/sbin/scheduler
14849 ? Sl 0:35 /var/packages/Plex Media Server/target/Plex Media Server
14903 ? SNl 5:19 Plex Plug-in [com.plexapp.system] /volume1/Plex/Library/Application S
15379 ? Sl 1:05 /volume1/@appstore/Plex Media Server/Plex DLNA Server
15012 ? Ssl 0:36 /var/packages/DownloadStation/target/sbin/transmissiond
15093 ? Sl 9:47 /var/packages/JavaManager/target/Java/jre/bin/java -jar lib/ace.jar start
15190 ? Sl 13:22 bin/mongod --dbpath /volume1/@appstore/UniFi/data/db --port 27117 --l
8764 ? Sl 0:00 /usr/sbin/rsyslogd -c5
8802 ? Ss 0:00 /usr/sbin/cron
8817 ? Ss 0:00 /usr/sbin/apache2 -k start
8828 ? S 0:00 /usr/sbin/apache2 -k start
8829 ? S 0:00 /usr/sbin/apache2 -k start
8830 ? S 0:00 /usr/sbin/apache2 -k start
8831 ? S 0:00 /usr/sbin/apache2 -k start
8832 ? S 0:00 /usr/sbin/apache2 -k start
9807 ? S 0:00 /usr/sbin/apache2 -k start
9836 ? S 0:00 /usr/sbin/apache2 -k start
9837 ? S 0:00 /usr/sbin/apache2 -k start
9840 ? S 0:00 /usr/sbin/apache2 -k start
9841 ? S 0:00 /usr/sbin/apache2 -k start
8827 ? S 0:00 /usr/sbin/npcd -d -f /etc/pnp4nagios/npcd.cfg
root@Loke:/#

I don't think the box is overloaded, the top line of 'top' shows:

top - 22:18:00 up 23:36, 0 users, load average: 1.47, 1.43, 1.44

What else could be wrong?

Thanks for all the help!

mclarkson commented 10 years ago

Strictly speaking it is overloaded (though that is about right for a diskstation), but I need to see the CPU wait, as that affects the diskstation the most. Anyway, it looks like it might be running. Paste the following line into a shell on the diskstation, not in the chroot (sorry it's so long but I didn't have time to reduce it!):

ls -ld /proc/[0-9]*/cwd 2>/dev/null | grep Synagios | grep -v $$ | while read a; do echo "$a" | sed -n 's#.*/proc/\([0-9]*\).*#\1#p'; done | while read a; do ps | grep "$a" | grep -v grep; done

which, on my running synagios, produces:

11858 33 31372 S /usr/sbin/apache2 -k start
12245 nzbget 13412 S N /usr/sbin/nagios3 -d /etc/nagios3/nagios.cfg
5826 root 28852 S /usr/sbin/rsyslogd -c5
5862 root 2232 S /usr/sbin/cron
5874 root 31340 S /usr/sbin/apache2 -k start
5881 nzbget 18596 S /usr/sbin/npcd -d -f /etc/pnp4nagios/npcd.cfg
5883 33 31732 S /usr/sbin/apache2 -k start
5884 33 31564 S /usr/sbin/apache2 -k start
5885 33 33008 S /usr/sbin/apache2 -k start
5886 33 31708 S /usr/sbin/apache2 -k start
5887 33 33008 S /usr/sbin/apache2 -k start
6370 33 31704 S /usr/sbin/apache2 -k start
6409 33 31552 S /usr/sbin/apache2 -k start

Basically everything running in the chroot.
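A shorter way to get the same list of PIDs is to read each `/proc/<pid>/cwd` link directly. This is only a sketch, assuming `readlink` is available (BusyBox provides an applet of that name); the function name `pids_with_cwd_under` and its optional second argument are hypothetical and exist mainly to make the function easy to try out.

```shell
# Hypothetical helper: print the PIDs whose working directory contains $1.
# $1: substring to look for in the cwd target, e.g. "Synagios".
# $2 (optional): proc directory to scan; defaults to /proc.
pids_with_cwd_under() {
    pattern=$1
    procdir=${2:-/proc}
    for link in "$procdir"/[0-9]*/cwd; do
        # Skip entries that vanish or cannot be read.
        target=$(readlink "$link" 2>/dev/null) || continue
        case $target in
            *"$pattern"*)
                pid=${link%/cwd}
                echo "${pid##*/}"
                ;;
        esac
    done
}
```

On the DiskStation, `pids_with_cwd_under Synagios` prints one PID per line, which can then be matched against the output of `ps`. Unlike the one-liner above, it does not filter out the calling shell, so run it from a directory outside the Synagios tree.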

mclarkson commented 10 years ago

Also... if you have BitTorrent Sync installed then change the port from 8888 to something else when installing synagios.

pekholm commented 10 years ago

Yes, I installed on a different port (18888) just to see if it would get better. I also shut everything (almost) off to see if things improved, but they didn't.

Ok, so I tried the command you suggested:

Loke> ls -ld /proc/[0-9]*/cwd 2>/dev/null | grep Synagios | grep -v $$ | while read a; do echo "$a" | sed -n 's#.*/proc/\([0-9]*\).*#\1#p'; done | while read a; do ps | grep "$a" | grep -v grep; done
10728 root 28852 S /usr/sbin/rsyslogd -c5
10755 root 2232 S /usr/sbin/cron
10768 root 31340 S /usr/sbin/apache2 -k start
10776 102 2096 S /usr/sbin/npcd -d -f /etc/pnp4nagios/npcd.cfg
10778 33 31372 S /usr/sbin/apache2 -k start
10779 33 31372 S /usr/sbin/apache2 -k start
10780 33 31372 S /usr/sbin/apache2 -k start
10781 33 31372 S /usr/sbin/apache2 -k start
10782 33 31372 S /usr/sbin/apache2 -k start
Loke>

Looks like nagios itself is not running.

How is it supposed to be started? And how can I debug the cause of the failure if it does try to start?

Once again, thanks for all the help, I really want this to work!

/P

pekholm commented 10 years ago

Nagios3 is not happy on startup, here's the output from the nagios3 validation:

root@Loke:/etc/nagios3# nagios3 -v /etc/nagios3/nagios.cfg

Nagios Core 3.4.1
Copyright (c) 2009-2011 Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 05-11-2012
License: GPL

Website: http://www.nagios.org
Reading configuration data...
Read main config file okay...
Processing object config directory '/etc/nagios3/objects/local'...
Processing object config directory '/etc/nagios3/objects/local/setup'...
Processing object config directory '/etc/nagios3/objects/local/versions'...
Processing object config directory '/etc/nagios3/objects/local/setup.known_good'...
Read object config files okay...

Running pre-flight check on configuration data...

Checking services...
Error: There are no services defined!
Checked 0 services.
Checking hosts...
Error: There are no hosts defined!
Checked 0 hosts.
Checking host groups...
Checked 0 host groups.
Checking service groups...
Checked 0 service groups.
Checking contacts...
Error: There are no contacts defined!
Checked 0 contacts.
Checking contact groups...
Checked 0 contact groups.
Checking service escalations...
Checked 0 service escalations.
Checking service dependencies...
Checked 0 service dependencies.
Checking host escalations...
Checked 0 host escalations.
Checking host dependencies...
Checked 0 host dependencies.
Checking commands...
Checked 0 commands.
Checking time periods...
Checked 0 time periods.
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors: 3

***> One or more problems was encountered while running the pre-flight check...

 Check your configuration file(s) to ensure that they contain valid
 directives and data defintions.  If you are upgrading from a previous
 version of Nagios, you should be aware that some variables/definitions
 may have been removed or modified in this version.  Make sure to read
 the HTML documentation regarding the config files, as well as the
 'Whats New' section to find out what has changed.

root@Loke:/etc/nagios3#

pekholm commented 10 years ago

Output from nagios.log:

root@Loke:/var/log/nagios3# more nagios.log 
[1402178534] Nagios 3.4.1 starting... (PID=12144)
[1402178534] Local time is Sat Jun 07 15:02:14 PDT 2014
[1402178534] LOG VERSION: 2.0
[1402178534] npcdmod: Copyright (c) 2008-2009 Hendrik Baecker (andurin@process-zero.de) - http://www.pnp4nagios.org
[1402178534] npcdmod: /etc/pnp4nagios/npcd.cfg initialized
[1402178534] npcdmod: spool_dir = '/var/spool/pnp4nagios/npcd/'.
[1402178534] npcdmod: perfdata file '/var/spool/pnp4nagios/nagios/perfdata.dump'.
[1402178534] npcdmod: Ready to run to have some fun!
[1402178534] Event broker module '/usr/lib/pnp4nagios/npcdmod.o' initialized successfully.
[1402178534] Error: There are no services defined!
[1402178534] Error: There are no hosts defined!
[1402178534] Error: There are no contacts defined!
[1402178534] Bailing out due to errors encountered while running the pre-flight check.  Run Nagios from the command line with the -v option to verify your config before restarting. (PID=12144)
[1402178534] npcdmod: If you don't like me, I will go out! Bye.
[1402178534] Event broker module '/usr/lib/pnp4nagios/npcdmod.o' deinitialized successfully.
root@Loke:/var/log/nagios3#

mclarkson commented 10 years ago

That is all correct! Nagios starts with an invalid configuration - that is, no configuration at all. When a configuration is created in the gui then nagios will be started by cron - hence the reason cron is running.
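To tell the expected "empty configuration" state apart from a genuinely broken one, the error count can be pulled out of the validation output. A minimal sketch, relying only on the 'Total Errors:' line that `nagios3 -v` prints as shown earlier in this thread; the function name `preflight_errors` is hypothetical.

```shell
# Hypothetical helper: read `nagios3 -v` output on stdin and print the number
# that follows "Total Errors:". Prints nothing if that summary is absent.
preflight_errors() {
    sed -n 's/^.*Total Errors:[[:space:]]*\([0-9][0-9]*\).*$/\1/p'
}

# Example use inside the chroot:
#   nagios3 -v /etc/nagios3/nagios.cfg | preflight_errors
```

With no configuration loaded this prints 3 (no services, hosts or contacts), matching the output posted above; after a configuration has been created in the GUI it should drop to 0.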

So, check that apache2 is listening on the right port with 'netstat -l -t -n -p | grep apache2', for example, on my diskstation:

netstat -l -t -n -p | grep apache2
tcp        0      0 0.0.0.0:8888            0.0.0.0:*               LISTEN      5874/apache2
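If only the port number is of interest, it can be extracted from that netstat output. This is a sketch assuming the column layout shown above (state in column 6, PID/program in column 7); the function name `listen_ports_for` is hypothetical.

```shell
# Hypothetical helper: read `netstat -l -t -n -p` output on stdin and print
# the listening port(s) of the program named in $1.
listen_ports_for() {
    awk -v prog="$1" '
        $6 == "LISTEN" && $7 ~ ("/" prog "$") {
            # The port is whatever follows the last ":" of the local address.
            n = split($4, parts, ":")
            print parts[n]
        }'
}

# Example use:
#   netstat -l -t -n -p | listen_ports_for apache2
```

The printed port is the one to use in the URLs when testing the GUIs from a browser.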

Next, open a browser to 'http://diskstation_ip:port/nagrestconf', and restore the example configuration, then test nagios on 'http://diskstation_ip:port/nagios3'. Of course change 'diskstation_ip' and 'port'.
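The two URLs can also be probed from another machine on the LAN before opening a browser. The sketch below assumes `curl` is installed on that machine; the function names `synagios_urls` and `probe_url` are hypothetical, and the address in the example is a placeholder.

```shell
# Hypothetical helper: print the two GUI URLs for a given host and port.
synagios_urls() {
    host=$1
    port=$2
    echo "http://$host:$port/nagrestconf"
    echo "http://$host:$port/nagios3"
}

# Hypothetical helper: print the HTTP status code returned for a URL
# (000 means no connection could be made). Assumes curl is available.
probe_url() {
    curl -s -o /dev/null --max-time 10 -w '%{http_code}\n' "$1"
}

# Example use (replace the address and port with your own):
#   for u in $(synagios_urls 192.0.2.10 8888); do
#       echo "$u -> $(probe_url "$u")"
#   done
```

A status of 000 points at a firewall or port problem rather than at nagrestconf or Nagios themselves.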

There's no reason why the GUIs shouldn't work unless the firewall is enabled and blocking the port.

Finally, there are two places to open the nagrestconf GUI when using the DSM GUI.

  1. Clicking the Synagios icon in Main Menu -- the window will be embedded in the DSM GUI
  2. Opening Package Center, clicking Synagios, then clicking the link under the URL heading -- this opens in a new window

Try both.

pekholm commented 10 years ago

Ah, ok. I loaded the sample conf through nagrestconf at http://diskstation_ip:port/nagrestconf (which was a bit clunky; I did it a couple of times before any data showed up in that GUI). After that nagios started working and reporting some metrics when accessed through http://diskstation_ip:port/nagios3.

However, the GUI in Synology always was and still is blank. See attached.

[Attached image: screen shot 2014-06-07 at 5 44 31 pm]

I'll restart everything and see if anything changes. Great with progress!

mclarkson commented 10 years ago

Ah. In Chrome you have to click the shield in the right hand side of the address bar and say 'Load unsafe script'. Then it will work - I didn't know this before as I always use Firefox.

After loading from a backup the pane doesn't update itself unfortunately. The user has to click on the tab to refresh the page. Thanks for the feedback there, I should fix that.

pekholm commented 10 years ago

Now it is all running, FF worked fine, but Chrome needed the click on the shield as you said. Funny, the shield is so subtle and it doesn't give any other indications in my setup. And since the nagios3 UI stated that it is not running it clearly took some time to figure out.

philhu who started the thread likely uses Chrome too. It would be cool if Chrome could be detected in the UI of nagrest and indicate what is needed when this happens. Maybe the docs could be updated too for better usability (and less support needs :)

I saw you already filed an issue for the refresh, thanks for making it better! I really appreciate the support.

Now I'm off to figure out how to make auto-discovery of my devices work, since I now have nagios running on my only 24/7 device in the house. :) Nmap seems to be the thing to use there... Wonder how to connect that output into Nagios...

After that I want to get as much info as possible from my router. BW used /IP and stuff like that.

Once again thanks!

mclarkson commented 10 years ago

Excellent! I'm glad it works for you now.

Firefox changed recently so it works without intervention, but before it worked the same as Chrome. And when I last tried Chrome, I'm sure it worked the same way as Firefox does now! However, as you say, it's not obvious, which is probably why Firefox dropped it, and I suppose Chrome will change again too. I'm sure others will find this page when they also get a blank screen and I will add this information to a FAQ section.

Many thanks for the feedback - it's much appreciated. The more feedback I get, the more likely I can move synagios/nagrestconf out of beta.

Happy Nagging!