NagiosEnterprises / ncpa

Nagios Cross-Platform Agent

Install on OSX 10.6.8 #268

Closed kingdingaling83 closed 7 years ago

kingdingaling83 commented 7 years ago

seems to install fine.

but when i check whether the port is listening, there's nothing

here's the error:

system.log:Dec 6 17:33:44 gry-svr001 com.apple.launchd[1] (com.nagios.ncpa.passive[6785]): Job appears to have crashed: Illegal instruction
system.log:Dec 6 17:33:47 gry-svr001 com.apple.launchd[1] (com.nagios.ncpa.listener[6787]): Job appears to have crashed: Illegal instruction

jomann09 commented 7 years ago

Can you check if there is anything in /usr/local/ncpa/var/log/ncpa_listener.log ?

Also, which version are you installing?

Edit: It looks like our current builds are built against 10.11.x, which means they probably won't work on 10.6.8. I can look for an older version of OS X and see if we can build for it.
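For reference, a minimal sketch of the version comparison implied here, in Python. It assumes the build target is simply "10.11" (taken from the comment above) and treats anything older as unsupported. That is a simplification: a binary built on 10.11 can still run on older releases if it was built with a lower deployment target, so this only illustrates the check, not the actual cause. The helper names are made up for this example.

```python
def parse_version(text):
    """Turn a dotted version string such as '10.6.8' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def can_run(host_version, build_target):
    """Return True when the host OS is at least as new as the build target."""
    return parse_version(host_version) >= parse_version(build_target)


# OS versions mentioned in this thread, checked against the assumed target.
for host in ("10.6.8", "10.10.5", "10.11.5"):
    print(host, can_run(host, "10.11"))
```

Comparing tuples of integers avoids the string-comparison trap where "10.6" would sort after "10.11".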

kingdingaling83 commented 7 years ago

just tried to install on a 10.10.5

admin@gry-svr004(/var/log):cat /var/log/system.log | grep ncpa
Dec 6 17:44:07 gry-svr004 com.apple.xpc.launchd[1] (com.nagios.ncpa.passive): The UserName key is not supported for non-System services.
Dec 6 17:44:07 gry-svr004 com.apple.xpc.launchd[1] (com.nagios.ncpa.passive): The GroupName key is not supported for non-System services.
Dec 6 17:44:07 gry-svr004 com.apple.xpc.launchd[1] (com.nagios.ncpa.passive[65428]): Service exited with abnormal code: 1
Dec 6 17:44:13 gry-svr004 com.apple.xpc.launchd[1] (com.nagios.ncpa.passive): Service only ran for 6 seconds. Pushing respawn out by 4 seconds.
Dec 6 17:44:17 gry-svr004 com.apple.xpc.launchd[1] (com.nagios.ncpa.passive[65433]): Service exited with abnormal code: 1
Dec 6 17:45:22 gry-svr004.hogarthww.prv sudo[65439]: admin : TTY=ttys000 ; PWD=/private/var/log ; USER=root ; COMMAND=/usr/bin/grep -ir ncpa CDIS.custom DiagnosticMessages accountpolicy.log accountpolicy.log.0.gz accountpolicy.log.1.gz accountpolicy.log.2.gz accountpolicy.log.3.gz accountpolicy.log.4.gz accountpolicy.log.5.gz accountpolicy.log.6.gz alf.log amavis.log apache2 asl authd.log authd.log.0.gz awlogs bluetooth.pklg clamav.log com.apple.clouddocs.asl com.apple.revisiond com.apple.xpc.launchd commerce.log coreduetd.log cups daily.out diskspacemonitor.log displaypolicyd.log displaypolicyd.stdout.log emond fax freshclam.log fsck_hfs.log hdiejectd.log hwmond.log hwmond.log.0.bz2 hwmond.log.1.bz2 hwmond.log.2.bz2 hwmond.log.3.bz2 hwmond.log.4.bz2 hwmond.log.5.bz2 install.log install.log.0.bz2 install.log.1.bz2 install.log.2.bz2 install.log.3.bz2 install.log.4.bz2 install.log.5.bz2 kernel-shutdown.log kernel.log.0.bz2 kernel.log.1.bz2 kernel.log.2.bz2 kernel.log.3.bz2 kernel.log.4.bz2

no log for either

cat /usr/local/ncpa/var/log/ncpa_listener.log
cat: /usr/local/ncpa/var/log/ncpa_listener.log: No such file or directory

jomann09 commented 7 years ago

The second error is interesting, considering that they should be in /Library/LaunchDaemons which I am under the impression runs it as a system service. Any idea why it would consider it a non-System service? I am going to test this here - I only made minor changes since the last OS X build.
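To see whether a job definition sets the keys launchd complained about, something like the following Python sketch could be used. The plist contents here are hypothetical (the label comes from the log above, the `nagios` user and group are assumptions), and the real NCPA plist may look different. It only reports which of the two keys are present; it does not explain why launchd treated the job as a non-System service.

```python
import plistlib

# Hypothetical job definition for illustration only; not the actual NCPA plist.
SAMPLE_PLIST = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.nagios.ncpa.passive</string>
    <key>UserName</key>
    <string>nagios</string>
    <key>GroupName</key>
    <string>nagios</string>
</dict>
</plist>
"""

# The two keys named in the launchd messages above.
SYSTEM_ONLY_KEYS = ("UserName", "GroupName")


def system_only_keys(plist_bytes):
    """Return the keys from SYSTEM_ONLY_KEYS that the job definition sets."""
    job = plistlib.loads(plist_bytes)
    return [key for key in SYSTEM_ONLY_KEYS if key in job]


print(system_only_keys(SAMPLE_PLIST))
```

On a real system the bytes would come from the file under /Library/LaunchDaemons rather than from an inline sample.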

kingdingaling83 commented 7 years ago

thank you so much. we have a wide range of o.s versions here for apple servers and i'd love to use your agent across all of them.

not sure if it's related but xcode isn't installed

happy to provide more logs or run commands for you if need be

jomann09 commented 7 years ago

I don't think it has to do with xcode, but let me do a bit of testing and I will get back to you - I should be able to find some older OS X versions to test on and we can figure out what needs to be built.

jomann09 commented 7 years ago

I've got two quick questions:

  1. Is your OS X Server 10.6.8 running 32bit or 64bit?
  2. What commands did you use to install on the OS X 10.10 system?

kingdingaling83 commented 7 years ago

64bit

i followed the steps on your documentation:

cd /tmp
hdiutil attach /tmp/ncpa-.dmg
sudo zsh /Volumes/NCPA-/install.sh

jomann09 commented 7 years ago

Thanks for the info, I think there are two issues here that I will address individually:

kingdingaling83 commented 7 years ago

I can confirm it works perfectly on Mac OS X 10.11.5

also, as far as i can tell the three O.S versions here 10.6.8 / 10.10.5 / 10.11.5 are all 64bit versions of the O.S so it's strange to me that 10.11.5 is the only one that can handle it being 32bit

also, i'm pretty sure they run 32 bit apps as we still use them for Final Cut Server which is exclusively 32bit

jomann09 commented 7 years ago

Yeah that is very strange, I know that I have had it install on a 10.10.5 version, in fact I just installed it on our Mac VM host earlier today which is running 10.10.5 and it didn't have issues on there.

I created a new DMG for 2.0.0.a, available in the NCPA nightly builds directory, since I won't be re-building the 1.8.1 installer. You can try using that for the 10.10.5 install and see if it solves the issue above.

I only suspected 32-bit was the issue on 10.6.8 because of the error it was getting: searching for the "Job appears to have crashed: Illegal instruction" error suggests that it's due to the system version. I'm downloading a 10.6.8 to try on.

kingdingaling83 commented 7 years ago

i'll test it on 10.10.5 and let you know

kingdingaling83 commented 7 years ago

It's working on 10.10.5! thank you so much.

just 10.6.8 left

kingdingaling83 commented 7 years ago

when using the config wizard in nagios xi for some reason i can't see the interfaces or the disks to monitor them, even though in the NCPA web GUI i can see them being monitored. am i missing a step? screenshots below

Nagios Config for 10.10.5 http://postimg.org/image/g1l420aon/

NCPA 10.10.5 disks http://postimg.org/image/wdv5rqp07/

NCPA 10.10.5 network http://postimg.org/image/wdv5rqp07/

Nagios config for a 10.11.5 machine https://postimg.org/image/fagobzhrl/

kingdingaling83 commented 7 years ago

installing your new nightly build on 10.6.8 i managed to get a crash report

gry-svr001:NCPA-2.0.0.a admin$ cat /Users/admin/Library/Logs/DiagnosticReports/ncpa_listener_2016-12-08-143037_gry-svr001.crash
Process: ncpa_listener [76815]
Path: /usr/local/ncpa/ncpa_listener
Identifier: ncpa_listener
Version: ??? (???)
Code Type: X86-64 (Native)
Parent Process: launchd [875]

Date/Time: 2016-12-08 14:30:37.308 +0000
OS Version: Mac OS X Server 10.6.8 (10K549)
Report Version: 6

Exception Type: EXC_BREAKPOINT (SIGTRAP)
Exception Codes: 0x0000000000000002, 0x0000000000000000
Crashed Thread: 0

Dyld Error Message:
Library not loaded: /Library/Frameworks/Python.framework/Versions/2.7/Python
Referenced from: /usr/local/ncpa/ncpa_listener
Reason: image not found

Binary Images:
0x7fff5fc00000 - 0x7fff5fc3bdef dyld 132.1 (???) <69130DA3-7CB3-54C8-ABC5-423DECDD2AF7> /usr/lib/dyld

gry-svr001:NCPA-2.0.0.a admin$
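The useful part of that crash report is the dyld error. As a rough sketch, the three fields can be pulled out with a few lines of Python; the sample text below is copied from the report above, and the function name and field list are just choices for this example.

```python
import re

# Dyld section copied from the crash report above.
CRASH_TEXT = """\
Dyld Error Message:
  Library not loaded: /Library/Frameworks/Python.framework/Versions/2.7/Python
  Referenced from: /usr/local/ncpa/ncpa_listener
  Reason: image not found
"""

FIELDS = ("Library not loaded", "Referenced from", "Reason")


def parse_dyld_error(text):
    """Return a dict of the dyld fields found in a crash report, one per line."""
    result = {}
    for field in FIELDS:
        pattern = r"^\s*" + re.escape(field) + r":\s*(.+)$"
        match = re.search(pattern, text, re.MULTILINE)
        if match:
            result[field] = match.group(1).strip()
    return result


print(parse_dyld_error(CRASH_TEXT))
```

Here the report says the listener binary references a Python framework under /Library/Frameworks that does not exist on this machine.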

jomann09 commented 7 years ago

Thank you for the crash report! That'll probably help a lot.

As for the wizard, I believe you are using the old version of the wizard (< 1.4.0) in your Nagios XI, so the new NCPA 2 instances aren't reporting. It's an easy fix though: head over to the Manage Config Wizards page in the admin panel and click "Check for Updates". There should be a version 1.4.1 of the NCPA wizard, and you can just click install from the NCPA wizard's row. If it doesn't show up for some reason, I can send you the zip file with the updated wizard manually.

kingdingaling83 commented 7 years ago

great that worked!

kingdingaling83 commented 7 years ago

i installed python 2.7

trying to run launchctl start com.nagios.ncpa.listener as root shows this in /var/log/system.log:

Dec 8 17:00:47 gry-svr001 sudo[81265]: admin : TTY=ttys001 ; PWD=/Volumes/NCPA-2.0.0.a ; USER=root ; COMMAND=/usr/bin/su
Dec 8 17:00:53 gry-svr001 com.nagios.ncpa.passive[81281]: 2016-12-08 17:00:53,103 81281 INFO started
Dec 8 17:00:56 gry-svr001 com.nagios.ncpa.listener[81287]: Traceback (most recent call last):
Dec 8 17:00:56 gry-svr001 com.nagios.ncpa.listener[81287]: File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/cx_Freeze-4.3.4-py2.7-macosx-10.6-intel.egg/cx_Freeze/initscripts/Console.py", line 27, in
Dec 8 17:00:56 gry-svr001 com.nagios.ncpa.listener[81287]: File "ncpa_listener.py", line 8, in
Dec 8 17:00:56 gry-svr001 com.nagios.ncpa.listener[81287]: File "/Users/ncpa/ncpa/agent/listener/certificate.py", line 1, in
Dec 8 17:00:56 gry-svr001 com.nagios.ncpa.listener[81287]: File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/OpenSSL/init.py", line 8, in
Dec 8 17:00:56 gry-svr001 com.nagios.ncpa.listener[81287]: File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/OpenSSL/rand.py", line 12, in
Dec 8 17:00:56 gry-svr001 com.nagios.ncpa.listener[81287]: File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/OpenSSL/_util.py", line 6, in
Dec 8 17:00:56 gry-svr001 com.nagios.ncpa.listener[81287]: File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/cryptography/hazmat/bindings/openssl/binding.py", line 14, in
Dec 8 17:00:56 gry-svr001 com.nagios.ncpa.listener[81287]: File "ExtensionLoader_cryptography_hazmat_bindings__openssl.py", line 25, in
Dec 8 17:00:56 gry-svr001 com.nagios.ncpa.listener[81287]: File "ExtensionLoader_cryptography_hazmat_bindingsopenssl.py", line 17, in bootstrap
Dec 8 17:00:56 gry-svr001 com.nagios.ncpa.listener[81287]: ImportError: dlopen(/usr/local/ncpa/_cffi_backend.so, 2): Symbol not found: tlv_bootstrap
Dec 8 17:00:56 gry-svr001 com.nagios.ncpa.listener[81287]: Referenced from: /usr/local/ncpa/_cffi_backend.so
Dec 8 17:00:56 gry-svr001 com.nagios.ncpa.listener[81287]: Expected in: /usr/lib/libSystem.B.dylib
Dec 8 17:00:56 gry-svr001 com.nagios.ncpa.listener[81287]: in /usr/local/ncpa/_cffi_backend.so
Dec 8 17:00:56 gry-svr001 com.apple.launchd[1] (com.nagios.ncpa.listener[81287]): Exited with exit code: 1
sh-3.2#

jomann09 commented 7 years ago

Having looked at some comments about cx_Freeze and some posts about improper dynamic linking, it looks like instead of using the bundled version of Python it is trying to use the system's version of Python... which doesn't work. I think that the issue is this:

otool -L /usr/local/ncpa/ncpa_*listener
        /Library/Frameworks/Python.framework/Versions/2.7/Python ...
        ...

I think I have this fixed in the new patch that I am applying to the build process but essentially it should be looking for the Python binary in the executable path... so it should look like this:

@executable_path/Python

I can do this from the Makefile, quickly check that it works on my test systems, and get you a new package to try on your 10.6.8 system. If you want to try to change this manually, you can use the command:

install_name_tool -change /Library/Frameworks/Python.framework/Versions/2.7/Python @executable_path/Python /usr/local/ncpa/ncpa_listener

But you need to have xcode installed I believe.
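If more than one binary needs the same treatment, the otool -L check and the install_name_tool command above could be scripted. The Python sketch below only parses text and builds the argument list; it does not run either tool. The sample otool output is hypothetical (the version numbers are made up), and the function names are just for this example.

```python
# Hypothetical `otool -L` output; only the first path is taken from the thread.
SAMPLE_OTOOL_OUTPUT = (
    "/usr/local/ncpa/ncpa_listener:\n"
    "\t/Library/Frameworks/Python.framework/Versions/2.7/Python"
    " (compatibility version 2.7.0, current version 2.7.0)\n"
    "\t/usr/lib/libSystem.B.dylib"
    " (compatibility version 1.0.0, current version 125.2.11)\n"
)


def load_paths(otool_output):
    """Return the library paths from `otool -L` output (the indented lines)."""
    paths = []
    for line in otool_output.splitlines():
        if line.startswith(("\t", " ")) and line.strip():
            # Drop the trailing "(compatibility version ...)" annotation.
            paths.append(line.strip().split(" (", 1)[0])
    return paths


def relink_command(binary, old_path, new_path="@executable_path/Python"):
    """Build the install_name_tool argument list quoted above."""
    return ["install_name_tool", "-change", old_path, new_path, binary]


for path in load_paths(SAMPLE_OTOOL_OUTPUT):
    if "Python.framework" in path:
        print(" ".join(relink_command("/usr/local/ncpa/ncpa_listener", path)))
```

Passing the resulting list to `subprocess.run` on a machine with the developer tools installed would apply the change; that step is left out here on purpose.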

jomann09 commented 7 years ago

Can you try the latest DMG build from here: https://assets.nagios.com/downloads/ncpa/nightly/

I've re-built it with the changes I noted in the post above. I'm hoping that the 10.6.8 system will be able to find the python binary now.

kingdingaling83 commented 7 years ago

Dec 8 21:29:07 gry-svr001 com.nagios.ncpa.listener[88524]: Traceback (most recent call last):
Dec 8 21:29:07 gry-svr001 com.nagios.ncpa.listener[88524]: File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/cx_Freeze-4.3.4-py2.7-macosx-10.6-intel.egg/cx_Freeze/initscripts/Console.py", line 27, in
Dec 8 21:29:07 gry-svr001 com.nagios.ncpa.listener[88524]: File "ncpa_listener.py", line 8, in
Dec 8 21:29:07 gry-svr001 com.nagios.ncpa.listener[88524]: File "/Users/ncpa/ncpa/agent/listener/certificate.py", line 1, in
Dec 8 21:29:07 gry-svr001 com.nagios.ncpa.listener[88524]: File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/OpenSSL/init.py", line 8, in
Dec 8 21:29:07 gry-svr001 com.nagios.ncpa.listener[88524]: File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/OpenSSL/rand.py", line 12, in
Dec 8 21:29:07 gry-svr001 com.nagios.ncpa.listener[88524]: File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/OpenSSL/_util.py", line 6, in
Dec 8 21:29:07 gry-svr001 com.nagios.ncpa.listener[88524]: File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/cryptography/hazmat/bindings/openssl/binding.py", line 14, in
Dec 8 21:29:07 gry-svr001 com.nagios.ncpa.listener[88524]: File "ExtensionLoader_cryptography_hazmat_bindings__openssl.py", line 25, in
Dec 8 21:29:07 gry-svr001 com.nagios.ncpa.listener[88524]: File "ExtensionLoader_cryptography_hazmat_bindingsopenssl.py", line 17, in bootstrap
Dec 8 21:29:07 gry-svr001 com.nagios.ncpa.listener[88524]: ImportError: dlopen(/usr/local/ncpa/_cffi_backend.so, 2): Symbol not found: tlv_bootstrap
Dec 8 21:29:07 gry-svr001 com.nagios.ncpa.listener[88524]: Referenced from: /usr/local/ncpa/_cffi_backend.so
Dec 8 21:29:07 gry-svr001 com.nagios.ncpa.listener[88524]: Expected in: /usr/lib/libSystem.B.dylib
Dec 8 21:29:07 gry-svr001 com.nagios.ncpa.listener[88524]: in /usr/local/ncpa/_cffi_backend.so
Dec 8 21:29:07 gry-svr001 com.apple.launchd[1] (com.nagios.ncpa.listener[88524]): Exited with exit code: 1
Dec 8 21:29:13 gry-svr001 sandboxd[88532]: sshd(88531) deny mach-per-user-lookup

jomann09 commented 7 years ago

Do you happen to have openssl installed and /usr/lib/libffi.dylib on this system?

kingdingaling83 commented 7 years ago

sh-3.2# which openssl
/usr/bin/openssl

sh-3.2# ls /usr/lib/libffi.dylib
/usr/lib/libffi.dylib

jomann09 commented 7 years ago

Still looks like it's trying to use the wrong Python for running the service... I will have to wait for this 10.6 to get set up to test it more on my end.

kingdingaling83 commented 7 years ago

I'll try again on a server I've not installed previous builds on and let u know

jomann09 commented 7 years ago

Will leave this open for your comment but I removed the 2.0.0 milestone and bug labels for the support label. The first bug for 10.7+ systems is fixed and I am not 100% certain yet that we will be able to support systems that are below 10.7 ... I will continue to work on this issue but it won't be stopping the release.

kingdingaling83 commented 7 years ago

is it the python dependency that is the issue? If it's just a case of having the specific python version i can look into forcing that myself no?

could you list what modules etc the programme needs or would it be too many?

jomann09 commented 7 years ago

Well, it shouldn't actually require Python. The problem is that it's not using the bundled version of Python that comes with NCPA. When you install it, there should be a Python binary called /usr/local/ncpa/Python that the services should be using; it has a compressed library file and uses all the .so files in that directory. However, on 10.6.8 for some reason it can't find that version of Python, so it's trying to use the one that is built in. You can install python via homebrew or the python.org installer and then install the prereqs like so:

python -m pip install gevent gevent-websocket flask jinja2 requests pyOpenSSL psutil

But I am not sure if that will work with the frozen version of NCPA that you are installing.
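After running the pip command above, a quick way to confirm that the modules actually import under that interpreter is a sketch like the one below. The import names are my mapping of the package names (gevent-websocket imports as `geventwebsocket`, pyOpenSSL as `OpenSSL`); the function name is made up for this example. It uses only `importlib.import_module`, so it should behave the same on Python 2.7 and 3.

```python
import importlib

# Import names for the packages in the pip command above (assumed mapping:
# gevent-websocket -> geventwebsocket, pyOpenSSL -> OpenSSL).
REQUIRED = [
    "gevent",
    "geventwebsocket",
    "flask",
    "jinja2",
    "requests",
    "OpenSSL",
    "psutil",
]


def missing_modules(names):
    """Return the names that fail to import in the current interpreter."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Covers both "not installed" and dlopen failures raised as ImportError.
            missing.append(name)
    return missing


print("missing:", missing_modules(REQUIRED))
```

Because a failed dlopen also surfaces as ImportError, a module that is installed but cannot load its compiled extension shows up in the list as well.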

kingdingaling83 commented 7 years ago

Thanks for all your help

i'm going to use NCPA on all servers/os versions apart from 10.6.8.

I'd like to write plugins/scripts that use NCPS to send a trap depending on the scripts/plugins. for example wrap an rsync so if it fails it sends an NCPS trap. is there documentation on how to do this?

jomann09 commented 7 years ago

Sounds good, I figured 10.6.x was a bit of a stretch, officially I'm going to label our builds for 10.7+ for the time being.

I'm not sure, can you be a bit more specific? As for NCPS, did you mean SNMP traps? Or can you explain a bit more/link something so I can see.

jomann09 commented 7 years ago

I'm going to close this since I removed 10.6.8 from the supported list. If you want to get more info on how to set up scripts, etc then you should make a post on the Nagios support forum since the whole support team can help out then too - with more than just NCPA.
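For the rsync wrapper idea raised earlier in the thread, one possible starting point is a small script that follows the usual Nagios plugin convention of exit codes (0 OK, 2 CRITICAL, 3 UNKNOWN) plus a one-line status message. This is only a sketch under that assumption: how the result then reaches Nagios is not covered in this thread and is left out, and the function name and labels are made up for the example.

```python
import subprocess
import sys

# Nagios plugin exit codes used by this sketch.
OK, CRITICAL, UNKNOWN = 0, 2, 3


def run_check(command, label):
    """Run a command and return (exit_code, status_line) in Nagios plugin style."""
    try:
        completed = subprocess.run(
            command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
        )
    except OSError as error:
        # The command could not be started at all (for example, not installed).
        return UNKNOWN, "UNKNOWN - could not start {}: {}".format(label, error)
    if completed.returncode == 0:
        return OK, "OK - {} succeeded".format(label)
    return CRITICAL, "CRITICAL - {} failed with exit code {}".format(
        label, completed.returncode
    )


# Harmless demonstration: a child Python process that exits with status 0.
code, line = run_check([sys.executable, "-c", "raise SystemExit(0)"], "demo job")
print(code, line)
```

A real wrapper would pass the rsync command line instead of the demo command, print the status line, and finish with `sys.exit(code)` so the caller sees the plugin exit code.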