NagiosEnterprises / ncpa

Nagios Cross-Platform Agent

NCPA agent ver 2.4.1 on CentOS 9 Stream - can't start #998

Closed shaiuzi closed 9 months ago

shaiuzi commented 9 months ago

Hi, the NCPA agent version 2.4.1 on CentOS 9 Stream can't start. This is what I see in systemctl status ncpa:

ncpa_listener.service: Scheduled restart job, restart counter is at 2.
Stopped Nagios ncpa Agent Service.
ncpa_listener.service: Start request repeated too quickly.
ncpa_listener.service: Failed with result 'exit-code'.
Failed to start Nagios ncpa Agent Service

Any suggestions would be highly appreciated.

MrPippin66 commented 9 months ago

What does the log show?

shaiuzi commented 9 months ago

Only two lines:

2023-03-01 11:41:25,007 5321 INFO started ncpa_listener, version: 2.4.1
2023-03-01 11:41:25,007 5321 INFO Using SSL version TLSv1_2

They look like they are from an old date. To make sure, I deleted the ncpa folder entirely and reinstalled the agent. It still fails to start, and those two lines appear again in the log file.

MrPippin66 commented 9 months ago

Okay, can you stop the listener (just to clear the status in systemd), ensure the pid file in /usr/local/ncpa/var/run/ncpa_listener.pid is removed, and then manually start the agent with "/usr/local/ncpa/bin/ncpa_listener -n --start"

Note any data returned.

If it exits, run "echo $?" to get the return code.
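A minimal sketch of the "ensure the pid file is removed" step, for readers who want to avoid deleting the pid file of a process that is still alive. The helper name `remove_stale_pid` and the throwaway demo file are assumptions made for illustration; they are not part of NCPA.

```shell
#!/bin/sh
# remove_stale_pid FILE: delete FILE only if the PID it names is not alive.
# Returns 1 (and keeps the file) when that process still exists.
# Note: kill -0 also fails for processes owned by another user, so run this as root.
remove_stale_pid() {
    pidfile=$1
    [ -f "$pidfile" ] || return 0          # nothing to clean up
    pid=$(cat "$pidfile")
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "process $pid is still running; stop it first" >&2
        return 1
    fi
    rm -f "$pidfile"
}

# Demonstration against a throwaway file, not the real NCPA pid file.
demo_pidfile=$(mktemp)
echo 999999999 > "$demo_pidfile"           # a PID above the kernel's maximum, so never alive
remove_stale_pid "$demo_pidfile"
```

On the host from this thread the same helper would be pointed at /usr/local/ncpa/var/run/ncpa_listener.pid before starting the agent in the foreground.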

shaiuzi commented 9 months ago

Hi MrPippin66, thanks for your assistance. First of all, no executable needed to be stopped, because it can't start. I ran the listener stop anyway. In the run directory only passive.pid was there, and I removed that pid file. Then I executed the binary as you said. On CentOS 9 the correct path is /usr/local/ncpa/ncpa_listener. It shows me the same two lines I posted before:

2023-10-26 08:12:08,588 60443 INFO started ncpa_listener, version: 2.4.1
2023-10-26 08:12:08,590 60443 INFO Using SSL version TLSv1_2

I don't quite understand where to put this line: echo $?

MrPippin66 commented 9 months ago

just run "echo $?" after the previous command completes. It's the exit code of the previous command.
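To illustrate the point, here is a small sketch of reading an exit code in a shell. The command `sh -c 'exit 3'` is only a stand-in for a program that terminates; it is not an NCPA command.

```shell
#!/bin/sh
# `sh -c 'exit 3'` stands in for any command that terminates with a non-zero status.
rc=0
sh -c 'exit 3' || rc=$?    # $? holds the exit status of the command that just ran
echo "exit code: $rc"      # prints "exit code: 3"

# At an interactive prompt the equivalent is to type `echo $?` on the line
# right after the command returns. A foreground daemon that keeps running
# never returns, so there is no exit code to read until it is stopped.
```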

shaiuzi commented 9 months ago

/usr/local/ncpa/ncpa_listener -n --start
2023-10-26 15:26:17,134 89570 INFO started ncpa_listener, version: 2.4.1
2023-10-26 15:26:17,134 89570 INFO Using SSL version TLSv1_2
echo $?

I did, and it stays running without exiting.

MrPippin66 commented 9 months ago

Hmmm. that's interesting.

Okay, so with the service stopped (ncpa_listener), does this result in the service running?

/etc/rc.d/init.d/ncpa_listener start

If so, that implies a systemd issue is going on.

shaiuzi commented 9 months ago

I'm executing it with:

/etc/init.d/ncpa_listener start
Started NCPA Listener

So it shows a line indicating the service has started. But if I check systemctl status ncpa_listener, it shows me the service is enabled but stopped:

ncpa_listener.service: Scheduled restart job, restart counter is at 2.
Stopped Nagios ncpa Agent Service.
ncpa_listener.service: Start request repeated too quickly.
ncpa_listener.service: Failed with result 'exit-code'.
Failed to start Nagios ncpa Agent Service.

MrPippin66 commented 9 months ago

It won't show in systemctl, because this is bypassing systemd.

After you run this, check whether the process is running.
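One way to make that check without going through systemd is sketched below. It assumes `pgrep` (from procps) is installed, and the helper name `is_running` is made up for this example.

```shell
#!/bin/sh
# is_running NAME: succeed if a process whose name is exactly NAME exists.
# Assumes pgrep is available; if it is missing, this reports "not running".
is_running() {
    pgrep -x "$1" >/dev/null 2>&1
}

if is_running ncpa_listener; then
    echo "ncpa_listener is running"
else
    echo "ncpa_listener is not running"
fi
```

Because this looks at the process table directly, it gives the same answer whether the agent was started by systemd or by the init script.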

shaiuzi commented 9 months ago

121853 nagios 20 0 215472 42028 5928 S 0.0 1.1 0:00.01 ncpa_listener

Bingo, now it works 👍 Many thanks, MrPippin66. Is there a workaround to fix the systemd issue?

MrPippin66 commented 9 months ago

This indicates a systemd level issue is occurring.

NCPA uses SYSV init style management scripts that leverage a systemd interface that generates a runtime unit file.

On your system, I'll assume the runtime generated version is located at '/run/systemd/generator.late/ncpa_passive.service'

Please post the contents, here.

shaiuzi commented 9 months ago

I'm on CentOS 9. There are four config files for both the listener and passive services, at these locations:

/etc/systemd/system/multi-user.target.wants/ncpa_listener.service
/etc/systemd/system/ncpa_listener.service
/usr/lib/systemd/system/ncpa_listener.service
/usr/local/ncpa/build_resources/ncpa_listener.service

All of them contain a config like this:

[Unit]
Description=NCPA Listener
Documentation=https://www.nagios.org/ncpa
After=network.target local-fs.target

[Service]
ExecStart=/usr/local/ncpa/ncpa_listener -n

[Install]
WantedBy=multi-user.target
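With several copies of the same unit on disk, it helps to know which one systemd actually loads. For system units, systemd searches /etc/systemd/system before /usr/lib/systemd/system, the entry under multi-user.target.wants is normally just the symlink created by enabling the unit, and a copy under /usr/local/ncpa/build_resources is outside the search path. The sketch below is a simplified illustration of that ordering (it ignores /run/systemd/system and generator directories); `first_existing` is a hypothetical helper.

```shell
#!/bin/sh
# first_existing PATH...: print the first argument that exists on disk.
first_existing() {
    for candidate in "$@"; do
        if [ -e "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Candidates are listed in lookup order: the /etc copy wins over /usr/lib.
winner=$(first_existing \
    /etc/systemd/system/ncpa_listener.service \
    /usr/lib/systemd/system/ncpa_listener.service) || winner="(none found)"
echo "unit file systemd would load: $winner"
```

On a live host, the FragmentPath property reported by systemctl show is the authoritative answer to the same question.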

MrPippin66 commented 9 months ago

Hmmm, how did you get the package? Did you build it, yourself?

shaiuzi commented 9 months ago

First I installed it from the repo, and I have also tried the manual build process. Both give the same behavior.

MrPippin66 commented 9 months ago

Which manual build process? The current one has a special exception for Centos 9.

But in either case, you also used the el9 NCPA version from the repo? I'm using RHEL, so the behavior I see seems to differ from yours.

For the repo install version, can you run "systemctl show ncpa_listener" and post the results?

shaiuzi commented 9 months ago

ExitType=main
Restart=on-failure
NotifyAccess=none
RestartUSec=100ms
TimeoutStartUSec=1min 30s
TimeoutStopUSec=1min 30s
TimeoutAbortUSec=1min 30s
TimeoutStartFailureMode=terminate
TimeoutStopFailureMode=terminate
RuntimeMaxUSec=infinity
RuntimeRandomizedExtraUSec=0
WatchdogUSec=0
WatchdogTimestampMonotonic=0
RootDirectoryStartOnly=no
RemainAfterExit=no
GuessMainPID=yes
MainPID=0
ControlPID=0
FileDescriptorStoreMax=0
NFileDescriptorStore=0
StatusErrno=0
Result=exit-code

MrPippin66 commented 9 months ago

That's all you received? I would have expected significantly more info.

shaiuzi commented 9 months ago

Yes, sorry, that's what I'm getting when I run systemctl show ncpa_listener. I assume I have hit an as-yet-unknown bug. Is there any output or any logs I can send you for examining this bug?

MrPippin66 commented 9 months ago

The issue I'm trying to determine isn't an issue with "ncpa", but the systemd implementation on your distribution. That's also not implying a bug in your distribution's 'systemd', but a behavioral difference between Centos 9 Stream and RHEL9 in their 'systemd' implementation.

Keep in mind I don't have a Centos 9 Stream environment of my own to validate this against.

Can you try this again with "systemctl show ncpa_listener --full"?

shaiuzi commented 9 months ago

My mistake, it has 247 lines...

Type=simple
ExitType=main
Restart=on-failure
NotifyAccess=none
RestartUSec=100ms
TimeoutStartUSec=1min 30s
TimeoutStopUSec=1min 30s
TimeoutAbortUSec=1min 30s
TimeoutStartFailureMode=terminate
TimeoutStopFailureMode=terminate
RuntimeMaxUSec=infinity
RuntimeRandomizedExtraUSec=0
WatchdogUSec=0
WatchdogTimestampMonotonic=0
RootDirectoryStartOnly=no
RemainAfterExit=no
GuessMainPID=yes
MainPID=0
ControlPID=0
FileDescriptorStoreMax=0
NFileDescriptorStore=0
StatusErrno=0
Result=exit-code
ReloadResult=success
CleanResult=success
UID=[not set]
GID=[not set]
NRestarts=2
OOMPolicy=stop
ExecMainStartTimestamp=Wed 2023-10-25 22:47:51 IDT
ExecMainStartTimestampMonotonic=1852709682
ExecMainExitTimestamp=Wed 2023-10-25 22:47:51 IDT
ExecMainExitTimestampMonotonic=1852721843
ExecMainPID=22439
ExecMainCode=1
ExecMainStatus=1
ExecStart={ path=/etc/init.d/ncpa_listener ; argv[]=/etc/init.d/ncpa_listener ; ignore_errors=no ; start_time=[Wed 2023-10-25 22:47:51 IDT] ; stop_time=[Wed 2023-10-25 22:47:51 IDT] ; pid=22439 ; code=exited ; status=1 }
ExecStartEx={ path=/etc/init.d/ncpa_listener ; argv[]=/etc/init.d/ncpa_listener ; flags= ; start_time=[Wed 2023-10-25 22:47:51 IDT] ; stop_time=[Wed 2023-10-25 22:47:51 IDT] ; pid=22439 ; code=exited ; status=1 }
Slice=system.slice
ControlGroupId=4170
MemoryCurrent=[not set]
MemoryAvailable=infinity
CPUUsageNSec=5856000
TasksCurrent=[not set]
IPIngressBytes=[no data]
IPIngressPackets=[no data]
IPEgressBytes=[no data]
IPEgressPackets=[no data]
IOReadBytes=18446744073709551615
IOReadOperations=18446744073709551615
IOWriteBytes=18446744073709551615
IOWriteOperations=18446744073709551615
Delegate=no
CPUAccounting=yes
CPUWeight=[not set]
StartupCPUWeight=[not set]
CPUShares=[not set]
StartupCPUShares=[not set]
CPUQuotaPerSecUSec=infinity
CPUQuotaPeriodUSec=infinity
IOAccounting=no
IOWeight=[not set]
StartupIOWeight=[not set]
BlockIOAccounting=no
BlockIOWeight=[not set]
StartupBlockIOWeight=[not set]
MemoryAccounting=yes
DefaultMemoryLow=0
DefaultMemoryMin=0
MemoryMin=0
MemoryLow=0
MemoryHigh=infinity
MemoryMax=infinity
MemorySwapMax=infinity
MemoryLimit=infinity
DevicePolicy=auto
TasksAccounting=yes
TasksMax=23094
IPAccounting=no
ManagedOOMSwap=auto
ManagedOOMMemoryPressure=auto
ManagedOOMMemoryPressureLimit=0
ManagedOOMPreference=none
UMask=0022
LimitCPU=infinity
LimitCPUSoft=infinity
LimitFSIZE=infinity
LimitFSIZESoft=infinity
LimitDATA=infinity
LimitDATASoft=infinity
LimitSTACK=infinity
LimitSTACKSoft=8388608
LimitCORE=infinity
LimitCORESoft=0
LimitRSS=infinity
LimitRSSSoft=infinity
LimitNOFILE=524288
LimitNOFILESoft=1024
LimitAS=infinity
LimitASSoft=infinity
LimitNPROC=14434
LimitNPROCSoft=14434
LimitMEMLOCK=8388608
LimitMEMLOCKSoft=8388608
LimitLOCKS=infinity
LimitLOCKSSoft=infinity
LimitSIGPENDING=14434
LimitSIGPENDINGSoft=14434
LimitMSGQUEUE=819200
LimitMSGQUEUESoft=819200
LimitNICE=0
LimitNICESoft=0
LimitRTPRIO=0
LimitRTPRIOSoft=0
LimitRTTIME=infinity
LimitRTTIMESoft=infinity
WorkingDirectory=/etc/init.d
OOMScoreAdjust=0
CoredumpFilter=0x33
Nice=0
IOSchedulingClass=2
IOSchedulingPriority=4
CPUSchedulingPolicy=0
CPUSchedulingPriority=0
CPUAffinityFromNUMA=no
NUMAPolicy=n/a
TimerSlackNSec=50000
CPUSchedulingResetOnFork=no
NonBlocking=no
StandardInput=null
StandardOutput=journal
StandardError=inherit
TTYReset=no
TTYVHangup=no
TTYVTDisallocate=no
SyslogPriority=30
SyslogLevelPrefix=yes
SyslogLevel=6
SyslogFacility=3
LogLevelMax=-1
LogRateLimitIntervalUSec=0
LogRateLimitBurst=0
SecureBits=0
CapabilityBoundingSet=cap_chown cap_dac_override cap_dac_read_search cap_fowner cap_fsetid cap_kill cap_setgid cap_setuid cap_setpcap cap_linux_immutable cap_net_bind_service cap_net_broadcast cap_net_admin cap_net_raw cap_ipc_lock cap_ipc_owner cap_sys_module cap_sys_rawio cap_sys_chroot cap_sys_ptrace cap_sys_pacct cap_sys_admin cap_sys_boot cap_sys_nice cap_sys_resource cap_sys_time cap_sys_tty_config cap_mknod cap_lease cap_audit_write cap_audit_control cap_setfcap cap_mac_override cap_mac_admin cap_syslog cap_wake_alarm cap_block_suspend cap_audit_read cap_perfmon cap_bpf cap_checkpoint_restore
User=root
DynamicUser=no
RemoveIPC=no
PrivateTmp=no
PrivateDevices=no
ProtectClock=no
ProtectKernelTunables=no
ProtectKernelModules=no
ProtectKernelLogs=no
ProtectControlGroups=no
PrivateNetwork=no
PrivateUsers=no
PrivateMounts=no
PrivateIPC=no
ProtectHome=no
ProtectSystem=no
SameProcessGroup=no
UtmpMode=init
IgnoreSIGPIPE=yes
NoNewPrivileges=no
SystemCallErrorNumber=2147483646
LockPersonality=no
RuntimeDirectoryPreserve=no
RuntimeDirectoryMode=0755
StateDirectoryMode=0755
CacheDirectoryMode=0755
LogsDirectoryMode=0755
ConfigurationDirectoryMode=0755
TimeoutCleanUSec=infinity
MemoryDenyWriteExecute=no
RestrictRealtime=no
RestrictSUIDSGID=no
RestrictNamespaces=no
MountAPIVFS=no
KeyringMode=private
ProtectProc=default
ProcSubset=all
ProtectHostname=no
KillMode=control-group
KillSignal=15
RestartKillSignal=15
FinalKillSignal=9
SendSIGKILL=yes
SendSIGHUP=no
WatchdogSignal=6
Id=ncpa_listener.service
Names=ncpa_listener.service
Requires=-.mount system.slice sysinit.target
WantedBy=multi-user.target
Conflicts=shutdown.target
Before=multi-user.target shutdown.target
After=-.mount sysinit.target systemd-journald.socket basic.target system.slice
RequiresMountsFor=/etc/init.d
Description=Nagios ncpa Agent Service
LoadState=loaded
ActiveState=failed
FreezerState=running
SubState=failed
FragmentPath=/etc/systemd/system/ncpa_listener.service
UnitFileState=enabled
UnitFilePreset=disabled
StateChangeTimestamp=Wed 2023-10-25 22:47:51 IDT
StateChangeTimestampMonotonic=1852960145
InactiveExitTimestamp=Wed 2023-10-25 22:47:51 IDT
InactiveExitTimestampMonotonic=1852724381
ActiveEnterTimestamp=Wed 2023-10-25 22:47:51 IDT
ActiveEnterTimestampMonotonic=1852710885
ActiveExitTimestamp=Wed 2023-10-25 22:47:51 IDT
ActiveExitTimestampMonotonic=1852722713
InactiveEnterTimestamp=Wed 2023-10-25 22:47:51 IDT
InactiveEnterTimestampMonotonic=1852956450
CanStart=yes
CanStop=yes
CanReload=no
CanIsolate=no
CanFreeze=yes
StopWhenUnneeded=no
RefuseManualStart=no
RefuseManualStop=no
AllowIsolate=no
DefaultDependencies=yes
OnSuccessJobMode=fail
OnFailureJobMode=replace
IgnoreOnIsolate=no
NeedDaemonReload=no
JobTimeoutUSec=infinity
JobRunningTimeoutUSec=infinity
JobTimeoutAction=none
ConditionResult=yes
AssertResult=yes
ConditionTimestamp=Wed 2023-10-25 22:47:51 IDT
ConditionTimestampMonotonic=1852959128
AssertTimestamp=Wed 2023-10-25 22:47:51 IDT
AssertTimestampMonotonic=1852959133
Transient=no
Perpetual=no
StartLimitIntervalUSec=30s
StartLimitBurst=2
StartLimitAction=none
FailureAction=none
SuccessAction=none
InvocationID=d5256eefeca643749ee091c37148e845
CollectMode=inactive
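Most of that output is defaults, so it can help to pull out only the handful of properties that describe the failure. The sketch below shows one way to do that; `get_prop` is a hypothetical helper, and the sample values are copied from the output posted above.

```shell
#!/bin/sh
# get_prop NAME: print the value of NAME from KEY=VALUE lines on stdin.
get_prop() {
    sed -n "s/^$1=//p"
}

# A few lines taken from the `systemctl show ncpa_listener` output in this thread.
sample='Restart=on-failure
NRestarts=2
ExecMainStatus=1
FragmentPath=/etc/systemd/system/ncpa_listener.service
StartLimitBurst=2'

status=$(printf '%s\n' "$sample" | get_prop ExecMainStatus)
fragment=$(printf '%s\n' "$sample" | get_prop FragmentPath)
echo "exit status: $status, unit file: $fragment"

# On a live host the same helper can read real data, or systemctl can filter directly:
#   systemctl show ncpa_listener | get_prop ExecMainStatus
#   systemctl show ncpa_listener -p ExecMainStatus -p FragmentPath
```

Here the values say that the start command exited with status 1 and that the unit was loaded from the copy under /etc/systemd/system.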

MrPippin66 commented 9 months ago

The default Restart setting for systemd should be "no", so the Restart=on-failure shown here seems odd.

There might be other issues, but I'd start by running "systemctl edit ncpa_listener" and adding this between the specified comments for an override configuration.

[Service]
Restart=no

Then retry the start operation.
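For reference, systemctl edit stores such an override as a drop-in file named override.conf in a directory called <unit>.service.d under /etc/systemd/system. The sketch below writes an equivalent file into a temporary directory so it can be inspected without touching the system. It assumes the unit in question is ncpa_listener and that the intent is to disable automatic restarts, for which the value systemd documents is `no`.

```shell
#!/bin/sh
# Demonstration only: a temp dir stands in for /etc/systemd/system.
unit_dir=$(mktemp -d)
dropin_dir="$unit_dir/ncpa_listener.service.d"
mkdir -p "$dropin_dir"

# The drop-in overrides only the settings it names; everything else
# still comes from the main unit file.
cat > "$dropin_dir/override.conf" <<'EOF'
[Service]
Restart=no
EOF

cat "$dropin_dir/override.conf"

# On a real host, a hand-written drop-in only takes effect after:
#   systemctl daemon-reload
# and the result can be confirmed with:
#   systemctl show ncpa_listener -p Restart
```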

shaiuzi commented 9 months ago

I just did what you wrote, creating the systemd override file with the Restart parameter. Still, when I try to start it with systemctl, I get the same behavior and it fails.

shaiuzi commented 9 months ago

Sorry, let me correct that. The service is up and the process is running, although systemctl still shows it as not active.

MrPippin66 commented 9 months ago

Assuming this is still running via '/etc/rc.d/init.d/ncpa_listener start' just run '/etc/rc.d/init.d/ncpa_listener stop'

That's the legacy SYSV INIT management interface.

Then you can try the "systemd" interface to start/stop it.

shaiuzi commented 9 months ago

Many thanks, MrPippin66, for pointing me to the problem. At least now I know ncpa works properly if I start it manually with SysV init.

MrPippin66 commented 9 months ago

Okay, if you consider this resolved, please close the issue.

shaiuzi commented 9 months ago

OK, again, thanks for your great assistance.

MrPippin66 commented 9 months ago

You're welcome!