centreon / centreon-archived

Centreon is a network, system and application monitoring tool. Centreon is the only AIOps Platform Providing Holistic Visibility to Complex IT Workflows from Cloud to Edge.
https://www.centreon.com
GNU General Public License v2.0

19.04.0 Cannot set downtime or acknowledge #7553

Open Moratorro opened 5 years ago

Moratorro commented 5 years ago

Versions

centreon-trap-19.04.0-3.el7.centos.noarch
centreon-plugin-Operatingsystems-Linux-Snmp-20190412-141630.el7.centos.noarch
centreon-plugin-Applications-Protocol-Ldap-20190412-141630.el7.centos.noarch
centreon-plugin-Applications-Protocol-Http-20190412-141630.el7.centos.noarch
centreon-plugin-Applications-Monitoring-Centreon-Central-20190412-141630.el7.centos.noarch
centreon-broker-storage-19.04.0-2.el7.centos.x86_64
centreon-engine-19.04.0-2.el7.centos.x86_64
centreon-license-manager-19.04.0-1.el7.centos.noarch
centreon-base-config-centreon-engine-19.04.0-3.el7.centos.noarch
centreon-release-19.04-1.el7.centos.noarch
centreon-clib-19.04.0-1.el7.centos.x86_64
centreon-connector-ssh-19.04.0-2.el7.centos.x86_64
centreon-connector-19.04.0-2.el7.centos.x86_64
centreon-license-manager-common-19.04.0-1.el7.centos.noarch
centreon-perl-libs-19.04.0-3.el7.centos.noarch
centreon-plugin-Applications-Monitoring-Centreon-Database-20190412-141630.el7.centos.noarch
centreon-database-19.04.0-3.el7.centos.noarch
centreon-plugin-Operatingsystems-Windows-Snmp-20190412-141630.el7.centos.noarch
centreon-plugin-Applications-Monitoring-Centreon-Poller-20190412-141630.el7.centos.noarch
centreon-broker-core-19.04.0-2.el7.centos.x86_64
centreon-broker-cbd-19.04.0-2.el7.centos.x86_64
centreon-engine-extcommands-19.04.0-2.el7.centos.x86_64
centreon-broker-cbmod-19.04.0-2.el7.centos.x86_64
centreon-web-19.04.0-3.el7.centos.noarch
centreon-pp-manager-19.04.0-3.el7.centos.noarch
centreon-auto-discovery-server-19.04.0-4.el7.centos.x86_64
centreon-19.04.0-3.el7.centos.noarch
centreon-plugin-Hardware-Ups-Standard-Rfc1628-Snmp-20190412-141630.el7.centos.noarch
centreon-plugin-Hardware-Printers-Generic-Snmp-20190412-141630.el7.centos.noarch
centreon-plugin-Applications-Protocol-Dns-20190412-141630.el7.centos.noarch
centreon-connector-perl-19.04.0-2.el7.centos.x86_64
centreon-broker-19.04.0-2.el7.centos.x86_64
centreon-plugin-Applications-Databases-Mysql-20190412-141630.el7.centos.noarch
centreon-common-19.04.0-3.el7.centos.noarch
centreon-plugin-Applications-Monitoring-Centreon-Map4-Jmx-20190412-141630.el7.centos.noarch
centreon-engine-daemon-19.04.0-2.el7.centos.x86_64
centreon-poller-centreon-engine-19.04.0-3.el7.centos.noarch
centreon-plugin-Network-Cisco-Standard-Snmp-20190412-141630.el7.centos.noarch

Operating System: CentOS 7

Additional environment details (AWS, VirtualBox, physical, etc.): VM instance on Google Cloud

Description

New Centreon installation from the RPM repositories. I cannot set downtime on services or acknowledge them, and no errors are shown.

Steps to Reproduce

  1. Log in to Centreon.
  2. Go to Monitoring -> Services.
  3. Select Status Details -> Services -> Service Status: All -> select a service -> More actions: Acknowledge or Set downtime.

Describe the received result

Nothing happens: neither the acknowledgement nor the downtime is set. The same applies to hosts.

Describe the expected result

Be able to acknowledge alarms and set downtime on hosts and services.

Additional relevant information (e.g. frequency, ...)

I've checked other issues with the same problem; centcore is enabled and running. There is nothing relevant in the logs except this:

[20-May-2019 21:08:18 America/Santiago] PHP Notice: Undefined index: warning in /usr/share/centreon/www/api/class/centreon_topcounter.class.php on line 449

regards

tanguyvda commented 5 years ago

Hello, can you please check that the centcore service is running with systemctl status centcore (or ps aux | grep -i centcore)?

And just in case, restart it.

You can find its log file in /var/log/centreon/centcore.log for more information. You can also enable debug mode in the Administration -> Parameters -> Debug menu (tick the centcore engine debug checkbox and restart centcore when done).

Moratorro commented 5 years ago

Hello and thanks for your reply!

Here is the output

sudo systemctl status centcore
● centcore.service - Centreon Core
   Loaded: loaded (/usr/lib/systemd/system/centcore.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2019-05-21 05:14:20 UTC; 15h ago
 Main PID: 27582 (centcore)
   CGroup: /system.slice/centcore.service
           └─27582 /usr/bin/perl /usr/share/centreon/bin/centcore --logfile=/var/log/centreon/centcore...
May 21 05:14:20 centreonservicerocket systemd[1]: Stopped Centreon Core.
May 21 05:14:20 centreonservicerocket systemd[1]: Started Centreon Core.
ps aux | grep -i centcore
aironbr+ 27199  0.0  0.0 112708   988 pts/0    R+   20:21   0:00 grep --color=auto -i centcore
centreon 27582  0.0  0.4 201572 15660 ?        Ss   05:14   0:08 /usr/bin/perl /usr/share/centreon/bin/centcore --logfile=/var/log/centreon/centcore.log --severity=error --config=/etc/centreon/conf.pm

This is the output after enabling debug

2019-05-21 20:24:30 - INFO - 27339 Receiving order to stop...
2019-05-21 20:24:30 - INFO - Centcore stop...
2019-05-21 20:24:30 - INFO - Enable Debug in Centcore
2019-05-21 20:24:30 - INFO - Instance type: central
2019-05-21 20:24:57 - INFO - External command on Central Server: (1) : "[1558470297] ACKNOWLEDGE_SVC_PROBLEM;AOE_Confluence;AOE Confluence HTTP;2;1;1;admin;Acknowledged by admin
[1558470297] SCHEDULE_FORCED_SVC_CHECK;AOE_Confluence;AOE Confluence HTTP;1558470297
"
2019-05-21 20:25:20 - INFO - External command on Central Server: (1) : "[1558470320] SCHEDULE_SVC_DOWNTIME;AOE_Confluence;AOE Confluence HTTP;1558383900;1558391100;1;0;3600;admin;Downtime set by admin
"
2019-05-21 20:25:57 - INFO - External command on Central Server: (1) : "[1558470357] SCHEDULE_SVC_DOWNTIME;AOE_Confluence;AOE Confluence HTTP;1558383900;1558391100;1;0;3600;admin;downtime

The issue persists. What else can I check? Or is this indeed a bug? I will try on a VM on my desktop to see whether the same issue occurs (one install from the ISO and another from the repository). If you have any tips on what else to check, let me know.

Regards
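
A side note on the centcore log above: in the SCHEDULE_SVC_DOWNTIME command, the downtime window (fields 1558383900 and 1558391100) ends before the bracketed timestamp at which the command was issued (1558470320). The following is a minimal Python sketch, not part of Centreon, that parses such a line and classifies the window. The function names are hypothetical, and the field positions are an assumption based on the Nagios-style external command layout visible in the log.

```python
import re

# Index of the start-time field after the command name (assumed from the
# Nagios-style external command layout seen in the log lines).
START_FIELD = {
    "SCHEDULE_SVC_DOWNTIME": 2,       # host;service;start;end;...
    "SCHEDULE_HOST_DOWNTIME": 1,      # host;start;end;...
    "SCHEDULE_HOST_SVC_DOWNTIME": 1,  # host;start;end;...
}

LINE_RE = re.compile(r"^\[(\d+)\]\s+([A-Z_]+);(.*)$")


def downtime_window(line):
    """Return (issued, start, end) in epoch seconds for a downtime command."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError("not an external command line: %r" % line)
    issued, command, rest = int(match.group(1)), match.group(2), match.group(3)
    if command not in START_FIELD:
        raise ValueError("not a downtime command: %s" % command)
    fields = rest.split(";")
    index = START_FIELD[command]
    return issued, int(fields[index]), int(fields[index + 1])


def window_state(line):
    """Classify the window relative to the time the command was issued."""
    issued, start, end = downtime_window(line)
    if end <= issued:
        return "past"
    if start > issued:
        return "future"
    return "active"


if __name__ == "__main__":
    sample = ("[1558470320] SCHEDULE_SVC_DOWNTIME;AOE_Confluence;"
              "AOE Confluence HTTP;1558383900;1558391100;1;0;3600;admin;"
              "Downtime set by admin")
    print(window_state(sample))  # prints "past"
```

For the line logged at 20:25:20, the window ended roughly 22 hours before the command was issued. That would be consistent with a clock or time zone mismatch between the browser, PHP and the server, although it would not by itself explain the failing acknowledgements.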

tanguyvda commented 5 years ago

There's at least one last thing that I'd like to see: the output of the following command:

ls -lah /var/lib/centreon-engine/rw/
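
The same check can be scripted. This is a small Python sketch, under the assumption that the command file lives at the path used in this thread; the helper name describe_pipe is hypothetical. It reports whether the entry is a named pipe (FIFO) and shows its permission string in the same form as ls.

```python
import os
import stat


def describe_pipe(path):
    """Return a dict saying whether path is a FIFO, plus its mode and owner ids."""
    info = os.stat(path)
    return {
        "is_fifo": stat.S_ISFIFO(info.st_mode),
        "mode": stat.filemode(info.st_mode),  # same form as ls, e.g. 'prw-rw----'
        "uid": info.st_uid,
        "gid": info.st_gid,
    }


if __name__ == "__main__":
    # Path taken from the thread; adjust it if your installation differs.
    path = "/var/lib/centreon-engine/rw/centengine.cmd"
    try:
        print(describe_pipe(path))
    except FileNotFoundError:
        print("missing: " + path)
```

A regular file at that path (is_fifo False) would suggest something other than the engine created it.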

Moratorro commented 5 years ago

Hello:

Here is the output

drwxrwxr-x 2 centreon-engine centreon-engine 28 May 21 00:18 .
drwxr-xr-x 4 centreon-engine centreon-engine 112 May 21 05:14 ..
prw-rw---- 1 centreon-engine centreon-engine 0 May 21 20:25 centengine.cmd

Regards

tanguyvda commented 5 years ago

Well, to be sure, can you check that the engine configuration for your central server is set up as follows? (Since it is a fresh install, it should be.) [image]

If this is not the case, change the value and export your configuration with a restart (or export the configuration and run service centengine restart on your server).

Moratorro commented 5 years ago

Hello:

The configuration is the same as yours.

[image: image.png]

Regards

gdelafond commented 5 years ago

@Moratorro is the cbd service running? Check with systemctl status cbd.service

Moratorro commented 5 years ago

Hi all: Yes, the cbd service is running.

cbd.service loaded active running Centreon Broker watchdog
centcore.service loaded active running Centreon Core
centengine.service loaded active running Centreon Engine
httpd24-httpd.service loaded active running The Apache HTTP Server
mariadb.service loaded active running MariaDB 10.1.38 database server
rh-php71-php-fpm.service loaded active running The PHP FastCGI Process Manager

I think those are all the processes belonging to Centreon.

Let me know if you need anything else

regards

adr-mo commented 5 years ago

Hi @Moratorro

Could you please check the logs while trying to set the downtime or the acknowledgement?

  • tailf /var/log/centreon/centcore.log
  • tailf /var/log/centreon-engine/centengine.log

Thanks,

Moratorro commented 5 years ago

Hi again. Same as before, I don't get anything but this output:

centcore.log
2019-05-28 22:47:12 - INFO - External command on Central Server: (1) : "[1559083632] SCHEDULE_SVC_DOWNTIME;Qantas_Jira;Qantas Jira HTTP;1558997220;1559004420;1;0;3600;admin;Downtime set by admin
"
2019-05-28 22:47:21 - INFO - External command on Central Server: (1) : "[1559083640] ACKNOWLEDGE_SVC_PROBLEM;Qantas_Jira;Qantas Jira HTTP;2;1;1;admin;Acknowledged by admin
[1559083640] SCHEDULE_FORCED_SVC_CHECK;Qantas_Jira;Qantas Jira HTTP;1559083640
"
2019-05-28 22:47:26 - INFO - External command on Central Server: (1) : "[1559083646] SCHEDULE_FORCED_SVC_CHECK;Qantas_Jira;Qantas Jira HTTP;1559083646
"

centengine.log
[1559083633] [18267] EXTERNAL COMMAND: SCHEDULE_SVC_DOWNTIME;Qantas_Jira;Qantas Jira HTTP;1558997220;1559004420;1;0;3600;admin;Downtime set by admin
[1559083642] [18267] EXTERNAL COMMAND: ACKNOWLEDGE_SVC_PROBLEM;Qantas_Jira;Qantas Jira HTTP;2;1;1;admin;Acknowledged by admin
[1559083642] [18267] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;Qantas_Jira;Qantas Jira HTTP;1559083640
[1559083647] [18267] EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;Qantas_Jira;Qantas Jira HTTP;1559083646

The commands appear to run, but the web page doesn't show the downtime.

When I set the downtime from the downtime page:

centcore
2019-05-28 22:49:03 - INFO - External command on Central Server: (1) : "[1559083743] SCHEDULE_HOST_DOWNTIME;Qantas_Jira;1558997280;1559004480;1;0;3600;admin;rg
[1559083743] SCHEDULE_HOST_SVC_DOWNTIME;Qantas_Jira;1558997280;1559004480;1;0;3600;admin;rg

centengine
[1559083744] [18267] EXTERNAL COMMAND: SCHEDULE_HOST_DOWNTIME;Qantas_Jira;1558997280;1559004480;1;0;3600;admin;rg
[1559083744] [18267] EXTERNAL COMMAND: SCHEDULE_HOST_SVC_DOWNTIME;Qantas_Jira;1558997280;1559004480;1;0;3600;admin;rg

regards

Moratorro commented 5 years ago

Hi again:

I've installed again on a new VM at Google Cloud from RPM using this documentation: https://documentation.centreon.com/docs/centreon/en/latest/installation/from_packages.html

I installed as the root user and as a normal user with sudo, and the same thing happens: I'm not able to set downtime or acknowledge.

What else can I check?

regards

lpinsivy commented 5 years ago

Hi,

Are you sure that all your Centreon servers are on the same timezone (same GMT)?

Moratorro commented 5 years ago

Hi, thanks for your comment! I've checked the time zone and indeed it wasn't set correctly, either on the OS or in the .ini file. So I changed the CentOS 7 time zone to match mine (America/Santiago) and set it in the php.ini file as well. The installation steps say to add a line to /etc/opt/rh/rh-php71/php.d/php-timezone.ini, but that file doesn't exist, so I changed php.ini instead. Now I will check whether it works and will let you know.

thanks!
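
The comparison described above (OS time zone versus PHP date.timezone) can be sketched in Python. This is only an illustration: the helper names are hypothetical, it assumes /etc/localtime is a symlink into a zoneinfo directory, and it only compares the configured names. It does not verify that PHP-FPM actually loaded the file.

```python
import os


def zone_from_localtime_target(target):
    """Extract 'Area/City' from a target like '../usr/share/zoneinfo/Europe/Paris'."""
    _, marker, zone = target.partition("zoneinfo/")
    if not marker or not zone:
        return None
    return zone


def zone_from_php_ini(text):
    """Return the value of the last uncommented date.timezone line, or None."""
    zone = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # blank line or ini comment
        key, sep, value = line.partition("=")
        if sep and key.strip() == "date.timezone":
            zone = value.strip().strip("\"'") or None
    return zone


if __name__ == "__main__":
    # Path quoted in the thread; it may not exist on a fresh install.
    ini_path = "/etc/opt/rh/rh-php71/php.d/php-timezone.ini"
    os_zone = None
    if os.path.islink("/etc/localtime"):
        os_zone = zone_from_localtime_target(os.readlink("/etc/localtime"))
    php_zone = None
    if os.path.isfile(ini_path):
        with open(ini_path, encoding="utf-8") as handle:
            php_zone = zone_from_php_ini(handle.read())
    print("OS zone: ", os_zone)
    print("PHP zone:", php_zone)
    print("match:   ", os_zone is not None and os_zone == php_zone)
```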

adr-mo commented 5 years ago

Hi @Moratorro

The file /etc/opt/rh/rh-php71/php.d/php-timezone.ini does not exist on the install. You have to create it.

https://documentation.centreon.com/docs/centreon/en/latest/installation/from_packages.html#setting-the-php-time-zone

The '>' redirection in the command creates the file.

Regards

spsefw commented 5 years ago

I'm experiencing the same issue on our updated installation. /var/log/centreon-engine/centengine.debug shows nothing regarding downtimes.

edit: Timezone is set to Europe/Amsterdam

OK, I did some testing and my problem might not be related: I can schedule downtime for single hosts and services, but not with the hostgroup or entire-poller options.

When I try a hostgroup, the following shows up in centcore.log but the downtime is not set. When I try the poller radio button, nothing is logged at all.

[1561474777] SCHEDULE_HOST_DOWNTIME;<HOSTNAME>;1560423600;1560441600;1;0;900;<USERNAME>;Centreon poller upgrade
[1561474777] SCHEDULE_HOST_SVC_DOWNTIME;<HOSTNAME>;1560423600;1560441600;1;0;900;<USERNAME>;Centreon poller upgrade
[1561474777] SCHEDULE_HOST_DOWNTIME;<HOSTNAME>;1560423600;1560441600;1;0;900;<USERNAME>;Centreon poller upgrade
[1561474777] SCHEDULE_HOST_SVC_DOWNTIME;<HOSTNAME>;1560423600;1560441600;1;0;900;<USERNAME>;Centreon poller upgrade
[1561474777] SCHEDULE_HOST_DOWNTIME;<HOSTNAME>;1560423600;1560441600;1;0;900;<USERNAME>;Centreon poller upgrade
[1561474777] SCHEDULE_HOST_SVC_DOWNTIME;<HOSTNAME>;1560423600;1560441600;1;0;900;<USERNAME>;Centreon poller upgrade
[1561474777] SCHEDULE_HOST_DOWNTIME;<HOSTNAME>;1560423600;1560441600;1;0;900;<USERNAME>;Centreon poller upgrade
[1561474777] SCHEDULE_HOST_SVC_DOWNTIME;<HOSTNAME>;1560423600;1560441600;1;0;900;<USERNAME>;Centreon poller upgrade
[1561474777] SCHEDULE_HOST_DOWNTIME;<HOSTNAME>;1560423600;1560441600;1;0;900;<USERNAME>;Centreon poller upgrade
[1561474777] SCHEDULE_HOST_SVC_DOWNTIME;<HOSTNAME>;1560423600;1560441600;1;0;900;<USERNAME>;Centreon poller upgrade
[1561474777] SCHEDULE_HOST_DOWNTIME;<HOSTNAME>;1560423600;1560441600;1;0;900;<USERNAME>;Centreon poller upgrade
[1561474777] SCHEDULE_HOST_SVC_DOWNTIME;<HOSTNAME>;1560423600;1560441600;1;0;900;<USERNAME>;Centreon poller upgrade
[1561474777] SCHEDULE_HOST_DOWNTIME;<HOSTNAME>;1560423600;1560441600;1;0;900;<USERNAME>;Centreon poller upgrade
[1561474777] SCHEDULE_HOST_SVC_DOWNTIME;<HOSTNAME>;1560423600;1560441600;1;0;900;<USERNAME>;Centreon poller upgrade
[1561474777] SCHEDULE_HOST_DOWNTIME;<HOSTNAME>;1560423600;1560441600;1;0;900;<USERNAME>;Centreon poller upgrade
[1561474777] SCHEDULE_HOST_SVC_DOWNTIME;<HOSTNAME>;1560423600;1560441600;1;0;900;<USERNAME>;Centreon poller upgrade
[1561474777] SCHEDULE_HOST_DOWNTIME;<HOSTNAME>;1560423600;1560441600;1;0;900;<USERNAME>;Centreon poller upgrade
[1561474777] SCHEDULE_HOST_SVC_DOWNTIME;<HOSTNAME>;1560423600;1560441600;1;0;900;<USERNAME>;Centreon poller upgrade
"
2019-06-25 16:59:37 - ERROR - Ip address not defined for poller 0

mathieuchateau commented 5 years ago

I have had the same issue since upgrading to 19.04. It worked for a year on version 2.8.

The time zone is set correctly and is the same across the central server, the poller, and PHP:

[root@XXXXX etc]# ls -l localtime
lrwxrwxrwx. 1 root root 34  3 mai    2018 localtime -> ../usr/share/zoneinfo/Europe/Paris
[root@XXXXX etc]# cat /etc/opt/rh/rh-php71/php.d/php-timezone.ini
date.timezone = Europe/Paris

cbd is running

[root@XXXXX centreon]# ls -lah /var/lib/centreon-engine/rw/
total 4,0K
drwxrwxr-x. 2 centreon-engine centreon-engine   55  6 mai   13:41 .
drwxr-xr-x. 4 centreon-engine centreon-engine  112 24 sept. 11:04 ..
prw-rw----  1 centreon-engine centreon-engine    0 22 sept. 05:50 centengine.cmd
-rw-r-----  1 root            root            4,0K 22 août  15:17 .centengine.cmd.swp

The downtimes appeared after restarting cbd and centcore on the central server, so they had been stored and were stuck in the pipe.

PingTheTux commented 4 years ago

Hello all,

Same problem here; I have this bug with only one of my 5 pollers. No external command works, and I continuously get this error message (from /var/log/centreon/centcore.log):

image

The bug happened without any modification or update.

I tried giving full rights on the target directory /var/lib/centreon-engine/ but nothing changed. As the centreon user I can edit and create files in this directory without any problem. The XML configuration files are OK; I checked them several times.

Monitoring is working well, but I can't execute any external command (immediate check, downtime, etc.); they all result in the error message mentioned above.

I even updated the poller from Centreon Engine 19.04.1 to 19.04.2 (simply using yum update) but the problem is the same.

I hope I won't have to rebuild the poller.

Sims24 commented 4 years ago

@PingTheTux could you restart your poller and paste the result of the command below please:

[root@me ~]# grep '/usr/lib64/centreon-engine/externalcmd.so' /var/log/centreon-engine/centengine.log

It should return something like:

[1580994602] [28352] Event broker module '/usr/lib64/centreon-engine/externalcmd.so' deinitialized successfully
[1580994604] [27673] Event broker module '/usr/lib64/centreon-engine/externalcmd.so' initialized successfully

Also make sure that you have a similar line in the centreon-engine configuration file:

[root@me ~]# grep externalcmd.so /etc/centreon-engine/centengine.cfg
broker_module=/usr/lib64/centreon-engine/externalcmd.so

PingTheTux commented 4 years ago

Thanks for your interest in my problem.

root@poller2:~$ grep '/usr/lib64/centreon-engine/externalcmd.so' /var/log/centreon-engine/centengine.log

[1580988157] [880] Event broker module '/usr/lib64/centreon-engine/externalcmd.so' initialized successfully
[1580989514] [880] Event broker module '/usr/lib64/centreon-engine/externalcmd.so' deinitialized successfully
[1580989515] [8351] Event broker module '/usr/lib64/centreon-engine/externalcmd.so' initialized successfully
[1580989937] [8351] Event broker module '/usr/lib64/centreon-engine/externalcmd.so' deinitialized successfully
[1580989949] [886] Event broker module '/usr/lib64/centreon-engine/externalcmd.so' initialized successfully
[1580991703] [886] Event broker module '/usr/lib64/centreon-engine/externalcmd.so' deinitialized successfully
[1580991703] [10467] Event broker module '/usr/lib64/centreon-engine/externalcmd.so' initialized successfully
[1580991742] [10467] Event broker module '/usr/lib64/centreon-engine/externalcmd.so' deinitialized successfully
[1580991742] [10491] Event broker module '/usr/lib64/centreon-engine/externalcmd.so' initialized successfully
[1580991802] [10491] Event broker module '/usr/lib64/centreon-engine/externalcmd.so' deinitialized successfully
[1580991809] [10562] Event broker module '/usr/lib64/centreon-engine/externalcmd.so' initialized successfully
[1580992364] [10562] Event broker module '/usr/lib64/centreon-engine/externalcmd.so' deinitialized successfully
[1580992364] [13183] Event broker module '/usr/lib64/centreon-engine/externalcmd.so' initialized successfully
[1580994993] [13183] Event broker module '/usr/lib64/centreon-engine/externalcmd.so' deinitialized successfully
[1580994993] [25875] Event broker module '/usr/lib64/centreon-engine/externalcmd.so' initialized successfully
[1580995053] [25875] Event broker module '/usr/lib64/centreon-engine/externalcmd.so' deinitialized successfully
[1580995053] [25901] Event broker module '/usr/lib64/centreon-engine/externalcmd.so' initialized successfully

And here is the output for the centreon-engine configuration:

root@poller2:~$ grep externalcmd.so /etc/centreon-engine/centengine.cfg
broker_module=/usr/lib64/centreon-engine/externalcmd.so

I just deleted the poller and imported it again but the problem is still the same
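
The two grep checks requested above can also be expressed as a small Python sketch. The function names are hypothetical, and the matching is an assumption based only on the log and configuration lines quoted in this thread: one function looks for an uncommented broker_module line, the other reports whether the most recent log event for the module is an initialization.

```python
import re

MODULE = "/usr/lib64/centreon-engine/externalcmd.so"  # path quoted in the thread

EVENT_RE = re.compile(
    r"Event broker module '(?P<module>[^']+)' "
    r"(?P<state>initialized|deinitialized) successfully"
)


def module_configured(cfg_text, module=MODULE):
    """True if an uncommented broker_module line references the module."""
    for raw in cfg_text.splitlines():
        line = raw.strip()
        if line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        # The value may carry arguments after the path, so compare the first token.
        if sep and key.strip() == "broker_module" and value.split()[:1] == [module]:
            return True
    return False


def last_module_state(log_text, module=MODULE):
    """Return 'initialized', 'deinitialized', or None for the last log event."""
    state = None
    for match in EVENT_RE.finditer(log_text):
        if match.group("module") == module:
            state = match.group("state")
    return state


if __name__ == "__main__":
    with open("/etc/centreon-engine/centengine.cfg", encoding="utf-8") as handle:
        print("configured:", module_configured(handle.read()))
    with open("/var/log/centreon-engine/centengine.log",
              encoding="utf-8", errors="replace") as handle:
        print("last state:", last_module_state(handle.read()))
```

With the log pasted above, the last event is an initialization, which matches the expected healthy state.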

Sims24 commented 4 years ago

Can you confirm that the /var/lib/centreon-engine/rw/centengine.cmd file exists with the following permissions:

[root@cps-demo-central ~]# ls -lra /var/lib/centreon-engine/rw/
total 8
prw-rw----  1 centreon-engine centreon-engine    0  6 févr. 15:57 centengine.cmd
drwxr-xr-x. 5 centreon-engine centreon-engine 4096  6 févr. 15:41 ..
drwxrwxr-x. 2 centreon-engine centreon-engine 4096 13 janv. 14:24 .

PingTheTux commented 4 years ago

Yes

root@poller2:~$ ls -lra /var/lib/centreon-engine/rw/
total 0
prw-rw---- 1 centreon-engine centreon-engine 0 6 févr. 13:32 centengine.cmd
drwxrwxr-x+ 5 centreon-engine centreon-engine 122 6 févr. 14:17 ..
drwxrwxr-x 2 centreon-engine centreon-engine 28 6 févr. 13:32 .

I've even deleted (moved) this file "centengine.cmd" as it's regenerated from the central server. It's created again without any problem when I generate/export the configuration from the Centreon GUI.

root@poller2:/var/log$ ls -la /var/lib/centreon-engine/rw/
total 0
drwxrwxr-x 2 centreon-engine centreon-engine 28 6 févr. 16:14 .
drwxrwxr-x+ 5 centreon-engine centreon-engine 122 6 févr. 16:14 ..
prw-rw---- 1 centreon-engine centreon-engine 0 6 févr. 16:14 centengine.cmd

Sims24 commented 4 years ago

It's the engine process that creates it at startup thanks to the externalcmd.so library ... But to be honest it sounds like an obvious thing we don't catch -_-

I'll be back to you as soon as a new idea comes up :/
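
Since the engine creates and reads the command pipe, one thing that can be checked directly on the poller is whether any process currently has the pipe open for reading. The Python sketch below is a hypothetical diagnostic, not something Centreon ships. It relies on standard FIFO behaviour on Linux: opening a FIFO write-only with O_NONBLOCK fails with ENXIO when no reader is attached, so the check neither blocks nor writes anything.

```python
import errno
import os
import stat


def pipe_has_reader(path):
    """Return True if some process has the FIFO at path open for reading."""
    if not stat.S_ISFIFO(os.stat(path).st_mode):
        raise ValueError("%s is not a FIFO" % path)
    try:
        # With O_NONBLOCK, a write-only open fails with ENXIO if there is no reader.
        fd = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as exc:
        if exc.errno == errno.ENXIO:
            return False
        raise
    os.close(fd)  # nothing was written; just release the descriptor
    return True


if __name__ == "__main__":
    # Path from the thread; run this on the host where the engine runs,
    # as a user allowed to write to the pipe.
    print(pipe_has_reader("/var/lib/centreon-engine/rw/centengine.cmd"))
```

A False result would point at the engine side (no reader on the pipe), while True suggests the problem lies in how the command reaches the pipe.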

PingTheTux commented 4 years ago

Thanks anyway. Feel free to ask if you need any further logs/data to investigate. I've deployed a new poller and will add it to Centreon to determine the origin of this bug (faulty poller or central).

Sims24 commented 4 years ago

Actually it fails here https://github.com/centreon/centreon/blob/f5ec31e342ad29b6e66f31ee831e57d4f575b19a/lib/perl/centreon/script/centcore.pm#L521 so it's when the centcore process tries to push the command line.

Could you enable centcore debug and paste the log please? Maybe it could help

Simon

PingTheTux commented 4 years ago

Yes, of course. I'll post the logs tomorrow morning.

PingTheTux commented 4 years ago

Hi again. As planned, I've enabled centcore debug mode; here are some logs of two actions I tried. First: setting a downtime on FILESERVER1. Second: executing an immediate check.

1ST PART
2020-02-07 09:24:25 - INFO - Enable Debug in Centcore
2020-02-07 09:24:35 - INFO - External command : 192.168.0.100 (20) : "[1581063874] SCHEDULE_HOST_DOWNTIME;FILESERVER1;1581063840;1581071040;1;0;7200;admin;Temps d arrêt fixé par admin
[1581063874] SCHEDULE_HOST_SVC_DOWNTIME;FILESERVER1;1581063840;1581071040;1;0;7200;admin;Temps d arrêt fixé par admin
"
2020-02-07 09:24:40 - INFO - Receiving die: Timeout by signal ALARM

2020-02-07 09:24:40 - INFO - Dont die...
2020-02-07 09:24:40 - INFO - Receiving die: Timeout by signal ALARM

2020-02-07 09:24:40 - INFO - Dont die...
2020-02-07 09:24:40 - INFO - Timeout by signal ALARM

2020-02-07 09:24:40 - INFO - Killing child process [84336] ...
2020-02-07 09:24:40 - INFO - Killed
2020-02-07 09:24:40 - ERROR - Could not write into pipe file /var/lib/centreon-engine/rw/centengine.cmd on poller 20

2ND PART
2020-02-07 09:25:13 - INFO - External command : 192.168.0.100 (20) : "[1581063912] SCHEDULE_SVC_CHECK;FILESERVER1;CPU;1581063912
"
2020-02-07 09:25:18 - INFO - Receiving die: Timeout by signal ALARM

2020-02-07 09:25:18 - INFO - Dont die...
2020-02-07 09:25:18 - INFO - Receiving die: Timeout by signal ALARM

2020-02-07 09:25:18 - INFO - Dont die...
2020-02-07 09:25:18 - INFO - Timeout by signal ALARM

2020-02-07 09:25:18 - INFO - Killing child process [84386] ...
2020-02-07 09:25:18 - INFO - Killed
2020-02-07 09:25:18 - ERROR - Could not write into pipe file /var/lib/centreon-engine/rw/centengine.cmd on poller 20

Sims24 commented 4 years ago

Hi,

Is the SSH connection between the central and the poller fast?

Maybe you can try to set the centcore timeout option to 15 through "Administration > Parameters > CentCore" UI.

Simon
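
To put a number on "fast", the SSH round trip can be timed and compared with candidate timeout values. In the debug log earlier in the thread, the ALARM timeout fires five seconds after centcore receives the command (09:24:35 to 09:24:40). The Python sketch below is only an illustration: the host name poller2 is a placeholder, the budgets 5, 15 and 30 are example values, and the exact SSH command centcore runs is not shown in this thread, so a plain ssh invocation is assumed.

```python
import subprocess
import time


def timed_run(argv, budget_seconds):
    """Run argv and return (elapsed_seconds, within_budget, returncode).

    The command is not killed at the budget; a hard limit of ten times the
    budget only guards against hanging forever (returncode is then None).
    """
    started = time.monotonic()
    try:
        completed = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=budget_seconds * 10,
        )
        returncode = completed.returncode
    except subprocess.TimeoutExpired:
        returncode = None
    elapsed = time.monotonic() - started
    return elapsed, elapsed <= budget_seconds, returncode


if __name__ == "__main__":
    # "poller2" is a placeholder host; BatchMode avoids interactive prompts.
    argv = ["ssh", "-o", "BatchMode=yes", "poller2", "true"]
    for budget in (5, 15, 30):
        elapsed, ok, code = timed_run(argv, budget)
        print("budget=%ss elapsed=%.1fs ok=%s exit=%s" % (budget, elapsed, ok, code))
```

Run it as the user centcore uses for SSH so that the same keys and host configuration apply.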

PingTheTux commented 4 years ago

You found it! At least in my case, the problem was the timeout. I first set 15s, which did not work; then I entered 30s, which worked perfectly! I can now set downtime and execute immediate checks without any problem.

image

To answer your question about the SSH connection: the poller is in a different subnet and it takes about 5 to 6 seconds to initiate the connection. From now on I'll take this point into consideration.

PingTheTux commented 4 years ago

Thank you very much for your assistance and your availability, i hope this will help others to solve their problem. I'm not the "issue opener" but i'm still available if you need any logs to compare with similar bug/problem. Have a good day :)

Sims24 commented 4 years ago

You're very welcome ;) Thanks for using Centreon