Open majduk opened 4 years ago
looks like it's not positive... why does contrail-status take more than 10 seconds? I have seen it take 10 seconds when DNS doesn't work correctly.
All units report status OK. Also, there are no alarms in the Contrail WebUI. Nothing indicates an issue whatsoever:
root@UPSR-BRBFD-01-0008:~# time contrail-status
Pod Service Original Name Original Version State Id Status
redis contrail-external-redis 1912-32 running 019aeb6e6daa Up 4 weeks
analytics api contrail-analytics-api 1912-32 running 4d0a54533805 Up 10 hours
analytics collector contrail-analytics-collector 1912-32 running 3241d7610545 Up 2 weeks
analytics nodemgr contrail-nodemgr 1912-32 running c886e6fd6372 Up 2 weeks
analytics-alarm alarm-gen contrail-analytics-alarm-gen 1912-32 running 52720331a81e Up 10 hours
analytics-alarm kafka contrail-external-kafka 1912-32 running 7924a3690608 Up 2 weeks
analytics-alarm nodemgr contrail-nodemgr 1912-32 running 903380a284c9 Up 2 weeks
analytics-snmp nodemgr contrail-nodemgr 1912-32 running 6bd26cbb1519 Up 2 weeks
analytics-snmp snmp-collector contrail-analytics-snmp-collector 1912-32 running 365f1eaa069a Up 10 hours
analytics-snmp topology contrail-analytics-snmp-topology 1912-32 running 2f44a7e7a5f5 Up 10 hours
config api contrail-controller-config-api 1912-32 running 9939958cd7df Up 2 weeks
config device-manager contrail-controller-config-devicemgr 1912-32 running 425b87ebe9f5 Up 10 hours
config nodemgr contrail-nodemgr 1912-32 running e98ec19c8ef4 Up 2 weeks
config schema contrail-controller-config-schema 1912-32 running 439f3fae4ed2 Up 10 hours
config svc-monitor contrail-controller-config-svcmonitor 1912-32 running 1cc003d6f15f Up 25 hours
config-database cassandra contrail-external-cassandra 1912-32 running 2b3178c6790b Up 2 weeks
config-database nodemgr contrail-nodemgr 1912-32 running e6841e7f1583 Up 2 weeks
config-database rabbitmq contrail-external-rabbitmq 1912-32 running af5f9f6eca17 Up 2 weeks
config-database zookeeper contrail-external-zookeeper 1912-32 running a796fc653dcd Up 2 weeks
control control contrail-controller-control-control 1912-32 running 4486f22ab836 Up 2 weeks
control dns contrail-controller-control-dns 1912-32 running f8a7fd180f71 Up 2 weeks
control named contrail-controller-control-named 1912-32 running a3e54ff46a44 Up 2 weeks
control nodemgr contrail-nodemgr 1912-32 running 89afad426951 Up 2 weeks
database cassandra contrail-external-cassandra 1912-32 running 513c47659981 Up 4 weeks
database nodemgr contrail-nodemgr 1912-32 running 39ef970fdf39 Up 4 weeks
database query-engine contrail-analytics-query-engine 1912-32 running ba1a23b5152b Up 4 weeks
webui job contrail-controller-webui-job 1912-32 running e90459c6c38c Up 2 weeks
webui web contrail-controller-webui-web 1912-32 running c1e80570872a Up 2 weeks
== Contrail control ==
control: active
nodemgr: active
named: active
dns: active
== Contrail analytics-alarm ==
nodemgr: active
kafka: active
alarm-gen: active
== Contrail database ==
nodemgr: active
query-engine: active
cassandra: active
== Contrail analytics ==
nodemgr: active
api: active
collector: active
== Contrail config-database ==
nodemgr: active
zookeeper: active <### --- pauses here for 5 sec --- ###>
rabbitmq: active
cassandra: active
== Contrail webui ==
web: active
job: active
== Contrail analytics-snmp ==
snmp-collector: active
nodemgr: active
topology: active
== Contrail config ==
svc-monitor: active
nodemgr: active
device-manager: active
api: active
schema: backup
real 0m10.792s
user 0m0.064s
sys 0m0.046s
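To measure where the pause happens instead of eyeballing it, one option is to timestamp each line of output as it arrives. The snippet below is only a rough sketch: `time_output_lines` and `slowest_lines` are hypothetical helper names, not part of Contrail. It also assumes the command flushes its output line by line. If the tool buffers its output when writing to a pipe, all lines arrive at once at the end and the gaps are meaningless.

```python
import subprocess
import time


def time_output_lines(cmd):
    """Run cmd and return a list of (seconds_since_previous_line, line).

    Hypothetical helper: it only measures when each stdout line reaches
    this process, so it is accurate only if the child flushes per line.
    """
    gaps = []
    last = time.monotonic()
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True, bufsize=1) as proc:
        for line in proc.stdout:
            now = time.monotonic()
            gaps.append((now - last, line.rstrip("\n")))
            last = now
    return gaps


def slowest_lines(gaps, top=3):
    """Return the `top` lines that were preceded by the longest wait."""
    return sorted(gaps, key=lambda item: item[0], reverse=True)[:top]
```

Calling `slowest_lines(time_output_lines(["contrail-status"]))` on an affected node should then list the lines that appear right after the longest waits, which helps confirm whether the delay really sits in the config-database section.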
OK, got it. But honestly I don't know why zookeeper is so slow in your env, and I can't say right now how to increase the nrpe check timeout (and I don't think that it's a good way).
I've already increased the timeout.
contrail-status takes pretty much the same amount of time in all envs where we have 19.XX release.
deleted my old comments, because those were pointing to some other problem which is not related to this issue. thanks
Because contrail-status runs for over 10 s, the contrail nagios check times out:
This leads to a false positive alarm in nagios.
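For illustration, a check wrapper could enforce its own timeout and report a slow run as UNKNOWN rather than as a failure. This is a minimal sketch and not the actual check used by the charm: `run_check` and the 30 second default are assumptions. The exit codes follow the standard Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import subprocess

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def run_check(cmd, timeout_s=30.0):
    """Run cmd and map the result to a (nagios_code, message) pair.

    Hypothetical wrapper: a timeout is reported as UNKNOWN so that a slow
    but otherwise healthy status command does not look like an outage.
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return UNKNOWN, f"UNKNOWN: {cmd[0]} did not finish within {timeout_s:g}s"
    except FileNotFoundError:
        return UNKNOWN, f"UNKNOWN: {cmd[0]} not found"
    if result.returncode != 0:
        return CRITICAL, f"CRITICAL: {cmd[0]} exited with {result.returncode}"
    return OK, f"OK: {cmd[0]} completed"
```

The inner timeout has to stay below the nrpe and nagios timeouts, otherwise the outer layer still kills the check first and the alarm is raised anyway.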