Open HideoYamauchi opened 4 years ago
You could try setting verbose=yes
to see if you can track down what exactly causes the issue (whether it is supported on your installed version of the agent will be shown in the agent's metadata).
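On a pacemaker-managed cluster this would typically be done through pcs (a sketch, assuming a pcs-based setup; the stonith resource id below is a placeholder):

```shell
# "fence-dev" is a placeholder; substitute your actual stonith resource id.
pcs stonith update fence-dev verbose=yes
```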
Hi Oyvind,
Our environment is RHEL 8.0, and fence_scsi appears to support verbose=yes.
I set verbose=yes in the fence_scsi parameters, but the extra information does not seem to be written to pacemaker.log. Is it output somewhere else?
Best Regards, Hideo Yamauchi.
It might also be in corosync.log or /var/log/messages.
If you try to run it manually though it should be shown on your screen immediately.
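A manual run might look something like this (a sketch using the fence-agents long-option convention; the device path matches the one from the logs later in this thread):

```shell
fence_scsi --devices=/dev/sdb --action=monitor --verbose
```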
Hi Oyvind,
Thanks for your comment. I'll give it a try.
Best Regards, Hideo Yamauchi.
Hi Oyvind,
I could not get the verbose option to take effect properly, so I forcibly modified the fence_scsi code to enable verbose logging, but even then it did not seem to produce much useful information.
(snip)
```python
def scsi_check(hardreboot=False):
    if len(sys.argv) >= 3 and sys.argv[1] == "repair":
        return int(sys.argv[2])
    options = {}
    options["--sg_turs-path"] = "/usr/bin/sg_turs"
    options["--sg_persist-path"] = "/usr/bin/sg_persist"
    options["--power-timeout"] = "5"
    options["retry"] = "0"
    options["retry-sleep"] = "1"
    options = scsi_check_get_options(options)
    # if "verbose" in options and options["verbose"] == "yes":
    logging.getLogger().setLevel(logging.DEBUG)
```
(snip)
```
[root@rh80-02 ~]# /etc/watchdog.d/fence_scsi_check_hardreboot test
INFO:root:Executing: /usr/bin/sg_turs /dev/sdb
DEBUG:root:0
INFO:root:Executing: /usr/bin/sg_persist -n -i -k -d /dev/sdb
DEBUG:root:0 PR generation=0x5fb3, 8 registered reservation keys follow:
    0x5e2a0001
    0x5e2a0001
    0x5e2a0001
    0x5e2a0001
    0x5e2a0000
    0x5e2a0000
    0x5e2a0000
    0x5e2a0000
DEBUG:root:key 5e2a0001 registered with device /dev/sdb
```
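For reference, the check that produces the output above boils down to listing the registered SCSI-3 persistent reservation keys and looking for the local node's key. A minimal parsing sketch (hypothetical helper names, not the actual fence_scsi code):

```python
def registered_keys(sg_persist_output):
    """Parse the key lines ("0x5e2a0001", ...) from the output of
    `sg_persist -n -i -k -d <device>`, returning them without the 0x prefix."""
    return [line.strip()[2:] for line in sg_persist_output.splitlines()
            if line.strip().startswith("0x")]

def key_is_registered(sg_persist_output, key):
    """True if `key` (hex string without 0x) is among the registered keys."""
    return key.lower() in (k.lower() for k in registered_keys(sg_persist_output))

# To feed it live data one would capture the command's stdout, e.g. via
# subprocess.run(["/usr/bin/sg_persist", "-n", "-i", "-k", "-d", "/dev/sdb"], ...)
```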
Also, the same high CPU load seems to occur when using the watchdog service with fence_mpath.
I will investigate the cause a little more.
Best Regards, Hideo Yamauchi.
Maybe there's some watchdog setting for tuning priority of the process?
Hi Oyvind,
> Maybe there's some watchdog setting for tuning priority of the process?
Yes.
In the environment in question, the default settings in /etc/watchdog.conf are as follows:
(snip)
```
# This greatly decreases the chance that watchdog won't be scheduled before
# your machine is really loaded
realtime = yes
priority = 1
```
(snip)
Many thanks, Hideo Yamauchi.
I would try changing the priority to see if that helps.
Hi Oyvind,
> I would try changing the priority to see if that helps.
I'll give it a try....
But...
I changed the priority to 50 and then to 99 and restarted the watchdog service, but the CPU usage of fence_scsi_check_hardreboot did not change.
You can reproduce the rise in CPU usage with just the following command line:
```
/usr/libexec/platform-python -c 'import sys;sys.path.append("/usr/share/fence");import fencing'
```
I think this will be difficult to improve, since the cost lies in Python's import processing.
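As a rough way to quantify that import cost, a small timing sketch (generic; on the node in question one would pass "fencing" and "/usr/share/fence"):

```python
import importlib
import sys
import time

def time_import(module_name, extra_path=None):
    """Return the wall-clock seconds spent importing module_name.

    Only a cold import (first in this interpreter) reflects the real cost;
    repeated calls hit sys.modules and return almost immediately.
    """
    if extra_path and extra_path not in sys.path:
        sys.path.append(extra_path)
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start

# On a cluster node this would be:
#   print(time_import("fencing", "/usr/share/fence"))
print(f"import json took {time_import('json'):.4f}s")
```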
Best Regards, Hideo Yamauchi.
Yeah. I don't know how we can improve that.
Hi Oyvind,
I have thought a little more about possible improvements.
It may be that the right conclusion is that this is difficult to improve in Python. In that case, users will need to dedicate more CPU resources to their virtual machines, and so on.
Best Regards, Hideo Yamauchi.
Hi All,
We configured a cluster using fence_scsi in a virtual environment with only one CPU core allocated.
When fence_scsi_check_hardreboot is used together with the watchdog service in the Pacemaker cluster, fence_scsi_check_hardreboot uses 20% of the CPU every second.
When this happens, Pacemaker frequently outputs the following log.
Some improvement can be achieved by increasing the number of CPU cores or by increasing the monitoring interval of the watchdog service. However, some users may not be able to change core assignments, and increasing the monitoring interval affects the failover time when a failure occurs.
Is there any way to improve the fence_scsi_check_hardreboot script to solve this problem? (Can the processing of fence_scsi_check_hardreboot be made a little lighter?)
Best Regards, Hideo Yamauchi.