epics-modules / xxx

APS BCDA synApps module: xxx
http://epics-modules.github.io/xxx

xxx.sh fails to detect IOC running with same user but different group #52

Closed: kmpeters closed this issue 2 years ago

kmpeters commented 2 years ago

The problem can be reproduced by doing the following:

$ xxx.sh start
$ xxx.sh status
xxx is running (pid=2778342) in procServ (pid=2778341)
$ newgrp users
$ xxx.sh status
xxx is not running

A second copy of the IOC will be started if xxx.sh start is called before exit is used to return to the shell with the default group.

kmpeters commented 2 years ago

The problem is this readlink call: https://github.com/epics-modules/xxx/blob/52d733d293dde4e424530a40f16382540f8598e1/iocBoot/iocxxx/softioc/xxx.sh#L192

The readlink call succeeds when the user's primary group matches the primary group at the time the IOC was started:

$ readlink /proc/2778342/cwd
/home/user/epics/ioc/xxx/iocBoot/iocxxx
$ echo $?
0

The readlink call fails when the user's primary group doesn't match the primary group at the time the IOC was started:

$ readlink /proc/2778342/cwd
$ echo $?
1
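
The two results above can be told apart from a PID that simply does not exist. According to proc(5), resolving /proc/[pid]/cwd is subject to a ptrace access-mode check that compares the caller's user and group IDs with those of the target process, which would be consistent with the failure after newgrp. The following is a minimal sketch, not code from xxx.sh; the function name check_pid and its three output labels are hypothetical.

```shell
#!/bin/sh
# check_pid is a hypothetical helper, not a function from xxx.sh.
# It classifies a PID by whether its /proc entry exists and whether
# its cwd symlink can be resolved by the current user and group.
check_pid() {
    pid="$1"
    if [ ! -d "/proc/${pid}" ]; then
        # No /proc entry: no such process is visible.
        echo "absent"
    elif cwd=$(readlink "/proc/${pid}/cwd" 2>/dev/null); then
        # The link resolved: print the working directory as well.
        echo "readable ${cwd}"
    else
        # The process exists, but the cwd link could not be read.
        echo "unreadable"
    fi
}
```

A status check built this way could treat "unreadable" as "possibly running under another group" rather than reporting that the IOC is not running.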

kmpeters commented 2 years ago

A workaround could be to use the stat command to detect when the IOC is running as the same user but a different primary group:

$ stat -c "%U"
$ stat -c "%G"

However, this would prevent multiple users from running IOCs that share a binary (areaDetector IOCs) on a single computer.
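
The stat commands above are shown without a file operand. One way to apply them, assuming the target is the /proc/<pid> directory (an assumption, not something stated in this thread), is sketched below. The function name proc_owner_state and its output labels are hypothetical, and the sketch uses the numeric %u and %g formats instead of %U and %G so that the comparison does not depend on name lookups.

```shell
#!/bin/sh
# proc_owner_state is a hypothetical helper, not part of xxx.sh.
# It reports how the owner of /proc/<pid> relates to the current
# user and current group (as printed by id -u and id -g).
proc_owner_state() {
    pid="$1"
    # Assumption: /proc/<pid> itself can normally be stat'ed even
    # when its cwd link cannot be read.
    p_uid=$(stat -c "%u" "/proc/${pid}" 2>/dev/null) || { echo "absent"; return; }
    p_gid=$(stat -c "%g" "/proc/${pid}" 2>/dev/null) || { echo "absent"; return; }
    if [ "${p_uid}" != "$(id -u)" ]; then
        echo "other-user"
    elif [ "${p_gid}" != "$(id -g)" ]; then
        echo "same-user-other-group"
    else
        echo "same-user-same-group"
    fi
}
```

In the newgrp scenario from the first comment, a check like this would be expected to print "same-user-other-group" for the IOC's PID, which a status command could report instead of "not running".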

kmpeters commented 2 years ago

@keenanlang, the new PID check in commit 7z81d71 appears to break detection of an IOC on RHEL7:

[user@host ~]$ /net/s34dserv/xorApps/epics/synApps_6_2_1/ioc/34ideMCS/iocBoot/ioc34ideMCS/softioc/34ideMCS.sh status
IOC_PID=14097
IOC_STARTUP_DIR=/net/s34dserv/xorApps/epics/synApps_6_2_1/ioc/34ideMCS/iocBoot/ioc34ideMCS/softioc/..
34ideMCS is not running
[user@host ~]$ ps aux | grep 14097
user 14097  0.6  0.4 3299128 32996 pts/3   Ssl+ 15:17   0:02 ../../bin/linux-x86_64/34ideMCS st.cmd.Linux
user 14774  0.0  0.0 112820   988 pts/4    S+   15:25   0:00 grep --color=auto 14097
[user@host ~]$ 

keenanlang commented 2 years ago

Looks like it fails if 'run' is used instead of 'start'. Fixed in 652d95b.

kmpeters commented 2 years ago

Thanks!