fvwmorg / fvwm3

FVWM version 3 -- the successor to fvwm2
Other
505 stars 79 forks source link

FvmwMFL did not get started because of wrong PID in pid file #1054

Closed farblos closed 3 weeks ago

farblos commented 1 month ago

Thanks for reporting your bug here! The following template will help with giving as much information as possible so that it's easier to diagnose and fix.

Upfront Information

Please provide the following information by running the command and providing the output.

[~]$ fvwm3 --version
fvwm3 1.0.6a (released)
with support for:  ReadLine, XPM, PNG, SVG, Shape, XShm, SM, Bidi text, XRandR, XRender, XCursor, XFT, NLS

fvwm3 comes with NO WARRANTY, to the extent permitted by law. You may
redistribute copies of fvwm under the terms of the GNU General Public License.
For more information about these matters, see the file named COPYING.

Debian testing

[~]$ uname -sp
Linux unknown

Expected Behaviour

Module FvwmMFL gets started during FVWM startup. My startup function:

AddToFunc StartFunction
+ I Echo StartFunction
+ I Module FvwmEvent
+ I Module FvwmMFL

Actual Behaviour

Module FvwmMFL did not start. This one time, I have never seen that previously. And I could not reproduce it, so I can only provide post-mortem information.

In my journal I see these messages here (note that FVWM runs in debug mode and logs to the journal):

Jul 23 10:24:48 host01 systemd[1869]: Started fvwm3.scope - /usr/bin/fvwm3 -v -o -.
Jul 23 10:24:48 host01 fvwm3[1889]: [1721723088.820074] FScreenInit: Using RandR 1.6
Jul 23 10:24:48 host01 fvwm3[1889]: [1721723088.946628] main: Loading window states via (null)
Jul 23 10:24:48 host01 fvwm3[1889]: [1721723088.961279] parse_and_set_window_style: Bad style option: Color lightgrey/dimgrey
Jul 23 10:24:48 host01 fvwm3[1889]: [1721723088.961301] parse_and_set_window_style: Bad style option: HilightFore black
Jul 23 10:24:48 host01 fvwm3[1889]: [1721723088.961303] parse_and_set_window_style: Bad style option:  HilightBack grey
Jul 23 10:24:48 host01 fvwm3[1889]: loaded [0]: /home/farblos/.fvwm/config
Jul 23 10:24:48 host01 fvwm3[1889]: [1721723088.973442] CMD_Echo: StartFunction
Jul 23 10:24:48 host01 fvwm3[1971]: FvwmMFL is already running

So the module thinks it is already running. Its pid file at that time looked like this:

[~]$ find . -iname '*fvwm*mfl*'
./.fvwm/fvwm_mfl_:0.pid
[~]$ ls -al ./.fvwm/fvwm_mfl_:0.pid
-rw-r--r-- 1 farblos farblos 4 May 21 10:20 ./.fvwm/fvwm_mfl_:0.pid
[~]$ cat ./.fvwm/fvwm_mfl_:0.pid
1889

What I do not understand:

I also tried restarting FVWM (from within FVWM), but that didn't help since the newly started FVWM process obviously inherits the PID of the old one:

Jul 23 11:03:23 host01 fvwm3[1889]: [1721725403.952478] FScreenInit: Using RandR 1.6
Jul 23 11:03:24 host01 fvwm3[1889]: [1721725404.097550] main: Loading window states via /home/farblos/.fvwm/.fs-restart-host01:0
Jul 23 11:03:24 host01 fvwm3[1889]: [1721725404.118143] parse_and_set_window_style: Bad style option: Color lightgrey/dimgrey
Jul 23 11:03:24 host01 fvwm3[1889]: [1721725404.118168] parse_and_set_window_style: Bad style option: HilightFore black
Jul 23 11:03:24 host01 fvwm3[1889]: [1721725404.118171] parse_and_set_window_style: Bad style option:  HilightBack grey
Jul 23 11:03:24 host01 fvwm3[1889]: loaded [0]: /home/farblos/.fvwm/config
Jul 23 11:03:24 host01 fvwm3[1889]: [1721725404.139117] CMD_Echo: StartFunction
Jul 23 11:03:24 host01 fvwm3[5340]: FvwmMFL is already running

Only after manually removing the PID file and then restarting FVWM would the FvwmMFL module start.

Some funny race condition during creation of the PID file? No further ideas here. If you don't have any ideas, either, feel free to close this issue as "not reproducible" ...

Thanks!

ThomasAdam commented 1 month ago

Hi,

Thanks. I'm already aware of this. It's because of PID wrap-arounds.

farblos commented 1 month ago

Thanks for your quick reply. Out of curiosity:

What do you mean by "PID wrap-arounds"? That the PIDs generated by the kernel overflow? On my system I have:

[~]$ cat /proc/sys/kernel/pid_max
4194304

And from the logs:

Jul 23 10:24:48 host01 fvwm3[1889]: [1721723088.973442] CMD_Echo: StartFunction
Jul 23 10:24:48 host01 fvwm3[1971]: FvwmMFL is already running

I'd think that FVWM had PID 1889, while FvwmMFL had PID 1971?

Which doesn't seem to match a "classical PID wrap-around" as described e.g. here.

Update: Fixed typo.

ThomasAdam commented 3 weeks ago

Hi @farblos

Hopefully, PR #1057 fixes this issue for you.

farblos commented 3 weeks ago

Thanks a lot. Unfortunately, I hit this issue exactly once in 2 years of using FVWM 3 and I have not been able to reproduce it artificially, either. So the most reasonable thing seemed to be a review, which I did. Feel free to close this issue at your discretion.

somiaj commented 3 weeks ago

Yea, with rolling PIDs can be a bit random if it happens or not, but a better locking mechanism can still help out.