elkowar / eww

ElKowars wacky widgets
https://elkowar.github.io/eww
MIT License

[BUG] Eww spawns zombie windows and processes #255

Open Axarva opened 3 years ago

Axarva commented 3 years ago

Describe the bug

Launching widgets normally works, but sometimes a window spawns and then refuses to close. Running eww kill kills the daemon but leaves these windows open. killall eww, however, does close them.

As far as I can tell, this started happening after eww daemon stopped being a prerequisite for launching eww widgets. I'm not quite sure though.

Reproducing the issue

Use any config and create a widget. Bind the opening of your window to a keybind. Do not launch eww daemon. Press your keybind. Eww will not open the window immediately, which is expected.

Now, before eww has finished opening the window, press the keybind again. Try to close this window with the eww close command. It will not work. The only way to remove the window will be killall eww.

This also happens sometimes randomly even when the daemon is running, but I cannot consistently reproduce it.

Expected behaviour

Eww windows to close with the eww close or eww kill commands.

Additional context

Logs: image

The log says Closing all windows. However, windows aren't closed.

elkowar commented 3 years ago

So what's happening here is that you're sending a close window command before the window has been fully opened? Am I getting that right?

Axarva commented 3 years ago

Nope, an open window command before the windows are fully opened.

elkowar commented 3 years ago

could you share the output of eww debug and eww windows once you encounter an instance of such a zombie window (window that doesn't close)?

VuiMuich commented 3 years ago

I have experienced this as well, when restarting eww due to a WM reload. Also sometimes these windows seem to have odd properties and size, not sure if this might be a WM issue. Will also try to catch some logs and maybe xprops, when I get this next time.

Axarva commented 3 years ago

Here is the output of eww windows:

image

And here's the output of eww debug.

elkowar commented 3 years ago

Interesting, this means that it actually sees the windows as not open, and thus won't close them. I'll investigate

Animeshz commented 3 years ago

In my case it does show the window as open.

My guess: the IPC server is started at the very end of server::initialize_server, which might take some time. In that gap, two instances of main.rs could both see that the daemon is not running and both start running that function. Two ipc_server listeners would then try to listen, and one might get disconnected because both use the same socket file.
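If that guess is right, the race is a check-then-start gap: both processes check for a running daemon before either one is listening. One way to close such a gap is to let the socket bind itself act as the check, since binding a Unix socket to a path that already exists fails. This is a minimal Rust sketch of the idea, not eww's actual startup code. The name `try_claim_socket` is hypothetical, and the sketch assumes nothing deletes an existing socket file before binding.

```rust
use std::io;
use std::os::unix::net::UnixListener;
use std::path::Path;

/// Hypothetical helper: try to become the daemon by binding the IPC socket.
/// Returns Ok(Some(listener)) if this process now owns the socket path,
/// and Ok(None) if the path is already taken.
fn try_claim_socket(path: &Path) -> io::Result<Option<UnixListener>> {
    match UnixListener::bind(path) {
        Ok(listener) => Ok(Some(listener)),
        // bind() reports AddrInUse when the socket path already exists.
        Err(e) if e.kind() == io::ErrorKind::AddrInUse => Ok(None),
        Err(e) => Err(e),
    }
}
```

A socket file left behind by a crashed daemon would also report `AddrInUse`, so on its own this does not distinguish a live daemon from a stale file.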

Animeshz commented 3 years ago

Actually, the 5 connection retries make the problem worse: they make server startup very slow, which also raises the chance of hitting this bug. I'd suggest reducing that to 3 or 2.

I can see two ways to solve this:

  1. The WM I use (herbstluftwm) offers a --locked option, which prevents it from drawing anything on screen until hc unlock is executed in the autostart script (preferably at the end). Similarly, we could lock the socket file somehow at the start of initialize_server. Or maybe create a lock file which indicates that some instance already has it opened but has not started receiving commands yet (a strategy used by vmware etc).
  2. Stop the open command from trying to start the server. Add something like mkdir -p for eww daemon, which silently starts the daemon if it is not running and otherwise does nothing. Then make users run eww -p daemon && eww open <bar> instead (this avoids the 5 connection attempts, making startup faster).

Edit: A hybrid of the two: have two locks, one client side and one server side. A client can't connect while either the client or the server lock is present, and the server can't check whether the daemon is running until the server-side lock is released. This avoids the breaking change of removing the open command's default behavior of starting the daemon if it is not running.
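The lock-file idea from option 1 could look roughly like the Rust sketch below. It relies on `create_new`, which fails if the file already exists, so only one process can win the creation. `InitLock` and the lock path are hypothetical names for illustration, not something that exists in eww.

```rust
use std::fs::{self, OpenOptions};
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical guard: holds the lock file for as long as the value is alive.
struct InitLock {
    path: PathBuf,
}

impl InitLock {
    /// Returns Ok(Some(lock)) if this process created the lock file,
    /// and Ok(None) if another process already holds it.
    fn acquire(path: &Path) -> io::Result<Option<InitLock>> {
        match OpenOptions::new().write(true).create_new(true).open(path) {
            Ok(_) => Ok(Some(InitLock { path: path.to_path_buf() })),
            Err(e) if e.kind() == io::ErrorKind::AlreadyExists => Ok(None),
            Err(e) => Err(e),
        }
    }
}

impl Drop for InitLock {
    fn drop(&mut self) {
        // Best-effort cleanup. This does not run if the process is
        // terminated by a signal, so the file can be left behind.
        let _ = fs::remove_file(&self.path);
    }
}
```

The daemon would hold the guard from the start of initialization until it is ready to receive commands, and a second instance that gets `None` would back off instead of starting its own server.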

elkowar commented 3 years ago

well, option 2 is what we had before - it made usage a lot more confusing, and generally more annoying to deal with when developing, too. Having a better check such as a file based lock would work, however it runs into the issue of not getting cleaned up correctly - some people do do pkill eww, which then, well, doesn't necessarily give eww the chance to clean up everything.

If you care about the startup speed, then do start the daemon manually before, just as you showed - there the daemon won't do its 5 connection attempts, and thus will be a lot faster to start up.

Having some better way of making sure no current server is already running is definitely something worth looking at again - having multiple eww instances running from the same config path is definitely an issue that shouldn't happen.

Animeshz commented 3 years ago

Can we just move the init_async_part to the top of server::initialize_server and handle any error by stopping the thread handle (instead of ? at the end of the result)? And reduce connection attempts to 3? That might significantly reduce the chance of this happening, though it wouldn't eliminate it completely...

Also, reclaiming a lock is something vmware allows when a VM is not running but the lock is present (I'm not exactly sure how they actually do this).
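One common way to make a leftover lock reclaimable is to write the owner's PID into the lock file and treat the lock as stale when that process is gone. This Rust sketch shows only the staleness check. It is not how vmware or eww do it. It assumes Linux, where a live process has a `/proc/<pid>` directory, and `lock_is_stale` is a hypothetical name.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical check: true if the lock file names a PID that no longer exists.
/// Assumes Linux, where every live process has a /proc/<pid> directory.
fn lock_is_stale(lock_path: &Path) -> io::Result<bool> {
    let contents = fs::read_to_string(lock_path)?;
    match contents.trim().parse::<u32>() {
        Ok(pid) => Ok(!Path::new(&format!("/proc/{pid}")).exists()),
        // No readable PID: treated as stale here. A real implementation
        // would have to account for a lock file that is still being written.
        Err(_) => Ok(true),
    }
}
```

This covers the pkill case, where the lock file is left behind, but it is not airtight: PIDs can be reused, and two processes that both see a stale lock still need an atomic step to decide which one takes it over.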

VuiMuich commented 3 years ago

On a side note: using pkill eww is currently a 'cleaner' experience for me than eww kill. When I use eww kill, I regularly end up with one of the three windows I use as my 'bar' being hidden after a WM reload (I do this quite often at the moment, as I am working on a few different contributions to leftwm and checking bug reports). TBH I am not 100% sure this issue is really eww's fault or is actually a regression introduced in leftwm. I will do further tests to see if I can pin it down and blame one or the other.

Window eww - tags should be in the center: 2021-09-15-212951_2560x1440_scrot

Killing with pkill and manually reloading leftwm theme: 2021-09-15-213024_2560x1440_scrot

xprops for comparison: 2021-09-15-213042_2560x1440_scrot

Edit2: just did a vimdiff on the two xprops outputs, and besides the obvious differences like PID, USER_TIME and SYNC_REQUEST they are identical.

Animeshz commented 3 years ago

I had a random thought: can we just apply a lock/mutex to the eww [open|open-many|daemon|kill|reload] commands, held from the first line of main() to the end of main, with a timeout? Subsequent calls would then have to wait until the lock is freed or a timeout of, say, 5s has elapsed.

It's simple to implement and might avoid this unexpected behavior, since a second open command started in parallel would have to wait for the first one to finish 🤔🤔
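A rough Rust sketch of that lock-with-timeout idea, using a lock file that is polled until it can be created. `acquire_with_timeout` is a hypothetical name, and the 20 ms polling interval is an arbitrary choice for illustration.

```rust
use std::fs::OpenOptions;
use std::io;
use std::path::Path;
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical helper: poll until the lock file can be created or
/// `timeout` elapses. Returns Ok(true) if acquired, Ok(false) on timeout.
fn acquire_with_timeout(path: &Path, timeout: Duration) -> io::Result<bool> {
    let deadline = Instant::now() + timeout;
    loop {
        match OpenOptions::new().write(true).create_new(true).open(path) {
            Ok(_) => return Ok(true),
            Err(e) if e.kind() == io::ErrorKind::AlreadyExists => {
                if Instant::now() >= deadline {
                    return Ok(false);
                }
                // Another command holds the lock; wait briefly and retry.
                thread::sleep(Duration::from_millis(20));
            }
            Err(e) => return Err(e),
        }
    }
}
```

Each command would call this with a 5s timeout at the start of main() and remove the file before exiting. What a command should do when the timeout expires (give up, or proceed anyway) is left open here.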

Animeshz commented 3 years ago

That thought came to mind because eww open bar && eww open bar never had this problem, i.e. when one waits for the other to finish. It only occurs when launching in parallel, which we can fix using a virtual lock file.

VuiMuich commented 3 years ago

> Random thought came to my mind because eww open bar && eww open bar never had this problem i.e. when one waits for the other to finish. Only when launching parallelly this will occur, which we can fix using a virtual lock file.

You mean eww daemon && eww open bar? There were also issues when one tried to open windows before the daemon had fully spun up. That's why I had a script that ran eww open .. until eww window | grep \* returned something.
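A retry loop of that kind might look like the Rust sketch below. All function names are hypothetical. It assumes the listing command is `eww windows`, as used earlier in this thread, and that a `*` in its output marks an open window, which is what the grep implies.

```rust
use std::process::Command;
use std::thread;
use std::time::Duration;

/// Mirrors `grep \*`: treats a `*` anywhere in the listing as "a window is open".
fn any_window_open(listing: &str) -> bool {
    listing.contains('*')
}

/// Calls `attempt` up to `max_attempts` times, sleeping `delay` between failures.
fn retry_until(mut attempt: impl FnMut() -> bool, max_attempts: u32, delay: Duration) -> bool {
    for i in 0..max_attempts {
        if attempt() {
            return true;
        }
        if i + 1 < max_attempts {
            thread::sleep(delay);
        }
    }
    false
}

/// Runs `eww open <window>`, then checks the output of `eww windows`.
/// Returns false if either command cannot be run.
fn open_and_check(window: &str) -> bool {
    if Command::new("eww").args(["open", window]).status().is_err() {
        return false;
    }
    match Command::new("eww").arg("windows").output() {
        Ok(out) => any_window_open(&String::from_utf8_lossy(&out.stdout)),
        Err(_) => false,
    }
}
```

For example, `retry_until(|| open_and_check("bar"), 10, Duration::from_millis(200))` would keep trying for about two seconds. Like the original script, it only checks that some window is open, not that the requested one is.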

Animeshz commented 3 years ago

Oh right, it forks the process and the parent returns before initialization is complete 🤔🤔. But we can still pass the responsibility for releasing the lock for these server commands to the forked child, and only then let the next command proceed... That will solve the issue if I'm not wrong 👀

VuiMuich commented 3 years ago

So have an initializing_lock, and the child has to wait before returning until the lock is gone because init finished?

rajoayalew commented 2 years ago

It seems like I have a similar issue based on what has been mentioned in the thread although there hasn't been a new update to this issue since September. Has there been an update to this issue or is there a workaround? Thanks

tkapias commented 1 year ago

I may have an issue related to this one too.

  1. I run the daemon at startup.
  2. I have an i3wm key binding to toggle the main windows (eww open bar) or close-all (it checks the 'eww state' content to choose).

It's hard to debug this because it can happen 1 or 5h after daemon launch. I don't know what the trigger is. Maybe xscreensaver?