ActivityWatch / aw-qt

Tray icon that manages ActivityWatch processes, built with Qt.
Mozilla Public License 2.0
27 stars 29 forks source link

When aw-qt crashes, aw-watcher-afk does not shut down #19

Closed johan-bjareholt closed 5 years ago

johan-bjareholt commented 7 years ago

Really odd behavior. Reproducible by starting aw-qt and then doing "kill -9" on the aw-qt process.

ErikBjare commented 7 years ago

Yeah, I tried using a process group for this but I'm not that familiar with that part of POSIX.

Do you think you could look into that @johan-bjareholt?

ErikBjare commented 7 years ago

It seems that kill -9 doesn't kill the whole process group. You'd have to do kill -9 -$PGID. (See here: https://stackoverflow.com/a/15139734/965332)

A solution to prevent double-running processes would be to run a process check upon start and see if any aw-* processes (that aw-qt wants to manage) are already running.

Also, from Wikipedia:

The distribution of signals to process groups forms the basis of job control employed by shell programs. The tty device driver incorporates a notion of a foreground process group, to which it sends signals generated by keyboard interrupts, notably SIGINT ("interrupt", Control+C), SIGTSTP ("terminal stop", Control+Z), and SIGQUIT ("quit", Control+).

SIGKILL is notably missing from that list.

johan-bjareholt commented 7 years ago

A solution to prevent double-running processes would be to run a process check upon start and see if any aw-* processes (that aw-qt wants to manage) are already running.

Here's a proposal for this

https://github.com/ActivityWatch/aw-client/pull/21

ErikBjare commented 7 years ago

Closing this since it only happens in the case of kill -9, which we can't do anything about.

johan-bjareholt commented 7 years ago

No, it does not only happen for kill -9, i only used kill when i tried to reproduce. Another way to reproduce on my machine is to simply press "open log folder" so it crashes normally, but in that case all modules continue running (aw-server, aw-watcher-afk and aw-watcher-window).

ErikBjare commented 7 years ago

Oh, fair enough.

I think this depends on how Python crashes, if Python crashes with an internal error (possible that PyQt triggers this) instead of a normal Exception-caused crash then I'm not sure we can do much about it. If it crashes "normally" then I think the atexit should work (which I added here).

nikanar commented 7 years ago

Once aw-qt dies, the other aw-* become orphan processes, and their PPID becomes 1 (for the init process, rather than having the value of their parent's PID (which in our case is also their PGID). Since there is no formal tie between a child process and its parent), it is normal, if undesirable, that children processes stay alive after the parent crashes. Children that would rather die than stay as orphans typically poll their parents for aliveness, for instance checking whether their own PPID ever becomes 1 (and then shutting down).

This can be done in this way in Python. ``` # inspired from http://www.programcreek.com/python/example/4464/os.getppid while True: # watcher loop, run() or heatbeat_loop() if os.getppid() == 1: sys.exit() ``` This kills the watchers when `aw-qt` is `kill -9`-ed. Then I've been looking for the proper way to terminate the watchers, and while `aw-qt.Module.stop` looked nice, I couldn't access it, so I elected to just `break` out of the main loops. If there's a better thing to do, please let me know as I didn't find one. PR will come in as soon as I figure a proper git way out of my local mess (I've been adding `--user` to all Makefiles here, and `--upgrade --force-reinstall` to `aw-watcher-window` or it wouldn't use newer code (maybe due to the `make clean` issue ?)).
Some other solution using prctl [Another solution](https://stackoverflow.com/questions/284325/how-to-make-child-process-die-after-parent-exits) involves **prctl** : `prctl(PR_SET_PDEATHSIG, SIGTERM);`, but that seems [easier in C than in Python](https://stackoverflow.com/questions/38609686/terminate-a-forked-process-in-python-when-the-parent-process-dies-using-prctl). [**python-prctl** is a thing](https://github.com/seveas/python-prctl), if someone wants to go that way.
And this track was yet another option but led nowhere Otherwise, per section 10.6.4 in [this document](http://www.informit.com/articles/article.aspx?p=397655&seqNum=6), supposedly all *orphaned process groups* receive a **`SIGHUP`**, and could take that hint to die graciously (**but it doesn't seem to apply here**, maybe because [reasons](https://stackoverflow.com/questions/19756633/why-sighup-signal-not-received-when-becoming-an-orphaned-process-group?rq=1) ; [this question](https://stackoverflow.com/questions/39103871/daemonization-and-sighup) is super relevant but gets complicated). [That would take](https://stackoverflow.com/questions/21929690/what-are-the-consequences-of-killing-a-python-script-with-sighup) to [install a SIGHUP handler](https://docs.python.org/3/library/signal.html). That would have simply entailed adding `signal.signal(signal.SIGHUP, signal.SIGTERM)` inside `heartbeat_loop` in `aw-watcher-window/main.py`.

Note that even if all I did works fine, aw-server still stays alive when aw-qt crashes, which I think is still an issue. (But that seemed like an entirely different beast, to be tamed another day.)

johan-bjareholt commented 7 years ago

@nikanar Awesome! Much better solution than my suggestion.

Out of the 3 solutions you found the one you selected was clearly most clean.

Will also work nicely even if the watchers are not run with aw-qt.

Will check the PRs and merge :)

Thanks for the help!

ErikBjare commented 7 years ago

Amazing work, learned a lot from reading those links.

But, as I mentioned in comments on the PRs, there might be an issue with the taken approach when run using the PyInstaller binaries. I'm confident we can find a way around that however.

ErikBjare commented 7 years ago

Continued discussions should probably go here.

So, @nikanar did a nice fix with the PPID detection but I don't think that'll work with PyInstaller due to it using a bootloader process that in turn starts the application process.

Details here: https://pythonhosted.org/PyInstaller/advanced-topics.html#the-bootstrap-process-in-detail

The process tree would look something like this when run using the PyInstaller (iirc):

As specified in the PyInstaller docs, the bootloader will exit when its child process exits.

There has been talk about moving the child processes into Python-managed processes or threads (this would get rid of all the secondary bootloader and is likely to save some memory), but this has not been investigated.

nikanar commented 7 years ago

If that's right, then when running under PyInstaller, the children should look for the PPID of their parent process (the PyInterpreter) to see if that is 1 because aw-qt died. But then, I also have no idea if PPIDs work the same under Windows either.