purebred-mua / purebred

A terminal based mail user agent based on notmuch
GNU Affero General Public License v3.0

UI freeze on GHC 9.2 on some operations (notmuch related) #468

Closed by frasertweedale 2 years ago

frasertweedale commented 2 years ago

Describe the bug

Some operations lock the UI. For example, using the database generated from UAT test data, perform the actions from testUserCanMoveBetweenThreads. That is:

If these actions are performed SLOWLY (say, at 1-second intervals), the sequence can be repeated over and over and everything works. If these actions are performed QUICKLY, the UI instantly locks.

While the UI is locked, purebred is observed to have forked:

% pgrep -f -l purebred
75692 purebred --database /tmp/mail/Maildir
75600 purebred --database /tmp/mail/Maildir

kill -9 <child-pid> unblocks the UI and reveals an error message:

A Xapian exception occurred opening database:
  Unable to get write lock on /tmp/mail/Maildir/.notmuch/xapian:
    Got EOF reading from child process

Analysis

Reading the Xapian source code shows that the "FlintLock" facility is used to get an exclusive (write) lock on the database. The implementation forks, and the child uses fcntl(lockfd, F_SETLK, fl) to acquire the lock. Here is where it gets complicated, and my guess as to what is happening:

This error did not occur before GHC 9.2, so it is probably a GC change that triggers the bug. The bug was always present and this seems to be a "how did this ever work" scenario.
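
For readers unfamiliar with the fork-and-lock pattern mentioned in the analysis, here is a minimal Python sketch of the general idea. It is not Xapian's code. The function names `acquire_lock_via_child` and `release_lock` and the one-byte status protocol are hypothetical, chosen only for illustration. It also assumes that `fcntl.lockf` with `LOCK_NB` behaves like a non-blocking `fcntl(fd, F_SETLK, ...)` request, which is how the Python documentation describes it on POSIX systems.

```python
import fcntl
import os
import socket


def acquire_lock_via_child(lock_path):
    """Fork a child that tries to take an exclusive fcntl lock on lock_path.

    Returns (status, child_pid, parent_sock). In this hypothetical protocol,
    status is b"y" if the child got the lock, b"n" if the lock is held by
    another process, and b"" (EOF) if the child died before reporting.
    """
    parent_sock, child_sock = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        # Child process: it owns the lock for as long as it lives.
        parent_sock.close()
        try:
            fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT, 0o666)
            try:
                # Non-blocking exclusive lock on the whole file.
                fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            except OSError:
                child_sock.sendall(b"n")
            else:
                child_sock.sendall(b"y")
                # Hold the lock until the parent closes its end (EOF).
                child_sock.recv(1)
        finally:
            # Exit without running the parent's cleanup handlers.
            os._exit(0)
    # Parent process: wait for the child's one-byte report.
    child_sock.close()
    status = parent_sock.recv(1)
    return status, pid, parent_sock


def release_lock(pid, parent_sock):
    """Close the parent's socket so the child exits, then reap the child."""
    parent_sock.close()
    os.waitpid(pid, 0)
```

In this sketch the parent blocks in `recv` until the child reports back, and the child blocks in `recv` until the parent closes its socket. If the child is killed before it reports, the parent's `recv` returns an empty byte string, the sketch's analogue of the "Got EOF reading from child process" message above.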

Proposed solution