mchaput / whoosh

Pure-Python full-text search library

AsyncWriter leaves threads unjoined, may lose writes #14

Open asedeno opened 3 years ago

asedeno commented 3 years ago

As discussed at https://github.com/django-haystack/django-haystack/issues/1792, the thread code path in AsyncWriter.commit() starts a thread, never join()s it, and gives the caller no indication that they should join() it themselves. Writes can be lost if the process exits before AsyncWriter gets the lock, and I suspect the index may be left in an inconsistent state if it does get the lock but the process exits before the thread finishes writing.

I understand that adding a join() in AsyncWriter.commit() would defeat the purpose of it being fire and forget, but having it be fire and maybe never write is not good behavior. Detecting whether or not AsyncWriter needs to be join()ed currently requires digging into its internals. Perhaps some additional public methods and documentation are in order?
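
To make the request concrete, here is a rough sketch of the kind of caller-side check I have in mind. It does not use Whoosh at all: `DeferredWriter` is a hypothetical stand-in that only assumes the writer is a `threading.Thread` subclass whose `commit()` may start the thread, and `wait_for_commit` is a hypothetical helper name, not an existing API. The list `index` and the sleep that simulates waiting for the lock are also just illustration.

```python
import threading
import time


class DeferredWriter(threading.Thread):
    """Hypothetical stand-in for a writer that commits on a background thread."""

    def __init__(self, sink, delay=0.05):
        super().__init__()
        self.sink = sink        # plain list standing in for the index
        self.delay = delay      # simulated time spent waiting for the lock
        self.pending = []

    def add_document(self, doc):
        self.pending.append(doc)

    def commit(self):
        # Fire and forget: returns immediately, the write happens in run().
        self.start()

    def run(self):
        time.sleep(self.delay)  # pretend to wait for the index lock
        self.sink.extend(self.pending)


def wait_for_commit(writer, timeout=None):
    """Join the writer if its thread is running; return True once it is done."""
    if writer.is_alive():
        writer.join(timeout)
    return not writer.is_alive()


index = []
writer = DeferredWriter(index)
writer.add_document({"title": "hello"})
writer.commit()                               # returns before the write lands
finished = wait_for_commit(writer, timeout=5)
```

A public method along these lines, plus a note in the docs that callers should use it before exiting, would keep commit() non-blocking while still giving callers a supported way to make sure the write has landed.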