notanumber / xapian-haystack

A Xapian backend for Haystack
GNU General Public License v2.0
154 stars 93 forks source link

Fix multiple workers from breaking #218

Closed ajslater closed 2 years ago

ajslater commented 2 years ago

Problem

If xapian-haystack is used today with more than one worker it breaks with xapian.DatabaseLockError

Solution

This solution was cribbed from @karolyi in his issue report: https://github.com/notanumber/xapian-haystack/issues/174 It implements a lockfile around the update() and remove methods.

Decisions

It is compatible with python2 and doesn't use pathlib.

It makes the opinionated decision to place a lockfile in the xapian-index directory. There are various other places where its good to put lockfiles on different systems, but as this lockfile will not be accessed very often, placing in the regular filesystem on disk seems fine and less platform specific code is required to place the lockfile somewhere special.

The renaming of the original functions to private versions was done to reduce the size of this PR. Other solutions are possible.

Workarounds

This PR can be used as inspiration for an easy workaround for this issue. Subclass XapianSearchBackend and BaseEngine with these changes. and use that in your Haystack settings. Example here: https://github.com/ajslater/codex/blob/develop/codex/search_engine.py

claudep commented 2 years ago

Thanks for taking the initiative to create a pull request.

About the file lock implementation, I wonder if we shouldn't rather depend on filelock (https://pypi.org/project/filelock/), to avoid introducing subtle bugs. filelock seems to be well-maintained.

Ideally, this should have a test. I know writing tests for concurrency situations is not the most pleasant thing to do…

ajslater commented 2 years ago

Your filelock suggestion was good. I've implemented it and tested it in my application that overrides the xapian SearchEngine. It also reduces the size of this PR by quite a bit.

I'm currently trying to copy the django-haystack whoosh management commands test suite for xapian-haystack.

karolyi commented 2 years ago

This is a lot of commits for a pull request. You might want to squash them before merging into master.

ajslater commented 2 years ago

Yep. I'm going to close this PR and open a new one that's set up better on a branch. Didn't realize when I started what a project this would become and that I'd have to run tests on github's infra.

karolyi commented 2 years ago

Just for clarity's sake, it wasn't necessary to close it.

You could have squashed your commits locally and then force push into your remote/branch.

ajslater commented 2 years ago

I made a mistake by staging the PR from a forked master branch rather than a feature branch, so I just redid everything.

The solution I have using filelock decorators works well in the application I need this for, I'm using so I'm confident this is a decent general approach, and the existing tests pass now.

But as per your recommendation I'm going to try to get some management command tests working before resubmitting this PR. You won't be barraged with notifications until I have something like that. Should be soon.

I'd prefer to make the lock on update() smaller and confined to just the document update portion at the end, but xapian.TermGenerator requires a WritableDatabase so that doesn't seem possible without deeper knowledge of what's going on.

On Fri, Feb 4, 2022 at 12:39 AM László Károlyi @.***> wrote:

Just for clarity's sake, it wasn't necessary to close it.

You could have squashed your commits locally and then force push into your remote/branch.

— Reply to this email directly, view it on GitHub https://github.com/notanumber/xapian-haystack/pull/218#issuecomment-1029762640, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACAKRQD662QHEGJT2R2TE3UZOGCTANCNFSM5NHLPDHA . You are receiving this because you modified the open/close state.Message ID: @.***>