notanumber / xapian-haystack

A Xapian backend for Haystack
GNU General Public License v2.0
154 stars 93 forks source link

Management Command Database Locking #219

Closed ajslater closed 2 years ago

ajslater commented 2 years ago

Problem

If xapian-haystack management commands are used today with more than one worker it breaks with xapian.DatabaseLockError

Solution

This solution was cribbed from @karolyi in his issue report: #174
. I implement a lockfile decorator around the update() and remove() methods.

Decisions

It paces the lockfile in the xapian-index directory, which should not affect performance.

Tests

This PR includes a new test_management_commands.py test suite that runs the management commands. This test suite fails if the lockfile isn't implemented as you can see in this branch: https://github.com/ajslater/xapian-haystack/actions/runs/1798091039

Workarounds

This PR can be used as inspiration for an easy workaround for this issue. Subclass XapianSearchBackend and BaseEngine with these changes. and use that in your Haystack settings. Example here: https://github.com/ajslater/codex/blob/develop/codex/search_engine.py

claudep commented 2 years ago

I think we should still add a new custom connection option to opt out of the lock, so people not needing a multiprocess access have a way to keep better performances by not using the lock.

claudep commented 2 years ago

By the way, the minimal Python version is now 3.7. https://github.com/notanumber/xapian-haystack/commit/77d1fc8ef650ccd4dcd674611b99bc8ac2886a7c

ajslater commented 2 years ago

I've fixed or commented on all the requested changes. Take another look. An addition I've made is that I noticed that backend.path can be an in-memory database so i've disabled the filelock in that case.

I can't find the comment, but I recall you asking if we might provide an option for people not to use filelocking in general. Should I add a boolean flag to connection_options?

claudep commented 2 years ago

I can't find the comment, but I recall you asking if we might provide an option for people not to use filelocking in general. Should I add a boolean flag to connection_options?

Yes, please.

ajslater commented 2 years ago

Latest commits include the USE_LOCKFILE connection parameter with a short documentation line.

coveralls commented 2 years ago

Coverage Status

Coverage decreased (-0.02%) to 97.558% when pulling 56334253d85818510acf3afb5253975d2ef95819 on ajslater:filelock-management-tests into 3fc4cfe46dcff0b2f7f1880b03f76a0a6156ecfd on notanumber:master.

claudep commented 2 years ago

Sorry, I merged another pull request which forces you to rebase. Also, would you mind adding a line in the CHANGELOG?

ajslater commented 2 years ago

Merged from upstream and and added a CHANGELOG item.

claudep commented 2 years ago

Thanks, I think this is now ready. Only rebase and we are done.

ajslater commented 2 years ago

Rebased on current master.

claudep commented 2 years ago

Thanks a lot for your work!

ajslater commented 2 years ago

Nice, looking forward to the release. Thanks for enforcing quality on this project!