Closed diwu1989 closed 8 years ago
The lock is a very good idea. You've also probably hit the nail on the head with an issue we had at Sparrho for a bit related to indexing very large chunks of content. It would take a long while. We never got the time to investigate the issue.
Concerning the timer, how would you do that?
On Sun, Dec 20, 2015, 22:24 Di Wu notifications@github.com wrote:
The concept of the buffer is to optimize indexing performance by utilizing bulk operations, but there's no guarantee that the buffer gets flushed in a timely fashion.
I think it should be necessary to use a timer with a configurable flush interval to make sure that things don't stay in the buffer indefinitely.
Another thing I noticed is that the buffer size check, bulk updating, and buffer emptying code isn't thread safe, if we have Django running with Gunicorn thread worker, then we have race condition.
There should be a Lock used to prevent any race conditions
— Reply to this email directly or view it on GitHub https://github.com/ChristopherRabotin/bungiesearch/issues/139.
I think I'll submit another PR to try to deal with the flush issue and then there should be enough bugs fixed to qualify for a new release
The concept of the buffer is to optimize indexing performance by utilizing bulk operations, but there's no guarantee that the buffer gets flushed in a timely fashion.
I think it should be necessary to use a timer with a configurable flush interval to make sure that things don't stay in the buffer indefinitely.
Another thing I noticed is that the buffer size check, bulk updating, and buffer emptying code isn't thread safe, if we have Django running with Gunicorn thread worker, then we have race condition.
There should be a Lock used to prevent any race conditions