ChristopherRabotin / bungiesearch

UNMAINTAINED CODE -- Elasticsearch-dsl-py django wrapper with mapping generator
BSD 3-Clause "New" or "Revised" License
67 stars 20 forks source link

SearchSignal buffer may pool items indefinitely and does not seem to be thread safe #139

Closed diwu1989 closed 8 years ago

diwu1989 commented 8 years ago

The concept of the buffer is to optimize indexing performance by utilizing bulk operations, but there's no guarantee that the buffer gets flushed in a timely fashion.

I think it should be necessary to use a timer with a configurable flush interval to make sure that things don't stay in the buffer indefinitely.

Another thing I noticed is that the buffer size check, bulk updating, and buffer emptying code isn't thread safe, if we have Django running with Gunicorn thread worker, then we have race condition.

There should be a Lock used to prevent any race conditions

ChristopherRabotin commented 8 years ago

The lock is a very good idea. You've also probably hit the nail on the head with an issue we had at Sparrho for a bit related to indexing very large chunks of content. It would take a long while. We never got the time to investigate the issue.

Concerning the timer, how would you do that?

On Sun, Dec 20, 2015, 22:24 Di Wu notifications@github.com wrote:

The concept of the buffer is to optimize indexing performance by utilizing bulk operations, but there's no guarantee that the buffer gets flushed in a timely fashion.

I think it should be necessary to use a timer with a configurable flush interval to make sure that things don't stay in the buffer indefinitely.

Another thing I noticed is that the buffer size check, bulk updating, and buffer emptying code isn't thread safe, if we have Django running with Gunicorn thread worker, then we have race condition.

There should be a Lock used to prevent any race conditions

— Reply to this email directly or view it on GitHub https://github.com/ChristopherRabotin/bungiesearch/issues/139.

diwu1989 commented 8 years ago

I think I'll submit another PR to try to deal with the flush issue and then there should be enough bugs fixed to qualify for a new release