basho / yokozuna

Riak + Solr

How to handle (or best-listen for) fuse melt/blown events? [JIRA: RIAK-2259] #587

Closed zeeshanlakhani closed 8 years ago

zeeshanlakhani commented 9 years ago

Possibly retry batches for search_index, or clear solrq worker(s)? Investigate.

zeeshanlakhani commented 8 years ago

@fadushin does the retry work cover this, or shall we leave it open?

fadushin commented 8 years ago

I think so, though I don't understand the full genesis of this ticket. In the current batching branch, we listen for fuse blown/recovered events, and we pause batching for indexes that are blown. Once they recover, batching resumes. The net effect is that any messages are retried once the fuse recovers, and there is an EQC test to this effect.
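
To make the pause/resume behaviour described above concrete, here is a minimal Python sketch. It is not the yokozuna code (which is Erlang); `PausingBatcher`, `deliver`, and the event-handler names are hypothetical, and the real worker/helper protocol may differ. It only shows the idea: messages for an index whose fuse is blown stay queued, and they are sent once the fuse recovers.

```python
from collections import defaultdict, deque


class PausingBatcher:
    """Toy model: one queue per index; delivery pauses while that index's fuse is blown."""

    def __init__(self, deliver):
        # deliver(index, batch) is a hypothetical stand-in for the Solr update call.
        self.deliver = deliver
        self.queues = defaultdict(deque)  # index name -> pending messages
        self.blown = set()                # indexes whose fuse is currently blown

    def enqueue(self, index, message):
        self.queues[index].append(message)
        self.flush(index)

    def on_fuse_blown(self, index):
        # Pause batching for this index only; other indexes are unaffected.
        self.blown.add(index)

    def on_fuse_recovered(self, index):
        # Resume batching; anything queued while the fuse was blown is sent now.
        self.blown.discard(index)
        self.flush(index)

    def flush(self, index):
        if index in self.blown or not self.queues[index]:
            return
        batch = list(self.queues[index])
        self.queues[index].clear()
        self.deliver(index, batch)
```

Usage, under the same assumptions: after `on_fuse_blown("idx")`, calls to `enqueue("idx", msg)` accumulate without calling `deliver`; a later `on_fuse_recovered("idx")` delivers them as one batch.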

zeeshanlakhani commented 8 years ago

@fadushin yep... the genesis was to see how we handle things. The #565 branch just moves on... so I think we're covered!

JeetKunDoug commented 8 years ago

@fadushin @zeeshanlakhani please close if this is done

fadushin commented 8 years ago

This work is covered by the batching and retry logic that was added to the worker/helper protocol.

Basho-JIRA commented 8 years ago

Fixed in 2.0.7. We now purge messages when an index's fuse is blown and the queue has hit the high-water mark (HWM). Otherwise, the message is retried.

_[posted via JIRA by Fred Dushin]_
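
As a rough illustration of the rule stated in the fix note, the Python sketch below purges only when both conditions hold (fuse blown and queue at the HWM) and otherwise keeps every message queued for retry. The function name, the `hwm` parameter, and the choice to purge the single oldest message are assumptions made for the example; the thread does not say which messages are purged, and the sketch does not model what happens at the HWM while the fuse is healthy.

```python
from collections import deque


def enqueue_with_purge(queue, message, fuse_blown, hwm):
    """Append message to queue; return the list of messages purged to make room.

    Hypothetical policy: purge only if the index fuse is blown AND the queue
    has reached the high-water mark. Purging the oldest entry is an assumption.
    """
    purged = []
    if fuse_blown and queue and len(queue) >= hwm:
        purged.append(queue.popleft())
    queue.append(message)
    return purged


pending = deque(["m1", "m2"])
dropped = enqueue_with_purge(pending, "m3", fuse_blown=True, hwm=2)
print(dropped, list(pending))
```

With a blown fuse but a queue below the HWM, or with a healthy fuse, nothing is purged and the message simply waits to be retried.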