Open philbudne opened 1 year ago
@philbudne Does the fix for #102 and other recent changes resolve these issues or is it still a question to keep open?
PR https://github.com/mediacloud/story-indexer/pull/102 contains the fix for Issue https://github.com/mediacloud/story-indexer/issues/97. It looks like I accidentally reported data for that issue here; I've just deleted it.
This can be tagged or milestoned as a long-term "survivability" issue; it's not critical right now (we're running one day at a time, and the queues are always clearing by end of day). If/as we move towards more state in the queues, the importance of queue robustitude is embiggened.
NOTE: Quorum queues include better handling for messages that cause a worker to hang (go unacked): https://www.rabbitmq.com/docs/quorum-queues#poison-message-handling
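For reference, a minimal sketch of what opting in might look like from a Python client. It only builds the optional queue arguments (`x-queue-type` and `x-delivery-limit`, both documented by RabbitMQ); the helper name, the queue name in the commented usage, and the limit of 5 are made up for illustration and are not taken from story-indexer.

```python
from typing import Any, Dict


def quorum_queue_arguments(delivery_limit: int = 5) -> Dict[str, Any]:
    """Build queue-declare arguments for a quorum queue with a delivery limit.

    Hypothetical helper: the default of 5 is an arbitrary illustrative value.
    """
    if delivery_limit < 1:
        raise ValueError("delivery_limit must be at least 1")
    return {
        "x-queue-type": "quorum",            # request a quorum queue
        "x-delivery-limit": delivery_limit,  # cap on redeliveries of one message
    }


# Hypothetical usage with pika (needs a running broker, so left commented out).
# Quorum queues must be declared durable.
#   channel.queue_declare(queue="example-queue", durable=True,
#                         arguments=quorum_queue_arguments(5))
```

Once a message exceeds the limit it is dropped, or dead-lettered if a dead-letter exchange is configured, so a limit like this probably wants a DLX alongside it.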
I originally assumed we would run a RabbitMQ cluster with all three ES/storage servers.
In a larger cluster with a limited number of master(*)-eligible (or master-only) nodes, maybe run RabbitMQ on just those??
(*) Is it too P.C. of me to wish ES would switch to some other term?
An umbrella issue to empty my mind of RabbitMQ configuration issues: