twitter-archive / kestrel

simple, distributed message queue system (inactive)
http://twitter.github.io/kestrel

Distributing kestrel #11

Closed by Suhail 14 years ago

Suhail commented 14 years ago

How exactly do you even distribute kestrel?

Most common memcached libs won't round-robin reads when they receive a None, since they hash the key and partition horizontally.

What are people doing? I see nothing available short of building my own way to distribute tonight.

elephantum commented 14 years ago

As far as I understand, reading None is not an issue at twitter. AFAIK they have some (3, I guess) kestrel servers and a lot more workers. Each worker does the following: choose a random server, read a message, process the message, repeat (maybe the chosen server is used for several iterations, since operations like /close/open exist).

Anyway, workers outnumber servers and choose a server at random, so eventually each server gets read without any complicated logic.

Suhail commented 14 years ago

I resolved it to the following:

When all queues are empty, it simply bounces between them (which is kind of shitty), so I set a higher /t=

robey commented 14 years ago

both of those sound fine. we pick servers at random, and if the queue is empty, sleep for a while (exponential backoff) and then try another.

jkalucki has been experimenting with a much more aggressive client in the "grabby-hands" project.
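The pick-at-random, back-off-when-empty policy described above could be sketched roughly like this. It is only an illustration under stated assumptions, not the code of any real kestrel client: `poll_random_servers`, `fetch`, and the delay parameters are hypothetical names, and `fetch` stands in for whatever memcache-protocol read a real client would perform (returning `None` when the queue on that server is empty).

```python
import random
import time
from typing import Callable, Optional, Sequence


def poll_random_servers(
    servers: Sequence[str],
    fetch: Callable[[str], Optional[bytes]],  # hypothetical: one read from one server
    base_delay: float = 0.05,                 # assumed starting backoff, in seconds
    max_delay: float = 2.0,                   # assumed backoff ceiling, in seconds
    max_attempts: int = 10,
    sleep: Callable[[float], None] = time.sleep,
    rng: Optional[random.Random] = None,
) -> Optional[bytes]:
    """Return the first item found, or None after max_attempts empty reads."""
    rng = rng or random.Random()
    delay = base_delay
    for _ in range(max_attempts):
        server = rng.choice(servers)       # pick a server at random
        item = fetch(server)
        if item is not None:
            return item                    # got work; caller processes it and calls again
        sleep(delay)                       # queue was empty here: wait before retrying
        delay = min(delay * 2, max_delay)  # exponential backoff, capped
    return None


if __name__ == "__main__":
    # In-memory stand-in for three servers; only "k3" has anything queued.
    queues = {"k1": [], "k2": [], "k3": [b"job-1"]}

    def fake_fetch(server: str) -> Optional[bytes]:
        return queues[server].pop(0) if queues[server] else None

    print(poll_random_servers(list(queues), fake_fetch, base_delay=0.01))
```

Because every worker picks independently, no server needs to know about the others; the backoff just keeps idle workers from hammering empty queues.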

ebarlas commented 14 years ago

Would it make sense to select a kestrel server based on the hash of a queue name? It seems to me that would be most efficient, though you may end up with small queues on some kestrel servers and large queues on others.

robey commented 14 years ago

having a queue exist on only one (or even a subset) of the kestrels in a cluster would defeat the purpose of having the cluster, though.

ebarlas commented 13 years ago

It's been a year since this thread was active. Any news on ways to distribute kestrel? Preferred or best client policies?

robey commented 13 years ago

we still generally have clients pick a server at random in ruby, using kestrel-client (https://github.com/twitter/kestrel-client). in java/scala, we have most clients connect to every server at once and launch a timeout-get on each one to minimize latency. those use grabby-hands (https://github.com/jkalucki/grabby-hands).
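For the connect-to-every-server approach, a rough sketch of the idea might look like the following. This is not how grabby-hands is actually implemented; it is a thread-based illustration, and `fan_out_get`, `timed_get`, and `timeout_ms` are hypothetical names. `timed_get` stands in for a blocking read with a server-side timeout (presumably something like the `/t=` option mentioned earlier in this thread). Every non-empty reply is kept, since each one is a message that has already been taken off a queue.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, List, Optional, Sequence, Tuple


def fan_out_get(
    servers: Sequence[str],
    timed_get: Callable[[str, int], Optional[bytes]],  # hypothetical blocking read
    timeout_ms: int = 250,                              # assumed per-read timeout
) -> List[Tuple[str, bytes]]:
    """Issue one timed read per server in parallel; return all (server, item) hits."""
    if not servers:
        return []
    results: List[Tuple[str, bytes]] = []
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        # One outstanding read per server, so whichever queue has data answers first.
        futures = {pool.submit(timed_get, s, timeout_ms): s for s in servers}
        for future in as_completed(futures):
            item = future.result()
            if item is not None:
                results.append((futures[future], item))
    return results


if __name__ == "__main__":
    # In-memory stand-in for three servers; "k2" is empty.
    queues = {"k1": [b"job-1"], "k2": [], "k3": [b"job-2"]}

    def fake_timed_get(server: str, timeout_ms: int) -> Optional[bytes]:
        return queues[server].pop(0) if queues[server] else None

    for server, item in fan_out_get(list(queues), fake_timed_get):
        print(server, item)
```

The trade-off versus random picking is more open connections per client in exchange for not waiting on an empty server while another one has work.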

is that what you meant?

ebarlas commented 13 years ago

Yeah, that's exactly what I was wondering. I will look into those clients.