@adamw @ghostbuster91 I believe this configuration isn't optimal for scaling, given that right now producers and consumers risk being paired with different live brokers, forcing message redistribution among brokers (and replication too): we're still working on a solution that would correctly suggest which broker to connect to, i.e. affinity.
IMO, with the number of clients in the test, a single live broker is enough, while the others are there just to guard against a split-brain scenario. I hope that helps: that's probably why the results are so different from 3 years ago :)
I suggest turning off message-load-balancing and setting a non-existent bogus address name as the one to be redistributed (to avoid notifications moving across cluster nodes); it should scale way better than just using several nodes.
So a setup with a single live broker, single backup and one standby node to resolve splits would perform better than 3 live-backup pairs working in parallel?
> with a single live broker, single backup and one standby node to resolve splits would perform better than 3 live-backup pairs working in parallel?
In general yes, in the absence of a smart (or ad-hoc) partitioning of work/clients among broker nodes: this is explained well by http://www.perfdynamics.com/Manifesto/USLscalability.html, which shows that the cross-talk penalty associated with coherence communications (i.e. the message redistributions across nodes, acks, notifications, etc.) can lead to exponentially diminishing returns in scalability, e.g. throughput that peaks and then degrades as more nodes are added.
That looks very similar to what we see in your bench results :)
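For reference, here's a rough sketch of the model behind that link (Gunther's Universal Scalability Law); the coefficient values are purely illustrative assumptions, not measurements from these runs:

```python
# Gunther's Universal Scalability Law: sigma models contention,
# kappa models the coherence (cross-talk) penalty. The kappa*N*(N-1)
# term grows quadratically with the node count, so relative capacity
# flattens and can even drop as nodes are added.
def usl_capacity(n: int, sigma: float = 0.05, kappa: float = 0.02) -> float:
    """Relative capacity C(N) of an N-node cluster under the USL."""
    return n / (1 + sigma * (n - 1) + kappa * n * (n - 1))

if __name__ == "__main__":
    for n in (1, 2, 3, 6, 12):
        print(f"N={n:2d}  C(N) = {usl_capacity(n):.2f}")
```

With these made-up coefficients, capacity peaks at around 7 nodes and then declines, which is the shape of the curve the manifesto describes.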
Related to this part of your comment:
> with a single live broker, single backup and one standby node
Right now, with the current quorum vote implementation, we are forced to have 3 live pairs, because backups won't participate in the vote. This is something we're addressing for the next release, using a different quorum algorithm. Hence the suggested topology would be 3 lives, no message redistribution among them, and a single backup serving one specific live. Clients should just connect to a single live and only move to the backup if that live fails.
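For example, a sketch of what that could look like with the Artemis client connection URI (the hostnames and ports below are placeholders): each client is pointed at one specific live plus its backup, rather than at the whole cluster:

```
(tcp://live1:61616,tcp://backup1:61616)?ha=true&reconnectAttempts=-1
```

That way a client only fails over within its own live/backup pair instead of hopping to another live.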
I'll let @michaelandrepearce comment if he has other useful suggestions too :)
Thanks for the link and the explanation :)
> Right now, with the current quorum vote implementation, we are forced to have 3 live pairs, because backups won't participate in the vote.
Ok, so if we are testing current versions, the setup we have is "correct" if we want to have data replication & a split-brain-safe cluster?
> Ok, so if we are testing current versions, the setup we have is "correct" if we want to have data replication & a split-brain-safe cluster?
Yes, although the backups on the other "witness" lives are not necessary (but can still exist) if you don't plan to distribute any work to them; they only participate in the quorum vote. And it's important to mention that message (re)distribution among lives shouldn't happen, as I've mentioned in the previous comment (nor should clients connect to arbitrary live nodes of the cluster).
> And it's important to mention that message (re)distribution among lives shouldn't happen, as I've mentioned in the previous comment (nor should clients connect to arbitrary live nodes of the cluster).
Anything we should change in the current config to prevent that?
@adamw Yes, the `address` param on `cluster-connection` should be set to some impossible, non-existent address, to prevent bindings and other notifications from being distributed between cluster nodes, and using `OFF` as the load-balancing policy too should do the trick!
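For the record, a minimal sketch of what that could look like in `broker.xml` (the bogus address and connector names below are placeholders, not the actual mqperf values):

```xml
<cluster-connections>
   <cluster-connection name="mqperf-cluster">
      <!-- bogus address: nothing ever matches it, so bindings and other
           notifications are not propagated between the cluster nodes -->
      <address>does-not-exist</address>
      <connector-ref>netty-connector</connector-ref>
      <!-- never move messages between live brokers -->
      <message-load-balancing>OFF</message-load-balancing>
      <static-connectors>
         <connector-ref>other-live-connector</connector-ref>
      </static-connectors>
   </cluster-connection>
</cluster-connections>
```

The lives stay clustered (so the quorum vote still works), but they stop forwarding messages and binding notifications to each other.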
@adamw Let me know if the explanation was enough; I can try to help validate the config too :+1:
@adamw Any news on https://github.com/softwaremill/mqperf/pull/47#issuecomment-779791673? Is there something I can do to help, e.g. sending a PR to fix this?
I would suggest validating if this actually improves perf for your setup.