EndPointCorp / end-point-blog

End Point Dev blog
https://www.endpointdev.com/blog/

Comments for Common Topics in Scalability #246

Open phinjensen opened 7 years ago

phinjensen commented 7 years ago

Comments for https://www.endpointdev.com/blog/2010/01/common-topics-in-scalability/ By Ethan Rowe

To enter a comment:

  1. Log in to GitHub
  2. Leave a comment on this issue.

phinjensen commented 7 years ago
original author: Greg Sabino Mullane
date: 2010-01-04T22:18:58-05:00

"The total number of slave databases is likely limited by the capacity of the master; each additional slave adds some overhead to the master, so diminishing returns eventually kick in."

To expand on the "likely" caveat offered above: the limitation can be greatly raised by using cascading slaves (at least in Bucardo). For example, a master database could replicate to five slaves, which in turn each go to 10 other slaves, for a total of 56 database boxes that one can query, with minimal impact on the master.
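The arithmetic behind that figure can be sketched in a few lines of Python. This is only a counting aid, not anything Bucardo provides; the `cascade_total` helper and its `fanouts` parameter are hypothetical names introduced for illustration.

```python
def cascade_total(fanouts):
    """Count every database in a cascading replication tree, master included.

    fanouts[i] is the number of replicas fed by each node in tier i
    (tier 0 is the master). Hypothetical helper, for illustration only.
    """
    total = 1       # the master itself
    tier_size = 1   # number of nodes in the current tier
    for fanout in fanouts:
        tier_size *= fanout   # every node in this tier feeds `fanout` replicas
        total += tier_size
    return total


# Master -> 5 slaves -> 10 slaves each: 1 + 5 + 50 boxes.
print(cascade_total([5, 10]))  # prints 56

# The same 56 boxes in a flat layout would need the master to feed 55 slaves directly.
print(cascade_total([55]))  # prints 56
```

In the cascading layout the master replicates to only five targets, which is where the "minimal impact on the master" comes from.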

phinjensen commented 7 years ago
original author: Steph Powell
date: 2010-01-04T23:14:56-05:00

Great article.

phinjensen commented 7 years ago
original author: Ethan Rowe
date: 2010-01-04T23:42:39-05:00

Greg: good point. The cascading slaves strategy theoretically brings about a larger inconsistency window, but that window should still presumably be pretty small.

Perhaps the means of replication bears mentioning here. A point of contrast between MySQL's stock master/slave replication and Bucardo on Postgres (and I believe this applies to Londiste and Slony as well; one of my illustrious colleagues will no doubt correct me) is that Bucardo replication uses triggers to manage metadata about state changes in each replicated table. Consequently, write operations to your main tables bring additional writes (on the bucardo tables). Replication events also bring writes. MySQL uses a binary log that amounts to a sequential log of statements executed; there is no "replication event", with slaves merely pulling in the latest information from the binary logs and replaying those statements locally. In the absence of metrics, I'm guessing that MySQL can likely support a higher number of slaves than the trigger-based Postgres solutions would before seeing diminishing returns.

So simply adding master/slave replication is more complex than merely offloading reads. In the Postgres space, it probably means adding write overhead to the master database.

That may sound like I'm knocking the Postgres replication solutions out there; I'm not. I've used Bucardo and I've used MySQL's master/slave replication. I'd take Bucardo any day for any serious database. You get way more control: the customcode hooks, the opportunity to introduce business logic into the proceedings to transform or aggregate the data on its way to the target database, and so on. These things make a tool like Bucardo a part of your enterprise and not just a database replication solution.

Oh, wait. Oracle is going to end Postgres. We're all doomed.

phinjensen commented 7 years ago
original author: Steven Jenkins
date: 2010-01-06T09:50:05-05:00

Very good article. I would add another item, though, along the lines of "understand the impact of hardware". Sometimes additional hardware (added either horizontally or vertically) can make a significant difference, and other times it won't. Often, system vendors focus on selling additional hardware, and that's not always the right move.