apache / couchdb

Seamless multi-master syncing database with an intuitive HTTP/JSON API, designed for reliability
https://couchdb.apache.org/
Apache License 2.0

Support "very wide" clusters #1527

Open wohali opened 6 years ago

wohali commented 6 years ago

@janl:

(probably a variant on cluster-aware clients): scaling clustered CouchDB out to "very large" operations per second is an interesting experience.

The basic premise that clusters contain all of the database is nice in that as you create copies of your data you get always-close behaviors, but there's a limit to individual cluster performance.

Depending on the load balancing involved in reaching your database, how spread out your installation and clients are, and how much hot-spotting you experience, you can much more easily end up in cases where reads/writes have to be shunted to another cluster, and because of replication delay apps start to see strange behavior.

One thing I've considered doing recently is putting a transparent routing proxy on top of CouchDB that takes a provisioning configuration to locate the clusters containing a specific database, uses consistent hashing to spread keyspace transactions into predictable buckets, and hints back to the client which cluster buckets were used.

While not necessarily a Couch-specific feature, and still subject to its own nuances, it's a useful lesson in scaling toward the inevitable limits of 2.0 installations.
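The routing-proxy idea above can be sketched with a classic consistent-hash ring. This is a minimal illustration, not part of CouchDB: the cluster URLs and vnode count are made up, and a real proxy would forward HTTP requests rather than just return a cluster name.

```python
import bisect
import hashlib


class ConsistentHashRouter:
    """Map database names onto cluster 'buckets' via a hash ring.

    Sketch only: cluster URLs and the vnode count are illustrative
    assumptions, not a CouchDB feature.
    """

    def __init__(self, clusters, vnodes=64):
        # Each cluster is placed on the ring many times (virtual nodes)
        # so the keyspace spreads evenly across clusters.
        self._ring = []  # sorted list of (hash_point, cluster) pairs
        for cluster in clusters:
            for i in range(vnodes):
                self._ring.append((self._hash(f"{cluster}-{i}"), cluster))
        self._ring.sort()

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def route(self, db_name):
        """Return the cluster bucket responsible for db_name."""
        point = self._hash(db_name)
        # First ring entry at or past the key's hash; wrap around at the end.
        idx = bisect.bisect(self._ring, (point,))
        if idx == len(self._ring):
            idx = 0
        return self._ring[idx][1]


router = ConsistentHashRouter(["https://c1.example", "https://c2.example"])
print(router.route("mydb"))  # same db name always routes to the same cluster
```

Because only the ring positions adjacent to a new cluster's vnodes move when the cluster list changes, adding a cluster remaps only a fraction of databases, which is the "predictable buckets" property the comment describes.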

@wohali:

Some of this may be possible with PSE and in-memory Couch clones, or some sort of Redis compatibility layer. A lot of big installs that I know of have a Redis write-through cache that is hit first, which improves Couch performance drastically.
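The write-through pattern mentioned here is straightforward to sketch. In this illustration plain dicts stand in for Redis (the fast layer) and CouchDB (the durable layer); a real deployment would use a Redis client and HTTP calls instead.

```python
class WriteThroughCache:
    """Write-through cache in front of a slower document store.

    Sketch only: `cache` and `store` are dicts standing in for Redis
    and CouchDB respectively.
    """

    def __init__(self):
        self.cache = {}  # fast layer, hit first on reads
        self.store = {}  # durable layer

    def write(self, doc_id, doc):
        # Write-through: persist to the durable store first, then the
        # cache, so the cache never holds data the store hasn't accepted.
        self.store[doc_id] = doc
        self.cache[doc_id] = doc

    def read(self, doc_id):
        # Hit the cache first; on a miss, fall back to the store and
        # repopulate the cache for subsequent reads.
        if doc_id in self.cache:
            return self.cache[doc_id]
        doc = self.store.get(doc_id)
        if doc is not None:
            self.cache[doc_id] = doc
        return doc
```

Hitting the cache first is what gives the drastic read-performance improvement described above; the write-through ordering keeps the two layers from diverging.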

WillemvdW commented 5 years ago

@wohali is there documentation available on the Redis write-through cache implementation? We are currently busy implementing a Redis-based transaction manager across documents, so that all documents in a workflow transaction are either saved or rolled back. Are there others implementing such functionality that we can share with and learn from?

wohali commented 5 years ago

@WillemvdW Sorry, company proprietary information I'm not at liberty to disclose. I'll poke some of the others involved in the work and see if they can respond publicly here.

I can confirm that none of that work is transactional in nature, though - the granularity is a single document at a time. CouchDB isn't a good store if you need relational data to be absolutely consistent - we are an AP system, not a CP system - they don't call it NoSQL for nothing, you know ;)

WillemvdW commented 5 years ago

Thanks @wohali. We would be interested in working and sharing with others in these two areas. On the transactional work: we are actually dealing with sets of documents related to a workflow. We are not interested in relational semantics directly, but in having the whole set of documents save consistently or not at all.

I'll give a use case. A workflow control document and 4 data documents that form part of the workflow action need to be saved. In a poor network environment, the network fails before the workflow control document, the last step, has been saved. In that case, they must all roll back.

Our solution supports offline and online work, so the use case can get more complicated if you consider some data being updated online but other data arriving through replication. It is possible that two users update the same workflow, one offline and the other online. If there is a conflict on one document, then we experience data inconsistency for the whole workflow.

I don't believe it is possible to create sophisticated apps that always work one document at a time. But we can easily batch them. Now we are making sure that, whatever the reason, we always save the whole batch, or if there is a failure we roll them all back, whether the documents were processed offline or online.
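The client-side all-or-nothing batch described above can be sketched with a compensating rollback: remember each document's prior state before writing it, and restore everything if any write fails. This is an illustration of the idea only - `store` is a dict standing in for a document database, and the `"fail"` key is a made-up hook for simulating a network failure, not anything in CouchDB's API.

```python
_MISSING = object()  # sentinel for "document did not exist before"


def save_batch(store, docs):
    """Save all docs or none: on any failure, restore prior state.

    Sketch only: `store` is a dict-like stand-in for a document
    database; a `"fail"` key in a doc simulates a mid-batch failure.
    """
    saved = {}  # doc_id -> previous value (or _MISSING)
    try:
        for doc_id, doc in docs.items():
            saved[doc_id] = store.get(doc_id, _MISSING)
            if doc.get("fail"):  # hypothetical failure-injection hook
                raise IOError(f"failed saving {doc_id}")
            store[doc_id] = doc
    except IOError:
        # Compensating rollback: undo every write made in this batch.
        for doc_id, prev in saved.items():
            if prev is _MISSING:
                store.pop(doc_id, None)
            else:
                store[doc_id] = prev
        return False
    return True
```

Note this only compensates for failures the client observes; concurrent offline edits that later replicate in as conflicts, as in the two-user scenario above, would still need conflict resolution on top of this.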

wohali commented 5 years ago

@WillemvdW This is the wrong place for this discussion - please take it to one of these other options: