apache / trafficcontrol

Apache Traffic Control is an Open Source implementation of a Content Delivery Network
https://trafficcontrol.apache.org/
Apache License 2.0

Add support for separate read only TODB connection #6540

Open jhg03a opened 2 years ago

jhg03a commented 2 years ago

This Feature Request affects these Traffic Control components:

Description

I'd like to see TO be able to leverage a second TODB connection string for read operations. This would allow TO to shed large portions of load from the single authoritative write TODB instance to multiple separate, possibly geographically closer, read-only (RO) replica TODB instances. In addition to taking that load away, it provides a clearer and more enforceable line between read and write operations within TO. Lastly, it provides a partial failure mode in which TO can still partially operate should the single authoritative TODB instance become unavailable.

rawlinp commented 2 years ago

This has lots of implications in terms of data consistency, write-then-read, etc. Since most of the TODB load comes from caches, ideally I would like us to pursue something like https://github.com/apache/trafficcontrol/pull/4708 first, before we start looking at DB read replicas, because that would allow us to keep strong consistency and remove nearly all load on TODB that comes from caches.

jhg03a commented 2 years ago

If the concern is replication lag between Postgres instances, you can always use synchronous replicas instead of asynchronous ones, and that goes away. That said, if you have reads and writes in the same transaction, it could go to the write host. I'd also wager that the one-way sync between Postgres instances is faster than a second round trip from TO. This isn't mutually exclusive with #4708, nor a statement that we shouldn't do that. The point is to leverage the built-in scaling aspects of Postgres more effectively rather than trying to shape the load downward via data model and interaction changes. This wouldn't require the sql queries being issued by TO to change at all, just a more intentional choice of if it were modifying or not.

rawlinp commented 2 years ago

> This wouldn't require the sql queries being issued by TO to change at all, just a more intentional choice of if it were modifying or not.

It might not require changing the queries themselves, but it would definitely change how every single query is made, which is quite an extensive change to the codebase. It would also impose extra overhead on developers, who would have to follow the read vs. write vs. read+write convention it creates, and it would lead to bugs from using the wrong DB handle.