Open afaruga-atlassian opened 2 years ago
The special logic should be reverted.
I'd be careful with that. In our product, the change would cause a drastic efficiency drop and could lead to an incident. In other words, customers may rely on this feature even if it's buggy and never should be shipped.
Exposing a new API is one thing, and deprecating + unshipping buggy feature is another.
I see the whole change as:
I totally agree that change should be properly tested, rolled out carefully, and be backward compatible.
It could be reproduced and initially, even an extra API with observability could be added to gather more data about leaked connections and set
operations.
Currently, db-replica implements special logic for Postgres specific commands
set
. It contains a bug- the connections released back to the pool don't have cleared set operations. E.g set session timeout isn't rolled back, and as a result, it leaks betweenMAIN
andREPLICA
connections, which results in different than expected timeouts on connections.The special logic should be reverted. The new API should allow customers to define connection lifecycle logic on db-specific operations. It has to sustain current db-time effectiveness (% of traffic moved from MAIN to REPLICA) Specific operations should pass them between connections according to whatever db-replica is doing(connection state) to achieve an effective split. Such operations should be set per dual-connection lifecycle for all connections taken from provider, which have been actually used during this lifecycle.
Workaround: Currently if you use Postgress
set
commands, you have to cleanset
operations while releasing it to the pool.