coinbase / mongobetween

Apache License 2.0

Fake an InterruptedDueToStepDown error on upstream topology-affecting errors #22

Closed mdehoog closed 3 years ago

mdehoog commented 3 years ago

Fixes #16.

Returns a fake InterruptedDueToStepDown error (code 11602) to the downstream application if the upstream server causes errors that would change the topology. This is to ensure that the application's view of the mongobetween server remains unchanged, and the connection pool isn't drained for no reason.

{
  "ok": 0,
  "code": 11602,
  "codeName": "InterruptedDueToStepDown",
  "errmsg": "interrupted due to shutdown"
}

divjotarora commented 3 years ago

I've only skimmed the code so far, but I have some thoughts based on the description:

  1. Per this list of error codes, 11602 is InterruptedDueToReplStateChange, not InterruptedDueToStepDown.

  2. 11602 is considered a "node is recovering" error by the SDAM spec's error handling section. This is more generally considered a state change error, which will force the app servers to mark mongobetween Unknown, which I think we want to avoid. State change errors clear the server's connection pool if they are shutdown errors or if they come from a pre-4.2 server. 11602 is not in the list of shutdown error codes, but mongobetween's isMaster responses have maxWireVersion=7, which corresponds to 4.0, so this error will cause the app servers to clear their connection pools.
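
The pool-clearing rule in point 2 can be sketched roughly as follows. This is a simplified reading of the SDAM error handling rules, not code from any driver: the function names are hypothetical, the code lists are as I understand them from the spec, and details such as error-message matching, topologyVersion staleness checks, and network errors are left out.

```go
package main

import "fmt"

// wireVersion42 is the maxWireVersion reported by MongoDB 4.2;
// 7 corresponds to 4.0, as noted above.
const wireVersion42 = 8

// Code sets as I understand them from the SDAM spec (assumption).
var (
	nodeIsRecovering   = map[int]bool{11600: true, 11602: true, 13436: true, 189: true, 91: true}
	notWritablePrimary = map[int]bool{10107: true, 13435: true, 10058: true}
	shutdownCodes      = map[int]bool{11600: true, 91: true}
)

// isStateChangeError reports whether a driver would treat the code as a
// state change error and mark the server Unknown.
func isStateChangeError(code int) bool {
	return nodeIsRecovering[code] || notWritablePrimary[code]
}

// clearsPool reports whether a driver would also clear the connection
// pool: only for state change errors that are shutdown errors or that
// come from a pre-4.2 server.
func clearsPool(code, maxWireVersion int) bool {
	if !isStateChangeError(code) {
		return false
	}
	return shutdownCodes[code] || maxWireVersion < wireVersion42
}

func main() {
	fmt.Println(clearsPool(11602, 7)) // true: pre-4.2 wire version
	fmt.Println(clearsPool(11602, 8)) // false: not a shutdown code on 4.2+
	fmt.Println(clearsPool(11600, 8)) // true: shutdown code
}
```

Under these assumptions, advertising maxWireVersion=7 is what makes 11602 clear the pool; the same code from a server reporting 8 or higher would mark the server Unknown but leave the pool alone.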