coinbase / mongobetween

Apache License 2.0
111 stars 47 forks source link

Copy server error processing from driver #8

Closed mdehoog closed 4 years ago

mdehoog commented 4 years ago

We weren't calling server.ProcessError on any round trip errors. This means that any connection errors wouldn't mark the server as unknown, extending failures.

This PR adds the server.ProcessError call, and duplicates the wrapping of the error in a driver.Error just as driver.Operation does in the Go driver.

divjotarora commented 4 years ago

Hi @mdehoog,

I work on the Go Driver at MongoDB and saw this PR open up. I agree that this could cause prolonged failures and requests to nodes that aren't available. A few small questions/thoughts that came to mind:

  1. If it's possible to encounter a network timeout in the mongobetween proxy, I'd recommend bumping the driver version to v1.3.4 to pull in the change for https://jira.mongodb.org/browse/GODRIVER-1620. Before that fix, the driver was marking the server as Unknown and clearing its connection pool for all network errors, including timeouts. This caused a lot of connection churn if a timeout happened and the pool had a lot of connections. I don't think this affects the mongobetween proxy because the context used for the WriteWireMessage/ReadWireMessage calls is a cancel-able version of context.Background() and I don't see any calls to ClientOptions.SetSocketTimeout in the code, so network timeouts shouldn't be possible, but I figured I'd bring it up just in case.

  2. topology.Server.ProcessError accounts for some special command errors like NotMaster, NodeIsRecovering, ShutdownInProgress, etc, in addition to connection errors. Is this handling important for the proxy?

  3. Is it possible to write a test for this? The driver tests this using connection pool monitoring in https://github.com/mongodb/mongo-go-driver/blob/v1.3.4/mongo/integration/sdam_error_handling_test.go.

-- Divjot

mdehoog commented 4 years ago

Hey @divjotarora, thanks for the review and comment, glad to have you looking at mongobetween!

  1. I noticed that the topology.ConnectionError wasn't being unwrapped correctly in unwrapConnectionError, but hadn't realized that this was fixed in v1.3.4. Thanks for the heads up. Updated the driver to 1.3.4 in #9 and rebased this branch
  2. Yes I think it would be helpful to have this additional handling 👍
  3. Added a test, thanks for the reference.
divjotarora commented 4 years ago

Sorry if this wasn't clear, but point (2) in my previous comment was trying to say that the *Server.ProcessError function will handle things besides connection errors, but it's use in this PR will not. That is because it's called right after roundTrip but before Decode, so it will only ever be called for connection errors.

EDIT: After looking a bit more at the Decode function in mongobetween, I don't think simply moving the ProcessError call to be after Decode is enough. Contrast the case for OP_REPLY in mongobetween's Decode (https://github.com/coinbase/mongobetween/blob/master/mongo/operations.go#L45) with the same case in the driver's decodeResult (https://github.com/mongodb/mongo-go-driver/blob/v1.3.2/x/mongo/driver/operation.go#L1171). The driver's function first decodes the wire message and then calls extractError on the response document to get the exact error. The function in mongobetween skips the second step, calling ProcessError wouldn't actually process errors conveyed by the server in a well-formed response document.

mdehoog commented 4 years ago

Makes sense. I'll look at duplicating any relevant extractError functionality as a follow-up.