getlantern / http-proxy-archived

HTTP Proxy with TLS support
Apache License 2.0

Suppress noisy dial errors #138

Closed: hwh33 closed this 3 years ago

hwh33 commented 3 years ago

The current top error in StackDriver is "i/o timeout": link.

The stack trace indicates that this is logged here, which means the source of the error is this call to the next filter. The RecordOp filter is only used by proxy code here, which means that the next filter would be the cleanheadersfilter created on the next line. A quick inspection reveals that the cleanheadersfilter does not generate errors of its own.

Looking at the use of the filter chain, we can see that, at least for non-WSS proxies, the cleanheadersfilter is the last in the chain. So the next function returning the error must be provided by whoever calls Apply on the filter chain. As seen in the previous link, the filter chain is provided to getlantern/http-proxy/server.New, where it is passed to getlantern/proxy.New. There, Apply is only ever called here, which means that next is initialized as nextCONNECT or nextNonCONNECT.
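To make the trace above easier to follow, here is a minimal sketch of how an error produced by the terminal next function travels back through a filter chain. The Filter, Next, Chain, recording, and passthrough types are simplified stand-ins invented for illustration; they are not the actual getlantern/proxy types or signatures.

```go
package main

import (
	"errors"
	"fmt"
)

// Next is the continuation a filter calls to hand the request onward.
// Hypothetical: requests and responses are plain strings here.
type Next func(req string) (string, error)

// Filter is a simplified stand-in for a proxy filter.
type Filter interface {
	Apply(req string, next Next) (string, error)
}

// Chain applies its filters in order, ending with the caller's next.
type Chain []Filter

func (c Chain) Apply(req string, next Next) (string, error) {
	if len(c) == 0 {
		return next(req)
	}
	return c[0].Apply(req, func(r string) (string, error) {
		return c[1:].Apply(r, next)
	})
}

// recording plays the role of a filter that logs errors from next.
type recording struct{ last error }

func (r *recording) Apply(req string, next Next) (string, error) {
	resp, err := next(req)
	if err != nil {
		r.last = err // this is where the error would be logged
	}
	return resp, err
}

// passthrough never generates errors of its own.
type passthrough struct{}

func (passthrough) Apply(req string, next Next) (string, error) {
	return next(req)
}

var errDial = errors.New("i/o timeout")

// runExample returns the error seen by the recording filter and the
// error returned to the caller of Chain.Apply.
func runExample() (observed, returned error) {
	rec := &recording{}
	chain := Chain{rec, passthrough{}}
	_, returned = chain.Apply("CONNECT example.com:443", func(string) (string, error) {
		return "", errDial // the terminal next, supplied by the caller
	})
	return rec.last, returned
}

func main() {
	observed, returned := runExample()
	fmt.Println(observed == returned, observed)
}
```

In this sketch no filter creates the error, so whatever the recording filter sees must have come from the next function passed to Apply, which mirrors the reasoning above.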

In the case of nextCONNECT, an error is only returned here (n.b. badGateway returns the error unmodified), which means the original "i/o timeout" came from the dial.

In the case of nextNonCONNECT, an error is only returned here, which means that the original "i/o timeout" came from the RoundTrip call (presumably from a Read call on the underlying transport connection). In either case, the error appears to originate in the net package, so errors.As should match it against a *net.OpError target.

The proposal in this PR is to suppress these network timeout errors as they are out of our control and therefore just noise. I'm still logging them on DEBUG so that we can see them in proxy logs if we'd like.
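As a rough sketch of the idea (not the code in this PR), the check could look something like the following. The helper names isNetTimeout, logLevelFor, sampleDialTimeout, and sampleOtherError are hypothetical, the log levels are plain strings rather than a real logger, and the extra Timeout() check is an assumption about how "network timeout" would be narrowed down.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
)

// isNetTimeout reports whether err wraps a *net.OpError whose
// Timeout method returns true. Checking Timeout() is an assumption.
func isNetTimeout(err error) bool {
	var opErr *net.OpError
	return errors.As(err, &opErr) && opErr.Timeout()
}

// logLevelFor picks the level an error would be logged at:
// DEBUG for network timeouts, ERROR for everything else.
func logLevelFor(err error) string {
	if isNetTimeout(err) {
		return "DEBUG"
	}
	return "ERROR"
}

// sampleDialTimeout builds a wrapped error shaped like a timed-out dial.
func sampleDialTimeout() error {
	opErr := &net.OpError{Op: "dial", Net: "tcp", Err: os.ErrDeadlineExceeded}
	return fmt.Errorf("bad gateway: %w", opErr)
}

// sampleOtherError builds an error that is not a network timeout.
func sampleOtherError() error {
	return errors.New("unexpected filter failure")
}

func main() {
	for _, err := range []error{sampleDialTimeout(), sampleOtherError()} {
		fmt.Printf("[%s] %v\n", logLevelFor(err), err)
	}
}
```

Because errors.As walks the wrap chain, the check still works if an intermediate layer wraps the error with %w before it reaches the logging filter.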

Side-note: the filters mechanism makes tracing these kinds of things pretty onerous. I wonder if we can come up with something better.

hwh33 commented 3 years ago

Thanks @benjojo!