mattwcole / gelf-extensions-logging

GELF provider for Microsoft.Extensions.Logging
MIT License
109 stars 42 forks source link

Message processing becomes a bottleneck when using the UDP client in an environment without DNS caching #50

Closed svenvanheugten closed 4 years ago

svenvanheugten commented 4 years ago

Every time GelfMessageProcessor processes a message, it calls GelfUdpClient.SendMessageAsync, which in turn calls UDPClient.SendAsync, which in turn does a DNS lookup on every call.

The impact of this becomes especially apparent when running in an environment without OS-level DNS caching, which is common when running in a container on Linux. In Kubernetes, for example, this implies a round-trip to a coredns pod (potentially not even running on the same node) for every log message, and as GelfMessageProcessor is single-threaded, this can cause logging to start lagging behind quite a bit under load. As the capacity of the message buffer is limited, this will at some point even start blocking application code.

Substituting the server's hostname with an IP address resolved these issues for us. I can wrap together an MR to solve this issue in the package itself, but I'm a bit torn on how much caching makes sense here.

mattwcole commented 4 years ago

Hi, thanks for reporting. I agree, it is wasteful to perform a DNS lookup for every request. Caching the lookup as you suggest sounds like a reasonable option. We could also just do the lookup once for simplicity - this would mirror how HttpGelfClient works since HttpClient doesn't respect DNS changes. I'd be interested to see how other popular logging libraries handle this.

As for the message buffer hitting capacity, this shouldn't block application code, however messages will be lost - the lesser of two evils in my opinion.

svenvanheugten commented 4 years ago

Hi, thanks for reporting. I agree, it is wasteful to perform a DNS lookup for every request. Caching the lookup as you suggest sounds like a reasonable option. We could also just do the lookup once for simplicity - this would mirror how HttpGelfClient works since HttpClient doesn't respect DNS changes. I'd be interested to see how other popular logging libraries handle this.

I looked at a few other logging libraries:

I was a bit worried how keeping around a bound UDP socket would affect fail-overs and load-balancing over different pods. I tested those things to be sure, and:

But it also suffices to just explicitly resolve the DNS entry in advance and pass it to SendAsync on the unbound UDPClient. Both solutions are probably reasonable.

As for the message buffer hitting capacity, this shouldn't block application code, however messages will be lost - the lesser of two evils in my opinion.

Whoops, sorry, didn't notice that bit of code. I agree.

mattwcole commented 4 years ago

Wow, great analysis, much appreciated! PR looks good, I'll get that merged and released as v2.0.1.