Closed ezrapagel closed 5 years ago
@ezrapagel Thanks for contributing this fix! It's overall looks good to me. Will take some extra testing and merge it.
Awesome. I have some cosmetic refactors that I'll submit a PR for later down the road. Is your load test suite something that you'd consider adding to the repo? It'd be useful even if it required some config outside of a test framework.
Since it's a real load testing so need extra work for handling the secrets in Ci before adding to public repo (so the traffic can go to our staging server and being verified with API) But sooner or later I will do it
This PR adds support for asynchronous buffered message sending.
In production services, we are regularly experiencing threadpool exhaustion due to queued
BufferedFlushingTask
s blocked on CLR worker threads. When sending apx 5-10k messages per minute, we produced a backlog of 2044 blocked threads on a Threadpool instance w/ a max of 2047. That's due toSumoLogicMessageSender.TrySend
invokingPostAsync
with a blockingResult
call, which can cause threadpool exhaustion when using the buffered sender under heavy load. This change removes the blockingResult
and surfaces the async Task to theTimer
, resulting in far reduced CLR worker thread-pool usage and slightly increased IOCP thread usage (10 and 20 respectively under the same load, averaged samples).Since there's no task execution order guaranteed, the unbuffered sender is left synchronous to prevent out-of-order message arrival. While the buffered sender also suffers from non-deterministic task ordering, it's no worse than the current threadpool dispatch model, and only a risk for very small
FlushingAccuracy
values. So don't do that.This code is running in production environments processing 5-15k messages per minute per endpoint using the log4net appender. No real-world testing was performed for either the NLog or SeriLog implementations. The NLog
FlushAsync
method needs to be reviewed by someone more familiar w/ NLog's async continuation mechanism.