jjchiw / gelf4net

GELF log4net Appender - graylog2
MIT License
63 stars 59 forks source link

Messages not sent and Context Variables mixed (Async) #63

Closed jlpedrosa closed 5 years ago

jlpedrosa commented 5 years ago

Hi!

We were doing some load testing when we realized that there was a significant amount of messages lost. After discarding the network, we have doing some sample code to serve as sort of test using only localhost. The test contains a dummy message parser to be able to conciliate the results.

We found that 60% of the messages don't arrive and also we detected that the LogicalThreadContext was being mixed between messages. I've tested both sync and sync implementations of the appender.

Am I missing something? wrong configuration?

Log4netplayground.zip

jjchiw commented 5 years ago

Hi...

I did some tests with graylog, using three appender udp, http and amqp

With udp and http there are some messages that are lost... I don't think that the library doesn't send them, I think it's just about the protocol and the amount of messages that we are sending to graylog using those inputs (udp, http) that graylog (or any server) can not handle that amount of messages....

With amqp all the messages arrived correctly...

About the LogicalThreadContext I think the test is incorrect :/ because the Task doesn't start at the same time and starts when the for has ended so i is the limit

for(int i=0; i < 10; i++)
{
    var task = Task.Run(() => Console.WriteLine(i));
}

this is the output

image

So, If you are sending an insanely concurrent ( :P ) amount of messages you should use other input for graylog, Gelf4net has amqp :)

I'll merge this PR https://github.com/jjchiw/gelf4net/pull/62 and will retry the test, I've faith in BufferedLogSender.cs

I'll write any update with the PR during the weekend

jlpedrosa commented 5 years ago

Hi

the reordering was do to "optimizations" of the compiler: https://medium.com/@Coder_HarryLee/how-to-capture-a-variable-in-c-and-not-to-shoot-yourself-in-the-foot-d169aa161aa6.

That solves the duplicated values, code same regarding async was correct....

I'm reproducing the tests with raw udp client/server insted of using gelf, I'd say packet loss gets slightly better, let me see if I can come up with some socket options to prevent message loss, will try to find out if messages are not being sent or not processed in the receive side. .

Definitely want fire and forget behaviour, but minimize the loss. Also in the past I found some funny behaviour in Windows sockets when the ip endpoint was not recheable at layer 2 and the client gets blocked for ARP resolution. see: https://issues.apache.org/jira/browse/LOG4NET-415

JL

jlpedrosa commented 5 years ago

Hi @jjchiw

I've been doing some tests with UDPClient and the gelf appender, both send all the messages, so the issue but be on the receiving side. Also the async context was the optimizer thing, so I think we are all done here. I will probably will create a new pull request for 415 of log4net on the new branch.

Thanks!