Closed IrisClasson closed 3 years ago
@IrisClasson Thank you for reporting this issue.
Is there any particular reason why this sink handles the retries instead of letting Serilog manage them? There seems to be some side effects that can be avoided.
A second problem is that the exceptions from bad requests are currently being swallowed. By letting Serilog do the retry the exception bubbles up, and is outputted to the SelfLog. This makes debugging easier, and avoids cluttering of the SelfLog bu adding manual entries. That seems to be the recommended way to do it.
I need to gather some information to answer to these two questions.
For example, CloseAndFlush on the logger doesn't seem to work which means the logs are lost if the application closes down before the batch is sent. We are assuming this is a side-effect of hijacking the retries.
Can you explain in which case CloseAndFlush
is not working?
Consider the following code:
Log.Logger = new LoggerConfiguration()
.WriteTo.DatadogLogs("API_KEY")
.CreateLogger();
Log.Information("Test serilog");
Log.CloseAndFlush(); // This is line 60
I set a breakpoint at https://github.com/DataDog/serilog-sinks-datadog-logs/blob/ebe50a2205e1fe334416e0d33ea54c6987a1aac3/src/Serilog.Sinks.Datadog.Logs/Sinks/Datadog/DatadogHttpClient.cs#L108
As you can see, the log is flushed when calling Log.CloseAndFlush();
which is located at line 60.
Thanks for replying and investigating!
CloseAndFlush() wasn't consistently sending the entries, but when trying to reproduce this today I was unable to do so.
Did you have a chance to investigate why the sink doesn't let SeriLog handle the retries? If you make the request fail and enable SelfLog you won't see any exceptions outputted.
@IrisClasson ,
Our sink calls SelfLog
when it cannot send a message. See https://github.com/DataDog/serilog-sinks-datadog-logs/blob/ebe50a2205e1fe334416e0d33ea54c6987a1aac3/src/Serilog.Sinks.Datadog.Logs/Sinks/Datadog/DatadogHttpClient.cs#L119
Consider the following code:
static void Main(string[] args)
{
Serilog.Debugging.SelfLog.Enable(msg => Console.WriteLine(msg));
Log.Logger = new LoggerConfiguration()
.WriteTo.DatadogLogs(API_KEY, configuration: new DatadogConfiguration{ Url = "InvalidHost"})
.CreateLogger();
Log.Information("Test serilog");
Log.CloseAndFlush();
}
It displays
2020-09-10T10:11:08.6100920Z Could not send payload to Datadog: [{"Timestamp":"2020-09-10T12:08:08.445887+02:00","level":"Information","MessageTemplate":"Test serilog","message":"Test serilog","ddsource":"csharp"}]
As MaxBackoff
is 30 and MaxRetries
is 10, it takes around 300 seconds to display an error.
Is your issue the formating of the message? https://github.com/DataDog/serilog-sinks-datadog-logs/blob/ebe50a2205e1fe334416e0d33ea54c6987a1aac3/src/Serilog.Sinks.Datadog.Logs/Sinks/Datadog/DatadogHttpClient.cs#L119
Do you want to have a way to set MaxRetries
to see an error message faster?
DatadogSink
implements PeriodicBatchingSink
which handles the retries at https://github.com/serilog/serilog-sinks-periodicbatching/blob/313598fd6c0747489c0fdf3949238dd75e60dae0/src/Serilog.Sinks.PeriodicBatching/Sinks/PeriodicBatching/PeriodicBatchingSink.cs#L209-L258. PeriodicBatchingSink
considers a batch (IEnumerable<LogEvent>
) as sent or not sent. For a single batch, DatadogSink
can sent several HTTP requests, some of them can succeed and others can failed. Throwing an exception in this case lead to retry the full batch (IEnumerable<LogEvent>
) and so sending several times the same logs.
As the retry logic of PeriodicBatchingSink
cannot be configured, is there a reason to use it instead of the one in DatadogSink
?
Thank you for the reply @ogaca-dd . I saw the recent merge and that fixes the exception handling, which solves the problem for us. I'll close this issue.
Appreciate the help!
Hello again and thanks for the work on the sink! We are quite keen on using this sink at work but have noticed a few things that brought up some questions/concerns.
Is there any particular reason why this sink handles the retries instead of letting Serilog manage them? There seems to be some side effects that can be avoided.
For example, CloseAndFlush on the logger doesn't seem to work which means the logs are lost if the application closes down before the batch is sent. We are assuming this is a side-effect of hijacking the retries.
A second problem is that the exceptions from bad requests are currently being swallowed. By letting Serilog do the retry the exception bubbles up, and is outputted to the SelfLog. This makes debugging easier, and avoids cluttering of the SelfLog bu adding manual entries. That seems to be the recommended way to do it.
I'll push a PR with an example.