serilog-contrib / serilog-sinks-splunk

A Serilog sink that writes to Splunk
https://splunk.com
Apache License 2.0

Compress http content #123

Closed edumserrano closed 2 years ago

edumserrano commented 4 years ago

Hi,

I had a custom implementation for sending logs to Splunk, and I was sending a large volume of logs from AWS EC2 instances through a NAT gateway, where not all of the EC2 instances were in the same region as the NAT gateway. As a result, I was being charged a significant amount just in data transfer costs.

We eventually changed the networking topology for our solution, but as an initial step, what helped reduce costs significantly was compressing the content before sending it to Splunk.

So our code to send to Splunk with compression looked like:

private async Task<HttpResponseMessage> Send(string splunkPayload)
{
    using (var request = new HttpRequestMessage(HttpMethod.Post, _url))
    {
        request.Headers.Authorization = new AuthenticationHeaderValue("Splunk", _hostConfiguration.Token);
        using (var compressedStream = await CompressWithGzipAsync(splunkPayload))
        {
            request.Content = new StreamContent(compressedStream);
            request.Content.Headers.Add("Content-Type", "application/json; charset=utf-8");
            request.Content.Headers.Add("Content-Encoding", "gzip");
            var response = await _httpClient.SendAsync(request, HttpCompletionOption.ResponseHeadersRead, CancellationToken.None);
            return response;
        }
    }

    async Task<MemoryStream> CompressWithGzipAsync(string plaintext)
    {
        var output = new MemoryStream();
        using (var input = new MemoryStream(Encoding.UTF8.GetBytes(plaintext)))
        {
            using (GZipStream compressor = new GZipStream(output, CompressionLevel.Optimal, leaveOpen: true)) //disposing GZipStream guarantees a flush is made and all data is copied to the output stream
            {
                await input.CopyToAsync(compressor);
            }
        }
        output.Seek(0, SeekOrigin.Begin); // after a flush has been guaranteed by the dispose (could be explicit flush though) make sure to position the stream in the beginning
        return output;
    }
}

The important part is the CompressWithGzipAsync method, where the Splunk payload (which in my custom implementation, and I believe in serilog-sinks-splunk as well, can be one or many batched messages) is compressed, as opposed to just doing https://github.com/serilog/serilog-sinks-splunk/blob/0bc82f5492154b24b5054abe5809df1b525662ec/src/Serilog.Sinks.Splunk/Sinks/Splunk/EventCollectorRequest.cs#L33

Is this an enhancement that you believe is worthwhile adding?

I can't speak to compatibility across all Splunk versions, i.e. whether they all accept compressed content, or whether something needs to be enabled on the Splunk side to accept it. However, I believe this is a worthwhile enhancement that could be toggled on or off via an extra parameter on the LoggerSinkConfiguration.EventCollector methods. That is, add a "bool compressContent" parameter to the SplunkLoggingConfigurationExtensions methods such as: https://github.com/serilog/serilog-sinks-splunk/blob/0bc82f5492154b24b5054abe5809df1b525662ec/src/Serilog.Sinks.Splunk/SplunkLoggingConfigurationExtensions.cs#L55-L71

HakanL commented 4 years ago

It should be possible to add this using the HttpMessageHandler "override" in the constructor, instead of a custom implementation in the sink. See this link for an example implementation.
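A minimal sketch of that approach, assuming the sink's constructor accepts a custom HttpMessageHandler: a DelegatingHandler that gzip-compresses the outgoing request body and sets the Content-Encoding header, so the sink itself needs no changes. The class name GzipCompressingHandler is illustrative, not part of the sink's API.

```csharp
using System.IO;
using System.IO.Compression;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Illustrative handler: wraps the inner handler and gzips any request body
// before it goes out on the wire.
public class GzipCompressingHandler : DelegatingHandler
{
    public GzipCompressingHandler(HttpMessageHandler inner) : base(inner) { }

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        if (request.Content != null)
        {
            var output = new MemoryStream();
            using (var gzip = new GZipStream(output, CompressionLevel.Optimal, leaveOpen: true))
            {
                // Copy the original (uncompressed) body through the gzip stream;
                // disposing the GZipStream flushes all remaining data to 'output'.
                await request.Content.CopyToAsync(gzip);
            }
            output.Seek(0, SeekOrigin.Begin);

            var compressed = new StreamContent(output);
            // Preserve the original content headers (e.g. Content-Type).
            foreach (var header in request.Content.Headers)
            {
                compressed.Headers.TryAddWithoutValidation(header.Key, header.Value);
            }
            compressed.Headers.ContentEncoding.Add("gzip");
            compressed.Headers.ContentLength = output.Length;
            request.Content = compressed;
        }
        return await base.SendAsync(request, cancellationToken);
    }
}

// Usage sketch: pass the handler to HttpClient (or to the sink, if it exposes
// a messageHandler parameter in its constructor/extension overloads).
// var client = new HttpClient(new GzipCompressingHandler(new HttpClientHandler()));
```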

merbla commented 2 years ago

Closing as part of the larger Serilog contrib reorg.

Checkout serilog/serilog#1627