serilog-contrib / serilog-sinks-elasticsearch

A Serilog sink that writes events to Elasticsearch
Apache License 2.0
434 stars 196 forks source link

Empty lines consumed by ElasticsearchPayloadReader lead to NullReferenceException #412

Open andrey-kozlov-skuvault opened 2 years ago

andrey-kozlov-skuvault commented 2 years ago

A few questions before you begin:

Is this an issue related to the Serilog core project or one of the sinks or community projects.
This issue list is intended for Serilog Elasticsearch Sink issues. If this issue relates to another sink or to the code project, please log on the related repository. Please use Gitter chat and Stack Overflow for discussions and questions.

Does this issue relate to a new feature or an existing bug?

What version of Serilog.Sinks.Elasticsearch is affected? Please list the related NuGet package. v8.4.1

What is the target framework and operating system? See target frameworks & net standard matrix.

Please describe the current behavior? There is a possibility to get NullReferenceException in PostData.Write from the payload provided by the ElasticsearchPayloadReader, in case e.g. file contains empty lines. PostData.Write is called in ElasticSearchLogShipper.

Please describe the expected behavior? There should be no exception. And in general, I think there is no sense to add empty or whitespace lines to the Elasticsearch anyway, so need to skip them.

If the current behavior is a bug, please provide the steps to reproduce the issue and if possible a minimal demo of the problem This test shows the problem - we get exception for null and "" in data.Write(…):

using System;
using System.IO;
using System.Text;
using Elasticsearch.Net;
using FluentAssertions;
using Serilog.Sinks.Elasticsearch.Durable;
using Xunit;

namespace Serilog.Sinks.Elasticsearch.Tests;

public class ElasticsearchPayloadReaderTests : IDisposable
{
    private readonly string _tempFileFullPath;

    public ElasticsearchPayloadReaderTests()
    {
        _tempFileFullPath = Path.Combine(Path.GetTempPath(), $"{Guid.NewGuid():N}-{20000101}.json");
    }

    public void Dispose()
    {
        System.IO.File.Delete(_tempFileFullPath);
    }

    [Theory]
    [InlineData("")]
    [InlineData(null)]
    [InlineData(" ")]
    public void ReadPayload_SkipsEmptyLines(string emptyLine)
    {
        // Arrange
        var payloadReader = new ElasticsearchPayloadReader("testPipelineName", "TestTypeName", null,
            (_, _) => "TestIndex", ElasticOpType.Index);
        var lines = new[]
        {
            "line1",
            emptyLine,
            "line2"
        };
        // Important to use UTF8 with BOM if we are starting from 0 position 
        System.IO.File.WriteAllLines(_tempFileFullPath, lines, new UTF8Encoding(true));

        // Act
        var fileSetPosition = new FileSetPosition(0, _tempFileFullPath);
        var count = 0;
        var payload = payloadReader.ReadPayload(int.MaxValue, null, ref fileSetPosition, ref count, _tempFileFullPath);

        // Assert
        // Before the fix, these lines would provide NullReference exception
        var data = PostData.MultiJson(payload);
        data.Write(new MemoryStream(), new ConnectionConfiguration());
    }
}
andrey-kozlov-skuvault commented 2 years ago

Addressed in https://github.com/serilog-contrib/serilog-sinks-elasticsearch/pull/415