elastic / ecs-dotnet

https://www.elastic.co/guide/en/ecs-logging/dotnet/current/setup.html
Apache License 2.0
114 stars 58 forks source link

[BUG] EcsTextFormatter 8.6.0 throws an exception when Enrich.WithElasticApmCorrelationInfo() is used #324

Closed ventii closed 1 year ago

ventii commented 1 year ago

ECS integration/library project(s) (e.g. Elastic.CommonSchema.Serilog): Elastic.CommonSchema.Serilog 8.6.0 Elastic.Apm.SerilogEnricher 8.6.0 Elastic.Apm.NetCoreAll 1.22.0

ECS schema version (e.g. 1.4.0): 8.6.0

ECS .NET assembly version (e.g. 1.4.2): 8.6.0

Elasticsearch version (if applicable): 8.8.2

.NET framework / OS: .net 7 / Windows 10 EDIT: this also happens running on a .net7 alpine image

Description of the problem, including expected versus actual behavior:

Expected behaviour: Logs with serilog in ECS format Actual behaviour: No logs generated (because of silent exceptions explained below)

When logging through Serilog sinks with EcsTextFormatter and APM enricher WithElasticApmCorrelationInfo(), the EcsTextFormatter throws an exception when converting the log while trying to read the service details.

This appears to be because Agent.Config.ServiceNodeName is null, and is being added as null to the event property ElasticApmServiceNodeName rather than not added at all by the enricher extension WithElasticApmCorrelationInfo().

MethodGetService() of LogEventConverter will try to find property ElasticApmServiceNodeName which is set as a null value, and therefore fails with a null reference error. LogEventConverter line 80:

image

Steps to reproduce:

  1. Create .net 7 app with packages Elastic.CommonSchema.Serilog, Elastic.Apm.SerilogEnricher, Elastic.Apm.NetCoreAll
  2. add a serilog Console Sink with EcsTextFormatter, and add Enricher WithElasticApmCorrelationInfo()
  3. add APM host.UseAllElasticApm();

image

  1. create a test log _logger.LogInformation("my name is {name}", "joe");

At this point, you will see that the log will not be printed to console. If we add ourselves the value of ServiceNodeName through config ElasticApm:ServiceNodeName like the following and do the test again, we will manage to print the log:

image

What is the reason for Agent.Config.ServiceNodeName being null?

nathangiuliani commented 1 year ago

I think we are also experiencing this issue. Interestingly it's fine with our app running on Windows, but this problem is present on Linux including debugging in WSL. Also interestingly it's not all log messages that fail - we get a handful of startup messages before this happens, and then all log entries are dropped.

ventii commented 1 year ago

@nathangiuliani you probably manage to get startup logs because at that Point APM is not initialised, meaning the ECS formatter will not even try to append the APM related fields to the log document

nathangiuliani commented 1 year ago

@ventii fair point, you're right there.

With some more playing i actually got this working - the problem doesn't seem to be in this library, but one of the other elastic or apm libraries that I originally updated at the same time as this one. I'd imagine it's the APM libraries which we have at 1.19.

nathangiuliani commented 1 year ago

Update, it happens when Elastic.Apm.SerilogEnricher is updated from 1.5.3 to 8.6.0.

ventii commented 1 year ago

@nathangiuliani yes exactly, as that package will enrich the log with 3 new fields

however, ElasticApmServiceNodeName is being saved as null value.

This in turn causes Elastic.CommonSchema.Serilog to fail when updated to 8.6.0, as it tries to read and parse ElasticApmServiceNodeName, which is saved, but as null.

Mpdreamz commented 1 year ago

I believe this was fixed in: https://github.com/elastic/ecs-dotnet/pull/306 will do a 8.6.1 release today.

Many thanks for reporting this!