elastic / apm-agent-dotnet

https://www.elastic.co/guide/en/apm/agent/dotnet/current/index.html

[BUG] The APM agent endlessly sends “event exceeded the allowed size” error to the APM Server #2448

Closed: OlegUfaev closed this issue 1 month ago

OlegUfaev commented 2 months ago

APM Agent version

1.29.0

Environment

Operating system and version: Linux, Windows

.NET Framework/Core name and version: .NET 8

Application Target Framework(s): .NET 8

Describe the bug

When the APM Agent sends an event that exceeds the max_event_size limit set on the APM Server (307200 bytes by default), the APM Server returns the error "event exceeded the allowed size".

The agent logs the exception and creates a new Error object, which is itself very large because it includes ApmServerResponseContent. This Error is then added to the queue of events to be sent to the APM Server.

https://github.com/elastic/apm-agent-dotnet/blob/31cb8cb6a9f065435838231cfcf098bc670f93e1/src/Elastic.Apm/Report/PayloadSenderV2.cs#L421-L431

Delivering this Error to the APM Server fails again with "event exceeded the allowed size", so yet another Error object is created and added to the queue. The cycle repeats indefinitely.

To Reproduce

Steps to reproduce the behavior:

  1. Get source code: https://github.com/elastic/apm-agent-dotnet/tree/v1.29.0
  2. Open the WebApiExample project
  3. In appsettings.json, add the required ElasticApm configuration for your APM Server instance (ServerUrl, SecretToken, etc.)
  4. Replace the WeatherForecastController implementation (see code snippet below)
  5. Launch the project and open the link in your browser: https://localhost:64661/WeatherForecast
  6. The error "event exceeded the allowed size" will appear in the log every 10 seconds

```csharp
using System.Diagnostics;
using Microsoft.AspNetCore.Mvc;

namespace WebApiExample.Controllers;

[ApiController]
[Route("[controller]")]
public class WeatherForecastController : ControllerBase
{
    private static readonly string[] Summaries =
    [
        "Freezing", "Bracing", "Chilly", "Cool", "Mild", "Warm", "Balmy", "Hot", "Sweltering", "Scorching"
    ];

    [HttpGet(Name = "GetWeatherForecast")]
    public IEnumerable<WeatherForecast> Get()
    {
        // We do this to exceed the limit set in max_event_size (307200 bytes):
        // 32 tags of 10000 characters each come to 320000 bytes of tag values alone.
        using var activity = new ActivitySource("test").StartActivity();
        for (var i = 1; i <= 32; i++)
        {
            activity?.AddTag("key" + i, new string('1', 10000));
        }

        return Enumerable.Range(1, 5)
            .Select(index => new WeatherForecast
            {
                Date = DateOnly.FromDateTime(DateTime.Now.AddDays(index)),
                TemperatureC = Random.Shared.Next(-20, 55),
                Summary = Summaries[Random.Shared.Next(Summaries.Length)]
            })
            .ToArray();
    }
}
```
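
A quick size check shows why this controller trips the limit. The sketch below counts only the UTF-8 bytes of the 32 tag values. The real serialized event is larger still, since it also carries the tag keys, JSON punctuation, and the rest of the event. PayloadSizeCheck and TagValueBytes are names made up for this illustration.

```csharp
using System;
using System.Text;

public static class PayloadSizeCheck
{
    // Default APM Server limit cited in the issue.
    public const int MaxEventSize = 307_200;

    // UTF-8 bytes taken up by the tag values alone (keys and JSON overhead are ignored).
    public static int TagValueBytes(int tagCount, int valueLength)
    {
        var value = new string('1', valueLength);
        return tagCount * Encoding.UTF8.GetByteCount(value);
    }

    public static void Main()
    {
        var bytes = TagValueBytes(32, 10_000);
        Console.WriteLine($"{bytes} bytes of tag values, limit {MaxEventSize}");
        Console.WriteLine(bytes > MaxEventSize ? "exceeds limit" : "within limit");
    }
}
```

With 32 tags of 10000 ASCII characters each, the values alone come to 320000 bytes, which is already above the 307200-byte default.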

Expected behavior

The APM Agent will log an exception raised when delivering events to the APM Server, but will not create a new Error that will cause the same delivery problems.
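
The behavior described above can be sketched as a sender that treats delivery failures as log-only and never feeds them back into its own queue. This is a simplified model under stated assumptions, not the agent's actual code and not the change made in the eventual fix: QueueingSender, the send delegate, and the string-typed events are all hypothetical stand-ins for the real payload sender.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified model of a queueing sender; not the real PayloadSenderV2.
public sealed class QueueingSender
{
    private readonly Queue<string> _queue = new();
    private readonly Func<string, bool> _send; // returns false when the server rejects the event

    public List<string> Log { get; } = new();

    public QueueingSender(Func<string, bool> send) => _send = send;

    public int Pending => _queue.Count;

    public void Enqueue(string serializedEvent) => _queue.Enqueue(serializedEvent);

    // Sends everything currently queued. A rejected event is logged (length only)
    // and dropped; no new error event is enqueued, so a failure cannot trigger
    // another send attempt.
    public void Flush()
    {
        while (_queue.Count > 0)
        {
            var item = _queue.Dequeue();
            if (!_send(item))
                Log.Add($"Event of {item.Length} chars rejected by server; dropped.");
        }
    }
}

public static class LoopGuardDemo
{
    public static void Main()
    {
        const int maxEventSize = 307_200;
        // Stand-in for the server: accept only events within the size limit.
        var sender = new QueueingSender(e => e.Length <= maxEventSize);

        sender.Enqueue(new string('1', 320_000)); // oversized, will be rejected
        sender.Enqueue("small event");
        sender.Flush();

        Console.WriteLine($"pending={sender.Pending}, logged={sender.Log.Count}");
    }
}
```

The key design choice is that the failure path writes only to the log, so the queue drains even when an event is rejected.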

Alternatively, the agent could avoid logging the content of serialized events. That content can run to hundreds of kilobytes or even megabytes, which makes logging it very expensive.
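
If some of that content still needs to appear in the log, capping its length keeps each entry bounded. A minimal sketch, assuming a hypothetical LogTruncation.Truncate helper and an arbitrary 1024-character cap; neither comes from the agent.

```csharp
using System;

public static class LogTruncation
{
    // Returns the text unchanged when it fits; otherwise returns the first maxChars
    // characters followed by a note saying how many characters were cut.
    public static string Truncate(string text, int maxChars)
    {
        if (text is null) return string.Empty;
        if (maxChars < 0) throw new ArgumentOutOfRangeException(nameof(maxChars));
        if (text.Length <= maxChars) return text;
        return text.Substring(0, maxChars) + $"... [{text.Length - maxChars} more chars truncated]";
    }

    public static void Main()
    {
        var responseContent = new string('1', 320_000); // stand-in for a huge server response
        Console.WriteLine(Truncate(responseContent, 1024));
    }
}
```

The appended note records how much was dropped, so the original size is still visible in the log without paying for the full payload.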

Actual behavior

The APM Agent repeatedly and unsuccessfully tries to send the "event exceeded the allowed size" error information to the APM Server.

OlegUfaev commented 1 month ago

@Mpdreamz, please look into this bug

Mpdreamz commented 1 month ago

Thanks for the nudge, @OlegUfaev. This seems to have dropped off my queue.

Opened https://github.com/elastic/apm-agent-dotnet/pull/2460 to address this.