markmcdowell / NLog.Targets.ElasticSearch

NLog target for Elasticsearch
MIT License
176 stars 89 forks source link

System.InvalidOperationException: Collection was modified #113

Closed andronachev closed 4 years ago

andronachev commented 4 years ago

Hi,

We started noticing this kind of exception in our .NET Core project:

{
  "ClassName": "NLog.NLogRuntimeException",
  "Message": "Exception occurred in NLog",
  "Data": {
    "NLog.ExceptionLoggedToInternalLogger": true
  },
  "InnerException": {
    "ClassName": "System.InvalidOperationException",
    "Message": "Collection was modified; enumeration operation may not execute.",
    "Data": {
      "NLog.ExceptionLoggedToInternalLogger": true
    },
    "InnerException": null,
    "HelpURL": null,
    "StackTraceString": "   at System.Collections.Generic.List`1.Enumerator.MoveNextRare()\r\n   at System.Collections.Generic.List`1.Enumerator.MoveNext()\r\n   at NLog.Targets.ElasticSearch.ElasticSearchTarget.FormPayload(ICollection`1 logEvents)\r\n   at NLog.Targets.ElasticSearch.ElasticSearchTarget.SendBatch(ICollection`1 logEvents)",
    "RemoteStackTraceString": null,
    "RemoteStackIndex": 0,
    "ExceptionMethod": null,
    "HResult": -2146233079,
    "Source": "System.Private.CoreLib",
    "WatsonBuckets": null
  },
  "HelpURL": null,
  "StackTraceString": "   at NLog.LoggerImpl.<>c__DisplayClass1_0.<Write>b__1(Exception ex)\r\n   at NLog.Internal.SingleCallContinuation.Function(Exception exception)\r\n   at NLog.Targets.ElasticSearch.ElasticSearchTarget.SendBatch(ICollection`1 logEvents)\r\n   at NLog.Targets.ElasticSearch.ElasticSearchTarget.Write(AsyncLogEventInfo logEvent)\r\n   at NLog.Targets.Target.WriteAsyncThreadSafe(AsyncLogEventInfo logEvent)\r\n   at NLog.Targets.Target.WriteAsyncLogEvent(AsyncLogEventInfo logEvent)\r\n   at NLog.LoggerImpl.WriteToTargetWithFilterChain(Target target, FilterResult result, LogEventInfo logEvent, AsyncContinuation onException)\r\n   at NLog.LoggerImpl.Write(Type loggerType, TargetWithFilterChain targetsForLevel, LogEventInfo logEvent, LogFactory factory)\r\n   at NLog.Logger.Log(Type wrapperType, LogEventInfo logEvent)\r\n   at NLog.Extensions.Logging.NLogLogger.Log[TState](LogLevel logLevel, EventId eventId, TState state, Exception exception, Func`3 formatter)\r\n   at Microsoft.Extensions.Logging.Logger.Log[TState](LogLevel logLevel, EventId eventId, TState state, Exception exception, Func`3 formatter)",
  "RemoteStackTraceString": null,
  "RemoteStackIndex": 0,
  "ExceptionMethod": null,
  "HResult": -2146233088,
  "Source": "NLog",
  "WatsonBuckets": null
}

Which I am not sure if it is a bug in the NLog internal library or something on our side ... but I have seen some other tickets open, related to this exception, in the past (4-5 years ago, and the tickets are closed, so I doubt we have the same issue as those).

We are using the following packages:

`

`

In a .NET Core 2.2 project, which is hosted in an Azure Function.

I'd like to mention one thing; that there is some parallel threading going on in our solution (i.e. I am logging the execution of our function on a parallel thread)... I am not sure if that could cause the issue (because this type of exception is something I've seen in the past where some sort of collection was not thread safe). However I would assume the ILogger framework provided in .NET Core is thread safe

Thoughts ?

Vlad

p.s. I initially posted this issue in the Nlog repository here: https://github.com/NLog/NLog/issues/3804 but I should have put it here since the stack trace ends in the NLog.Targets.ElasticSearch namespace..

markmcdowell commented 4 years ago

Hey, I don't see anything immediately that would cause it. It's quite a simple target which loops through the logevents and I don't think they should change.

Could you turn on https://github.com/NLog/NLog/wiki/Internal-Logging and see if it's logging out anything useful?

andronachev commented 4 years ago

Hey, I don't see anything immediately that would cause it. It's quite a simple target which loops through the logevents and I don't think they should change.

Could you turn on https://github.com/NLog/NLog/wiki/Internal-Logging and see if it's logging out anything useful?

I enabled internal logging to text file. I will come back on this as we haven't got the error in about a week now, so waiting for it to happen after this

Thanks

snakefoot commented 4 years ago

I can see these foreach-loops for collections in NLog.Targets.ElasticSearch ver. 5.1.0:

https://github.com/markmcdowell/NLog.Targets.ElasticSearch/blob/815d9d70d74d98210266cb4710a16a4e4551f5f8/src/NLog.Targets.ElasticSearch/ElasticSearchTarget.cs#L215-L217

The logEvents-collections comes from a newly created array so very sure this is not the right one.

https://github.com/markmcdowell/NLog.Targets.ElasticSearch/blob/815d9d70d74d98210266cb4710a16a4e4551f5f8/src/NLog.Targets.ElasticSearch/ElasticSearchTarget.cs#L233

The Fields-collection will only be modified before target is initialized, and not while actively writing. BUT if you have custom user-code that modifies this collection then you will be hurt.

https://github.com/markmcdowell/NLog.Targets.ElasticSearch/blob/815d9d70d74d98210266cb4710a16a4e4551f5f8/src/NLog.Targets.ElasticSearch/ElasticSearchTarget.cs#L252

The Properties-collection returns a dictionary-enumerator, so it cannot be be case. Since the exception says List1.Enumerator.MoveNextRare()`.

So my money is on some random logic in your application that modifies the Fields-collection on the Target at runtime. But please investigate more, maybe include PDB-symbol-files so getting proper source-code-line-numbers.

andronachev commented 4 years ago

@snakefoot thanks very much for your comment - indeed, we are (were) modifying the Fields collection at runtime - it was at the beginning of the invocation of the Azure Function and I didn't think it would harm it - seems like a function instance is called concurrently and then this happens, of course.

The purpose was to add to the Fields collection, the "InvocationId" which I only get at runtime when the function is called. I removed that code and will search a different way of doing it. I will close this ticket.

Thanks, Vlad

snakefoot commented 4 years ago

One option could be use the NLog MDLC and then do this:

<target xsi:type="ElasticSearch">
  <field name="InvocationId" layout="${mdlc:InvocationId}" />
</target>
using (MappedDiagnosticsLogicalContext.SetScoped("InvocationId", Guid.NewGuid()))
{
    // "InvocationId" is present in current context
}

Alternative you can make use of ${activityid} and assign the InvocationId to Trace.CorrelationManager.ActivityId