serilog-contrib / Serilog.Sinks.Postgresql.Alternative

Serilog.Sinks.Postgresql.Alternative is a library to save logging information from https://github.com/serilog/serilog to https://www.postgresql.org/.
MIT License

[Feature] Add "event id" column or object #77

Open lonix1 opened 1 week ago

lonix1 commented 1 week ago

It's a good idea to store a hash of the message template, so one can easily search for similar events.

(By the way, Seq also does this.)

One computes the hash using one of these algorithms:

Then the hash must be persisted:

lonix1 commented 1 week ago

Here is a working implementation using the murmur3 algorithm.

First install package:

dotnet add package murmurhash

EventIdEnricher.cs:

using System.Text;
using Murmur;
using Serilog.Core;
using Serilog.Events;

namespace Serilog.Enrichers;

public sealed class EventIdEnricher : ILogEventEnricher
{

  public const string COLUMN_NAME = "EventId";

  public void Enrich(LogEvent logEvent, ILogEventPropertyFactory propertyFactory)
  {
    ArgumentNullException.ThrowIfNull(logEvent, nameof(logEvent));
    ArgumentNullException.ThrowIfNull(propertyFactory, nameof(propertyFactory));

    var eventId  = ComputeHash(logEvent.MessageTemplate.Text);
    var property = propertyFactory.CreateProperty(COLUMN_NAME, eventId);

    logEvent.AddPropertyIfAbsent(property);
  }

  public static string ComputeHash(string messageTemplate)
  {
    using var algorithm = MurmurHash.Create32();

    var bytes           = Encoding.UTF8.GetBytes(messageTemplate);
    var hash            = algorithm.ComputeHash(bytes);
    //var numericHash   = BitConverter.ToUInt32(hash, 0);       // alternative
    var hexadecimalHash = BitConverter.ToString(hash).Replace("-", "", StringComparison.Ordinal).ToLowerInvariant();

    return hexadecimalHash;
  }

}

EventIdColumnWriter.cs:

using NpgsqlTypes;
using Serilog.Enrichers;
using Serilog.Events;

namespace Serilog.Sinks.PostgreSQL.ColumnWriters;

public sealed class EventIdColumnWriter : ColumnWriterBase
{

  public EventIdColumnWriter() : base(NpgsqlDbType.Text, skipOnInsert:false, order:0) { }

  public EventIdColumnWriter(NpgsqlDbType dbType = NpgsqlDbType.Text, int? order = null) : base(dbType, skipOnInsert:false, order) { }

  public override object GetValue(LogEvent logEvent, IFormatProvider? formatProvider = null)
  {
    ArgumentNullException.ThrowIfNull(logEvent, nameof(logEvent));

    // The enricher may not have run for this event, so fall back to computing the hash here.
    // Unwrap the scalar instead of calling ToString(), which renders string values wrapped in double quotes.
    var eventId =
      logEvent.Properties.TryGetValue(EventIdEnricher.COLUMN_NAME, out var propertyValue)
      && propertyValue is ScalarValue { Value: string existing }
        ? existing
        : EventIdEnricher.ComputeHash(logEvent.MessageTemplate.Text);

    return eventId;
  }

}

Config:

var columns = new Dictionary<string, ColumnWriterBase>
{
  { EventIdEnricher.COLUMN_NAME, new EventIdColumnWriter() },
  // etc...
};

services.AddSerilog((services, config) => config
  .Enrich.With<EventIdEnricher>()
  .WriteTo.PostgreSQL(/* ... */));

lonix1 commented 4 days ago

IGNORE THE PREVIOUS IMPLEMENTATION

This is the one I now use, which is simpler and does not require external dependencies.

Reference hashing algorithm

Reference either Serilog.Formatting.Compact or Serilog.Expressions. If you already use Serilog.AspNetCore, no extra reference is needed, because it depends on the former.

An alternative is to simply copy/paste the algorithm as it is really simple.
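For anyone taking the copy/paste route, here is a minimal sketch of what such a hash could look like. It assumes the library's algorithm is the Jenkins one-at-a-time hash over the template's UTF-16 characters, so verify it against the library source before relying on identical values. TemplateHash and ComputeHex are hypothetical names, not part of Serilog.

```csharp
using System;
using System.Globalization;

// Identical templates always produce identical ids, regardless of property values.
var first  = TemplateHash.ComputeHex("User {UserId} logged in");
var second = TemplateHash.ComputeHex("User {UserId} logged in");
Console.WriteLine(first == second);               // True
Console.WriteLine(TemplateHash.ComputeHex("a"));  // ca2e9442

// Hypothetical stand-in for the library's EventIdHash (assumption: Jenkins one-at-a-time).
public static class TemplateHash
{
  public static uint Compute(string messageTemplate)
  {
    ArgumentNullException.ThrowIfNull(messageTemplate);

    unchecked // arithmetic is meant to wrap around at 32 bits
    {
      uint hash = 0;
      foreach (var c in messageTemplate)
      {
        hash += c;
        hash += hash << 10;
        hash ^= hash >> 6;
      }
      hash += hash << 3;
      hash ^= hash >> 11;
      hash += hash << 15;
      return hash;
    }
  }

  // Eight lowercase hex digits, the same format used by the enricher in this comment.
  public static string ComputeHex(string messageTemplate) =>
    Compute(messageTemplate).ToString("x8", CultureInfo.InvariantCulture);
}
```

With this in place, EventIdHash.Compute in the enricher can be swapped for TemplateHash.Compute and the package reference dropped.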

Create EventIdEnricher.cs

using System.Globalization;
using Serilog.Core;
using Serilog.Events;
using Serilog.Expressions.Compilation.Linq;

namespace Demo;

public sealed class EventIdEnricher : ILogEventEnricher
{

  public const string PROPERTY_NAME = "EventId";

  public void Enrich(LogEvent logEvent, ILogEventPropertyFactory propertyFactory)
  {
    ArgumentNullException.ThrowIfNull(logEvent, nameof(logEvent));
    ArgumentNullException.ThrowIfNull(propertyFactory, nameof(propertyFactory));

    var eventId  = ComputeHash(logEvent.MessageTemplate.Text);
    var property = propertyFactory.CreateProperty(PROPERTY_NAME, eventId);

    logEvent.AddPropertyIfAbsent(property);
  }

  public static string ComputeHash(string messageTemplate)
  {
    ArgumentException.ThrowIfNullOrWhiteSpace(messageTemplate, nameof(messageTemplate));   // declared on ArgumentException (.NET 8+)
    return
      EventIdHash.Compute(messageTemplate)             // compute numeric hash
      .ToString("x8", CultureInfo.InvariantCulture);   // convert to hex string: https://github.com/serilog/serilog-formatting-compact/blob/8472ad8ccb97432ca7efbe78d8bc0eaf61db5356/src/Serilog.Formatting.Compact/RenderedCompactJsonFormatter.cs#L72
  }

}

Optional: create column writer

If you want to write the hash to a dedicated column, reuse the EventIdColumnWriter.cs from the previous implementation, changing EventIdEnricher.COLUMN_NAME to EventIdEnricher.PROPERTY_NAME to match the enricher above. It works, but I don't use it anymore.

One benefit of doing this is that the column can be indexed. But on our system we kept it simple, and the log table has only two columns: a PK and the json-encoded log data.

Configure

services.AddSerilog((services, config) => config
  .Enrich.With<EventIdEnricher>()
  .WriteTo.PostgreSQL(/* ... */));