datalust / seq-tickets

Issues, design discussions and feature roadmap for the Seq log server
https://datalust.co/seq

Skip `coalesce(@m, @mt)` compatibility fix on systems without historical data #2170

Closed nblumhardt closed 5 months ago

nblumhardt commented 5 months ago

Between late 5.1 and early 2020.1, Seq used a storage optimization that skipped storing an event's @m message property in the back-end event store if its value was identical to the @mt message template property.

This saved some storage space, but we quickly noticed that, with sparse deserialization, having to load both @m and @mt in order to inspect an event's message was a poor trade-off. The optimization was reverted, but queries continued to map the Seq @Message property to coalesce(@m, @mt), because events already ingested without @m still needed the fallback.
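To make the trade-off concrete, here is a minimal Python sketch of the idea, not Seq's actual storage code. The function names `store_event`, `coalesce` and `read_message` are hypothetical, and a plain dict stands in for a stored event. It shows why reading the message back needs the `coalesce(@m, @mt)` fallback once @m may have been omitted.

```python
from typing import Optional


def store_event(message: str, template: str) -> dict:
    """Hypothetical writer: omit @m when it is identical to @mt."""
    stored = {"@mt": template}
    if message != template:
        stored["@m"] = message
    return stored


def coalesce(*values: Optional[str]) -> Optional[str]:
    """Return the first argument that is not None, or None if all are."""
    for value in values:
        if value is not None:
            return value
    return None


def read_message(stored: dict) -> Optional[str]:
    """Hypothetical reader: both properties must be consulted."""
    return coalesce(stored.get("@m"), stored.get("@mt"))


# A template with no placeholders renders to itself, so @m is skipped.
plain = store_event("Service started", "Service started")
# A template with a placeholder renders to a different string, so @m is kept.
rendered = store_event("Hello, Alice", "Hello, {Name}")
```

In this sketch the reader has to look at two properties for every event, even when almost all events carry @m, which is the cost described above.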

You can see this by running:

explain lower select @Message from stream limit 1

In 2024.3, we'll finally retire this compatibility mode, using a migration that checks the bounds of the event store and compares them with the date on which the optimization was reverted (knowable because we timestamped this using a migration in 2020). Systems holding data ingested by an affected version will continue to use the coalesce(@m, @mt) compatibility mode, while the rest (likely the vast majority, now that we're four years down the track) will map @Message to plain @m.
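The decision the migration makes could be sketched roughly as follows. This is an illustrative assumption, not the real migration: `choose_message_mapping` is a hypothetical name, the timestamps are made up, and treating an empty event store as unaffected is my reading of "systems without historical data".

```python
from datetime import datetime, timezone
from typing import Optional


def choose_message_mapping(
    oldest_event: Optional[datetime],
    optimization_reverted_at: datetime,
) -> str:
    """Pick the expression that @Message should map to.

    oldest_event is the lower bound of the event store (None if empty);
    optimization_reverted_at is the timestamp recorded when the
    optimization was switched off.
    """
    if oldest_event is not None and oldest_event < optimization_reverted_at:
        # Some events may have been stored without @m: keep the fallback.
        return "coalesce(@m, @mt)"
    # Every stored event was written with @m present.
    return "@m"


# Illustrative timestamps only.
reverted = datetime(2020, 6, 1, tzinfo=timezone.utc)
old_store = choose_message_mapping(datetime(2019, 11, 5, tzinfo=timezone.utc), reverted)
new_store = choose_message_mapping(datetime(2023, 2, 14, tzinfo=timezone.utc), reverted)
empty_store = choose_message_mapping(None, reverted)
```

Comparing only the oldest stored event against the revert date keeps the check cheap: no events need to be scanned to decide which mapping is safe.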

On queries with predicates based on @Message, we see a speed-up of approximately 5% after applying this.