Open nblumhardt opened 2 years ago
Yesterday, I went into this issue of the batch limit reached on the seq server side. It is quite easy to restrict thè size per message, but impossible to control the batch size, except if we send 1 by 1 message, which I think is completely inefficient.
I could see that version 4 of serilog removed PeriodicBatchingSink
, maybe this makes this CR possible now?
Definitely keen to explore this again; thanks for the nudge.
Do have by chance any feedback for this enhancement ? (this becomes quite important for us.)
Hi!
We've looked back into this a couple of times, but we're not really happy with any of the implementation options at present.
Even duplicating the batching infrastructure to add this natively, there are some significant performance considerations to look at because the buffer between foreground and background threads will need to carry large string allocations that the sink can avoid making as things stand today.
If an additional batchSizeLimitBytes
argument were added today, I think the best we could do would be to break any oversize batches into chunks, and to send each chunk in a separate request. Once the first chunk in a batch was sent successfully, we'd have to ignore failures trying to send any later chunks, otherwise we'd risk duplicating logs (the Serilog batching infrastructure doesn't support partial retries).
This could be a stepping stone to a better solution in the future - would it move things forwards enough for you for this to be worthwhile, @spointeau?
Hi,
I definitely cannot agree on potentially loosing logs. The best approach is to change the logic when making the batches: in addition to the counting, the batch total size should be less or equal than the specified size. I did not look in code specifically but I don't see any other option. Do you think it is feasable?
Thanks for your reply. It's definitely not out-of-the-question, but I don't think there's a clear/straightforward way to proceed right now. Perhaps forking, trimming down, and modifying the sink to create an experimental/companion implementation would be a possibility?
Hi, When you propose of forking, trimming down etc, is it for me to have a look right ? Please could you just point me on the source file(s) where I can (try to) understand this logic of batching to the background thread? I am not sure if I will be helpful to you but I will try to better understand the whole change. I will do my best in any case.
Thank you very much, I had a look and I definitely think that the change should happen in BatchingSink.cs. I will try to submit you a basic prototype asap. The only thing (for now) that I am missing is how to calculate the size of an event (for the batch to not exceed a specified size)?
Hi! That's the tricky part, and why I think it'd need to be built as a separate implementation for the time being :+1:
Since determining the event's size depends on its UTF-8 encoded, Seq-formatted representation, I think the trick would be to:
BatchingSink
into the Seq sink projectbyte[]
sIReadOnlyCollection<byte[]>
batches instead of the current LogEvent
-based onesTo make the whole process easier, deleting the other Seq sink variations from the forked repo would save having to update them (the "trimming down" bit).
I guess we are in a catch-22, because the batching is made on serilog but doing it by the size depends on the seq sink formatter. I really missed the elephant in the room. So there is no other choice than to do the sub batching in the seq sink, as you proposed initially.
I am still missing the following: How you do retry if the ingest is failing ? Could you explain a bit more the foreground and background threads ? is the batching done on the foreground thread and the seq sink on the background thread ? I don't see it clearly.
:+1: either that, or copy the batching code from Serilog into the Seq sink and modify it there (using this instead of Serilog's native batching). I'm not so keen on maintaining another fork of the code, and the result would have different performance characteristics, but spiking it out and testing/profiling might provide some data to push us one way or the other.
https://github.com/serilog/serilog/blob/dev/src/Serilog/Core/Sinks/Batching/BatchingSink.cs#L120 is the key to how retries work; the _currentBatch
collection isn't cleared when a batch fails, so the next time around the loop (after the inter-batch delay) the same events will be tried again.
On the server-side, Seq imposes two size limits on incoming events:
POST
body.On the client side, this sink currently uses an event size limit, but batches are limited by count (
batchPostingLimit
) rather than byte size. This makes it difficult to reliably align the client and server settings, leading to potential batch loss when oversize batches are discarded, or poor batch utilization if the batch size has to be set to an excessively conservative low value.A future version of this sink should replace
batchPostingLimit
with a size-based control.The main caveat that will complicate this is the dependency on
PeriodicBatchingSink
, which implements the current count-based logic. The dependency onPeriodicBatchingSink
will likely need to be broken and the functionality replaced in order to make the proposed change here.