datalust / seq-tickets

Issues, design discussions and feature roadmap for the Seq log server
https://datalust.co/seq

Ingesting a stream of events via HTTP over a long period of time #2013

Closed bekir-ozturk closed 12 months ago

bekir-ozturk commented 12 months ago

Background

Hi there :wave: I am working on a logging and ingestion library for Seq with a small feature set but high performance. I am using a plain HTTP connection to ingest the logs as described here. I do very limited buffering at the library level and would instead like to rely on the TCP send buffer to queue any logs that are not yet delivered. Every call to my 'Log' method writes directly to the socket stream underneath the HTTP request.

Since I don't buffer at the application level, I can't put multiple log events into a single request (with each event on its own line) and send it to Seq. Each log has to be sent separately, which means I have to send request headers for every single log event.
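For contrast, this is roughly what a buffered client would do when batching events into one request body. It is a minimal sketch, assuming Seq's newline-delimited compact JSON (CLEF) payload format; the function name and the sample events are hypothetical:

```python
import json


def build_batch_body(events: list[dict]) -> bytes:
    """Join events into one request body, one compact JSON document per line."""
    lines = (json.dumps(event, separators=(",", ":")) for event in events)
    return "".join(line + "\n" for line in lines).encode("utf-8")


if __name__ == "__main__":
    # Hypothetical events; a real client would fill these in from Log calls.
    body = build_batch_body([
        {"@t": "2024-01-01T00:00:00Z", "@m": "First event"},
        {"@t": "2024-01-01T00:00:01Z", "@m": "Second event"},
    ])
    print(body.decode("utf-8"), end="")
```

The headers are paid once per batch here, but the events have to sit in application memory until the batch is sent, which is exactly the buffering I want to avoid.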

To avoid the cost of making a full-sized HTTP request with every log event, I decided to use Transfer-Encoding: chunked instead. This ends up sending Seq the data in the following way:
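For illustration, HTTP/1.1 chunked framing puts each event in its own chunk: the payload length in hexadecimal, CRLF, the payload, CRLF, with a zero-length chunk ending the body. A minimal Python sketch, assuming one newline-terminated JSON event per chunk (the helper names and the sample event are hypothetical, not taken from the library):

```python
import json


def encode_chunk(payload: bytes) -> bytes:
    """Frame one payload as an HTTP/1.1 chunk: hex length, CRLF, data, CRLF."""
    return f"{len(payload):X}".encode("ascii") + b"\r\n" + payload + b"\r\n"


# The zero-length chunk that terminates a chunked request body.
LAST_CHUNK = b"0\r\n\r\n"


def event_chunk(event: dict) -> bytes:
    """Serialize one event as a newline-terminated JSON line and frame it."""
    line = json.dumps(event, separators=(",", ":")).encode("utf-8") + b"\n"
    return encode_chunk(line)


if __name__ == "__main__":
    # Hypothetical event; each Log call would write one such chunk to the socket.
    print(event_chunk({"@t": "2024-01-01T00:00:00Z", "@m": "Hello"}))
    # The request only completes once the terminating chunk is written.
    print(LAST_CHUNK)
```

The request headers go out once, and the request stays open until `LAST_CHUNK` is written.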

The problem

This all looks good, but there is a big issue: Seq won't ingest any of the data I send until I complete the request. So all the logs that were delivered to Seq are at risk of being lost if my application crashes and I'm unable to close the connection properly. It is also not possible to see or query any of the logs in Seq until the application finally closes the connection. Finally, I get no feedback from Seq during this time, as the response is only sent once, at the very end.

Questions

Can ingestion via HTTP be improved:

nblumhardt commented 12 months ago

Hi @bekir-ozturk! Sounds like a really cool project. Unfortunately it won't work well with Seq as it's implemented today, because timeouts and DoS prevention measures built into the server may end up dropping your open connections unexpectedly.

WebSockets might be a closer protocol fit for this; Seq implements streaming out via WebSocket today, but streaming in also seems reasonable and we could definitely consider it for a future release. Any thoughts?

Another (unfortunately much more complicated) option would be to use OTLP and gRPC, which Seq already supports. Although the abstraction is request-response, I believe that under the hood, HTTP/2 maintains the open connection and streams requests in a similar manner to what your client is doing explicitly.

bekir-ozturk commented 12 months ago

Hi @nblumhardt, Thanks for the quick reply!

WebSockets would be great. Is there a ticket that I can use to track the status of such a feature (even if it might get rejected)?

gRPC is probably a good solution to this in many cases, but I want to avoid it for this specific project due to its complexity as you also pointed out. I am not familiar with OTLP, but it seems similar to gRPC in this regard.

Thanks!

nblumhardt commented 12 months ago

Thanks for your reply. Converting this ticket to a discussion is probably the best way to track it - we try to keep the main issue tracker fairly lean (there are a lot of things we could potentially consider doing :-)) so for more speculative things, we tend to use discussions to keep the conversation alive.

I'll do the conversion now :+1: