Closed fabiokung closed 11 years ago
/cc @bgentry @JacobVorreuter @archaelus
Sorry, the PR is not ready yet. Reading more of the code, I realized that this would require some big changes to the way log lines are read.
Right now it is line oriented (which requires a stream of potentially multiple packets). Very long lines can easily fill buffers though.
With `unixdgram`, each log would be limited to the size of a datagram, but logs could potentially be truncated. What would be better?
Why not make this a different program?
I think the log-shuttle project is ideally about three different things: 1) a logger replacement (stdio<->logplex) 2) a syslog<->logplex gateway (to be a companion to syslog-ng on kernel instances) 3) a logplex<->syslog gateway (for clients to run on their own machines to integrate with their existing logging infrastructure)
I would build these as three different programs maybe sharing some libraries.
Apparently `SOCK_DGRAM` sockets will yield `ENOBUFS` for large messages, which will potentially crash processes trying to send very long log datagrams. That seems to be desirable.
Another thing I just realized is that if we switch to be datagram based, we can easily support multi-line logs. Processes can just send datagrams with multiple lines to `log-shuttle`.
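A rough sketch of the multi-line idea, assuming Go's `unixgram` network and a throwaway socket path: one `Write` on the sender becomes one read on the receiver, newlines included. `sendAndReceive`, the `log.sock` file name, and the 64 KiB buffer are all hypothetical choices for this example.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"path/filepath"
	"time"
)

// sendAndReceive binds a unixgram socket at a temporary path, sends payload
// to it as a single datagram, and returns what a single read yields.
func sendAndReceive(payload string) (string, error) {
	dir, err := os.MkdirTemp("", "dgram-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	addr := &net.UnixAddr{Name: filepath.Join(dir, "log.sock"), Net: "unixgram"}
	server, err := net.ListenUnixgram("unixgram", addr)
	if err != nil {
		return "", err
	}
	defer server.Close()

	client, err := net.DialUnix("unixgram", nil, addr)
	if err != nil {
		return "", err
	}
	defer client.Close()

	// The whole payload travels as one datagram, so no delimiter is needed.
	if _, err := client.Write([]byte(payload)); err != nil {
		return "", err
	}

	buf := make([]byte, 64*1024)
	if err := server.SetReadDeadline(time.Now().Add(time.Second)); err != nil {
		return "", err
	}
	n, _, err := server.ReadFromUnix(buf)
	if err != nil {
		return "", err
	}
	return string(buf[:n]), nil
}

func main() {
	got, err := sendAndReceive("first line\nsecond line")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q\n", got)
}
```

The receive buffer size matters here: a datagram larger than the buffer passed to the read is cut off at the buffer length, which is the truncation risk mentioned earlier.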
> I would build these as three different programs maybe sharing some libraries.
I am sure that @ryandotsmith is thinking about it, but this seems to be a different problem (or am I missing something?). Even if they were different programs, we would still need to decide how we read log lines from sockets/pipes. Datagram or stream/line based?
The difference is kinda the input format. The logger program reads messages delimited by new lines (you could also have a byte-count-framed syslog input mode I guess), the Datagram thing reads messages in datagrams.
If you are connecting syslog-ng to this program, then I would ask for byte-count-framed messages in a stream: it lets you do multi-line messages (something hermes uses). (And multi-line messages are the future)
I am going to close this PR for now. Not saying that the idea is dead, just not going to move on it in the short term.
The `unix` connection type in `net.Listen(...)` means a unix socket of type `SOCK_STREAM` (source code here).
By its definition, `SOCK_STREAM` avoids duplication and loss, and guarantees the order of messages. It can generate `SIGPIPE`, `ETIMEDOUT` and other errors described here.
`SOCK_DGRAM` is a simpler type of socket (more similar to a simple message queue). It doesn't provide any strong guarantees (messages can be delivered out of order, dropped or duplicated), but seems to be good enough for our logging purposes. Plus, it would avoid problems in the log pipeline affecting or blocking dynos. It's analogous to UDP sockets: fire and forget.
In practice, it seems that the implementation on Linux will not deliver messages out of order anyway.
More info: unix(7) and socket(2).