fluent / fluent-logger-node

A structured logger for Fluentd (Node.js)
Apache License 2.0

Queue Message until Reconnection Complete #93

Open st3xupery opened 6 years ago

st3xupery commented 6 years ago

I am using fluent-logger-node in a Docker Swarm environment: the cluster's services run within their own private network, and one Node service communicates with a Fluentd container inside that network.

The problem I face with every kind of socket (e.g. database connections) is that sockets seem to close after a period of time, and the service only notices on its next failed attempt to use the connection. I assume it's a Docker Swarm problem, but that's beside the point. When the socket closes, the error I get is ECONNRESET, which I have learned means the other side of the TCP conversation closed the connection abruptly.

For the pooled DB connections I have access to the socket object and enable the keepAlive setting on every pool, but even if I forked this repo and exposed the socket, I would rather not rely on that technique. It seems more elegant to queue emitted messages until a connection is re-established.
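The queue-until-reconnected idea could look roughly like the sketch below. It is not part of fluent-logger-node: `QueueingEmitter`, `maxQueue`, and `flush()` are hypothetical names, and the sketch assumes a sender whose `emit(label, record, callback)` passes an error to the callback on failure and nothing on success.

```javascript
// Hypothetical wrapper, not a fluent-logger-node API.
// Assumption: sender.emit(label, record, callback) calls callback(err)
// on failure and callback() on success.
class QueueingEmitter {
  constructor(sender, maxQueue = 1000) {
    this.sender = sender;
    this.maxQueue = maxQueue; // cap so a long outage cannot exhaust memory
    this.queue = [];
    this.flushing = false;    // true while one message is in flight
  }

  // Buffer the message, then try to drain the buffer.
  emit(label, record) {
    if (this.queue.length >= this.maxQueue) {
      this.queue.shift(); // drop the oldest buffered message
    }
    this.queue.push({ label, record });
    this.flush();
  }

  // Send buffered messages one at a time, in order.
  flush() {
    if (this.flushing || this.queue.length === 0) return;
    this.flushing = true;
    const item = this.queue.shift();
    this.sender.emit(item.label, item.record, (err) => {
      this.flushing = false;
      if (err) {
        // Put the message back at the head; it is retried on the next flush().
        this.queue.unshift(item);
        return;
      }
      this.flush(); // success: continue with the next buffered message
    });
  }
}
```

A failed message stays at the head of the queue, so ordering is preserved. Nothing retries on its own here: `flush()` runs on the next `emit()`, or it can be called from whatever reconnect signal is available (a timer, or a connect event if the sender exposes one).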

I've looked at the .on error event, which I assume fires when a connection fails, but it doesn't seem sensible to bind an error listener for every message I emit, since I don't appear to be able to remove those listeners in a success callback.

Also, does the callback argument of the emit method receive an error when one occurs? Or does the callback fire regardless of whether the message succeeded, without passing any value?

I was also thinking of setting the timeout to something short so that the socket is refreshed regularly, before the Docker Swarm network shuts it down. However, I initially had the value set to 3.0 and my emits were still hit with ECONNRESET.

Any advice?

okkez commented 6 years ago

I'm not familiar with the docker swarm environment.

> Also, does the callback argument of the emit method receive an error when one occurs? Or does the callback fire regardless of whether the message succeeded, without passing any value?

It depends on the situation. For example, the callback argument will include an error message when ECONNREFUSED occurs.
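Given that the callback can carry an error, one way to avoid binding a listener per message is to route every emit through a single shared failure handler. The following is only a sketch: `emitChecked` and `stubSender` are hypothetical, and it assumes the error arrives as the callback's first argument with a Node-style `code` property.

```javascript
// Hypothetical helper, not a fluent-logger-node API.
// Assumption: on failure the callback's first argument is an Error
// (e.g. with code 'ECONNREFUSED'); on success it is absent.
function emitChecked(sender, label, record, onFailure) {
  sender.emit(label, record, (err) => {
    if (err) {
      // Hand the error and the original message to one shared handler.
      onFailure(err, { label, record });
    }
  });
}

// Stand-in sender that always fails, used only to exercise the helper.
const stubSender = {
  emit(label, record, callback) {
    const err = new Error('connect ECONNREFUSED');
    err.code = 'ECONNREFUSED';
    callback(err);
  },
};

// Collect failures in one place instead of one listener per message.
const failures = [];
emitChecked(stubSender, 'app.access', { path: '/' }, (err, message) => {
  failures.push({ code: err.code, message });
});
console.log(failures.length, failures[0].code);
```

The failure handler receives the original message, so it can log it, drop it, or store it for a later retry.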

> I was also thinking of setting the timeout to something short so that the socket is refreshed regularly, before the Docker Swarm network shuts it down. However, I initially had the value set to 3.0 and my emits were still hit with ECONNRESET.

Do you need an option for periodic reconnection, or something similar?

> Any advice?

The queue in fluent-logger-node does not improve robustness; it exists to support the Fluentd forward protocol v1.

BTW, I will consider how to handle unstable network environments (where internal IP addresses change often), such as Docker Swarm or k8s pods.