pinojs / pino-socket

🌲 A transport for sending pino logs to network sockets

backpressure on TCP #4

Closed · mcollina closed this issue 8 years ago

mcollina commented 8 years ago

Currently pino-socket is not handling backpressure over TCP. It's not a big deal, but memory might grow quite a bit.

I'm not sure what we should do in this case; at some point that data will be buffered up, and we probably do not want that to happen.

jsumners commented 8 years ago

Memory is definitely a concern. The app sending logs shouldn't suffer because of the log transport. As someone who uses TCP for logging, which would you rather have:

  1. A known buffer size that drops incoming logs when it is full (the FIFO buffer honors the "First In")
  2. A known buffer size that drops outgoing logs when it is full (the FIFO buffer drops the first log in the queue to add the new one at the end of the queue)

In either case the buffer should be adjustable, both in size and if it's even enabled.

I imagine this scenario really only shows up if you're sending to a remote destination. One solution, that doesn't require re-inventing any wheels here, might be to forward to a local queueing server that will de-queue to the remote destination, e.g. some sort of RabbitMQ setup.

Dropping this here for reference -- https://gist.github.com/skeggse/6802812

mcollina commented 8 years ago

@jsumners I would rather drop things at the beginning of a queue, a "circular buffer" type of implementation.

A solution for the "local queue" is to use something like level, but that is definitely not part of this library.
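
For illustration, a minimal sketch of the "circular buffer" style queue described here: a bounded FIFO that evicts the oldest entry when full. The names (BoundedQueue, maxSize) are hypothetical and not part of pino-socket.

```js
'use strict'

// Hypothetical bounded FIFO: when full, the oldest log line is dropped so
// the newest one can be enqueued (the "drop outgoing" / circular option).
class BoundedQueue {
  constructor (maxSize) {
    this.maxSize = maxSize
    this.items = []
  }

  enqueue (item) {
    if (this.items.length >= this.maxSize) {
      this.items.shift() // evict the oldest entry to make room
    }
    this.items.push(item)
  }

  dequeue () {
    return this.items.shift()
  }

  get length () {
    return this.items.length
  }
}

// Example: only the 1000 most recent log lines are retained while the
// socket cannot drain them.
const queue = new BoundedQueue(1000)
queue.enqueue('{"level":30,"msg":"hello"}\n')
```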

jsumners commented 8 years ago

Try out the code currently in the develop branch. It may be pushing stuff into a queue faster than it can dequeue them, but I'm not sure. I haven't come up with a test that confirms it.

mcollina commented 8 years ago

Avoid using bind(this) on those transports; bind is incredibly slow.

You probably want to do the reverse for https://github.com/jsumners/pino-socket/blob/develop/lib/TcpConnection.js#L62-L68, using tcp-ping first and then connecting.

I don't think that's a correct implementation; IMHO you should have that only inside TCP. Also, the code does not exhaust the queue when a disconnection happens.

jsumners commented 8 years ago

The bind(this) only happens during initial startup. It shouldn't even be noticeable.

The connect method is called to start every connection, including the first one, so doing the probe first would slow down startup. The probe is in place to make sure the connection succeeded. The operation order, as I understand net.Socket events, is (see the sketch after this list):

  1. Create the initial connection
     a. Verify that the connection succeeded
     b. Reconnect if (a) is false
  2. Wait for an end event that signals the remote system died. Repeat (1).
  3. Wait for a close event that signals the local system hung up. Repeat (1).
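
A minimal sketch of that event order using node's net module (hypothetical reconnect delay; this is not the actual lib/TcpConnection.js code):

```js
'use strict'

const net = require('net')

// Illustrative only: the connect / end / close cycle described above.
function connect (host, port) {
  const socket = net.createConnection({ host, port })

  socket.on('connect', () => {
    // (1a) the connection succeeded
  })

  socket.on('error', () => {
    // (1b) the connection attempt failed; a 'close' event will follow
  })

  socket.on('end', () => {
    // (2) the remote system hung up
  })

  socket.on('close', () => {
    // (3) the socket is fully closed (local or remote); repeat (1)
    setTimeout(() => connect(host, port), 1000)
  })

  return socket
}

connect('127.0.0.1', 514)
```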

I thought about passing the queue into the transport and letting the transport handle the queue. That would allow queuing to be a TCP only feature. But I figured that would be overly complex when the queue won't even affect UDP in a significant manner. Am I wrong?

If the queue needs to be wiped when a disconnect occurs, then it would definitely need to be managed by the individual transports. I'm not sold on this, though. In my opinion, the queue should keep accepting messages, to be sent once the connection comes back online. Although it's this part that I don't think is actually happening right now.

I'm very open to suggestions, or code, on how to handle this. This is the scenario I had hoped to avoid by initially designing it to be a simple pass-through.

jsumners commented 8 years ago

Uh, yeah, I realized what you're saying about tcp-ping. I decided it's completely pointless and have removed it altogether.

mcollina commented 8 years ago

Regarding bind(this), it is not the actual bind call that is slow, it is the bound function.
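
For illustration, a sketch of that distinction (hypothetical names; on the V8 of that era, invoking a function produced by Function.prototype.bind was slower than invoking a plain closure):

```js
'use strict'

const EventEmitter = require('events')

class Connection {
  constructor () { this.bytes = 0 }
  onData (chunk) { this.bytes += chunk.length }
}

const source = new EventEmitter()
const conn = new Connection()

// Pattern being discouraged: every 'data' event calls a bound function.
source.on('data', conn.onData.bind(conn))

// Alternative: a closure (here an arrow function) that captures the instance.
source.on('data', (chunk) => conn.onData(chunk))

source.emit('data', Buffer.from('a log line\n'))
console.log(conn.bytes) // counted twice, once per listener
```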

We might have drifted a bit from the goal, which is not to provide buffering, but rather to implement backpressure / flow control by discarding events that happen in the middle, and then to work around the "circular buffer" issue by handling small downtimes in the connection:

  1. establish a new connection
  2. pipe the origin stream into the connection stream with { end: false }
  3. if the connection fails, then call resume() on the origin stream
  4. when the connection comes back, call pause() on the origin stream and go back to 2.

Step 3 can be delayed by 5-10 seconds to avoid losing too many lines.
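
A rough sketch of that cycle, assuming process.stdin as the origin stream, a plain net.Socket as the connection, and illustrative delays; not the pino-socket implementation:

```js
'use strict'

const net = require('net')

function start (origin, host, port) {
  const socket = net.createConnection({ host, port }, () => {
    // (4) + (2): the connection is (back) up, so stop free-flowing and
    // pipe the origin into it without ending the destination
    origin.pause()
    origin.pipe(socket, { end: false })
  })

  socket.on('error', () => { /* a 'close' event will follow */ })

  socket.on('close', () => {
    origin.unpipe(socket)
    // (3) resume the origin after a grace period so the producer is not
    // blocked; lines emitted while disconnected are simply discarded
    setTimeout(() => origin.resume(), 5000)
    // (1) establish a new connection and repeat the cycle
    setTimeout(() => start(origin, host, port), 1000)
  })
}

start(process.stdin, '127.0.0.1', 514)
```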

jsumners commented 8 years ago

I'll look into that. I can see some other improvements given that methodology.

jsumners commented 8 years ago

Okay, pino-socket now directly pipes the incoming messages to the destination socket. Silly buffering and CEE formatting have been removed (turns out more was needed for my end goal with that anyway). Currently, it's pausing stdin while a reconnect is being attempted and resuming it on success. A good way to delay pausing isn't jumping out at me right now.

The benchmark script should be ignored; it may be scrapped altogether.

This is all still only on the develop branch.

mcollina commented 8 years ago

Some suggestions

  1. don't pause stdin in https://github.com/jsumners/pino-socket/blob/develop/lib/TcpConnection.js#L78
  2. you can avoid piping and inheriting from Writable by calling input.pipe(dest, { end: false }); when dest is disconnected, you can input.unpipe(dest) and then call input.resume() to avoid blocking. When the connection comes back, you can call input.pause() and then input.pipe(dest, { end: false }) to restart the cycle (see the sketch after this list)
  3. maybe you can just tcpConnection(dest) or udpConnection(dest).
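
A hedged sketch of suggestions 2 and 3 combined: a hypothetical tcpConnection() factory that hands back the raw socket so the caller pipes into it directly, with no intermediate Writable (none of these names are pino-socket's actual API):

```js
'use strict'

const net = require('net')

// Suggestion 3 (hypothetical): just return the socket.
function tcpConnection (opts) {
  return net.createConnection({ host: opts.address, port: opts.port })
}

// Suggestion 2, spelled out against that socket.
const input = process.stdin
const dest = tcpConnection({ address: '127.0.0.1', port: 514 })

dest.on('connect', () => {
  input.pause()
  input.pipe(dest, { end: false })
})

dest.on('close', () => {
  input.unpipe(dest)
  input.resume() // keep the producer flowing; lines are dropped meanwhile
})

dest.on('error', () => { /* 'close' follows; a real transport would reconnect */ })
```
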
jsumners commented 8 years ago

That clears up a suspicion I had about your last post -- dest is literally the connection. There is no way to pause it on disconnection because at that point it no longer exists. Which means the only stream available for pausing/resuming is stdin (and I'm not a fan of pausing it either). So I'm not sure what the solution is to this problem. It'd be very nice to rely on the streams API internal buffering, but I don't see how it can be accomplished.

mcollina commented 8 years ago

@jsumners build a small prototype just with streams to try this out; there is some complexity here that makes things harder than they could be. You want to build the Writable side of https://github.com/mcollina/never-ending-stream (discarding data that is being pushed through when the destination is offline).
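
For illustration, a minimal guess at what "the Writable side of never-ending-stream" could look like here: a Writable whose _write forwards to the current socket when connected and silently discards otherwise. Entirely hypothetical, not taken from that module:

```js
'use strict'

const { Writable } = require('stream')
const net = require('net')

class DiscardingTcpWritable extends Writable {
  constructor (host, port) {
    super()
    this.socket = null
    this._connect(host, port)
  }

  _connect (host, port) {
    const socket = net.createConnection({ host, port }, () => {
      this.socket = socket
    })
    socket.on('error', () => {})
    socket.on('close', () => {
      this.socket = null
      setTimeout(() => this._connect(host, port), 1000) // retry later
    })
  }

  _write (chunk, encoding, callback) {
    if (this.socket) {
      this.socket.write(chunk) // forward while online
    }
    callback() // always ack so the producer is never blocked
  }
}

process.stdin.pipe(new DiscardingTcpWritable('127.0.0.1', 514))
```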

jsumners commented 8 years ago

Would you have time to submit a PR with the solution you have in mind?

mcollina commented 8 years ago

Not for a couple of weeks :(. I'm over-committed right now.

jsumners commented 8 years ago

Understood. I'll try to get to it before then, but I'm a bit swamped right now too.

jsumners commented 8 years ago

@mcollina give commit 296a9e1 a try.

jsumners commented 8 years ago

@mcollina regarding your line note, here's what is happening with the tcp connection:

process.stdin 
  ||
  ||
  \  ===> inputStream (a PassThrough)
  ||             ||
  ||             \  ===> outputStream (a Writable)
  ||
  \  ===> process.stdout (or whatever else it was piped to)
  ...
  ...

The inputStream is paused and resumed with disconnects and reconnects. The outputStream writes whatever comes into it to the tcp socket. The socket points to whatever the current connection is, and does not get written to while inputStream is paused.

That's what I intend to be happening, anyway.
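
A rough sketch of that topology with hypothetical names; the real code wires this up differently in detail, and note that a piped stream may not stay paused once its destination asks for more data:

```js
'use strict'

const { PassThrough, Writable } = require('stream')
const net = require('net')

let socket = null // points at whatever the current connection is

const outputStream = new Writable({
  write (chunk, encoding, callback) {
    if (socket) socket.write(chunk) // written only while connected
    callback()
  }
})

const inputStream = new PassThrough()
inputStream.pipe(outputStream)

// stdin keeps flowing to stdout (or whatever else it was piped to)
process.stdin.pipe(inputStream)
process.stdin.pipe(process.stdout)

function onDisconnect () {
  socket = null
  inputStream.pause() // pause the intermediate stream, not stdin
}

function onConnect (newSocket) {
  socket = newSocket
  inputStream.resume()
}

const sock = net.createConnection({ host: '127.0.0.1', port: 514 })
sock.on('connect', () => onConnect(sock))
sock.on('error', onDisconnect)
sock.on('close', onDisconnect)
```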

mcollina commented 8 years ago

The correct model for all of this would be to not have an intermediate stream, but rather to pipe / unpipe the original one. You will avoid a lot of overhead.

Sorry for the late reply, but I am traveling these days.

jsumners commented 8 years ago

The original stream is process.stdin. We both agree that pausing that stream is a bad idea. And pausing a stream is the only way to induce buffering so that data is less likely to be lost while the connection is re-established.

At this point the code is written to the best of my ability without becoming overly complex. If it doesn't solve the problem, or can be done better, I will have to wait on a PR. Otherwise I am willing to brand this as 1.0.0 and push it out.

mcollina commented 8 years ago

Do it :).

jsumners commented 8 years ago

1.0.0 is published. Thank you for the feedback and patience on this issue.