pinojs / pino-socket

🌲 A transport for sending pino logs to network sockets
43 stars 14 forks source link

Possibility to use transport as a destination (Lambda context) #9

Closed OlivierCuyp closed 5 years ago

OlivierCuyp commented 5 years ago

Hi, we are trying to send our logs from a Lambda function directly to our Logstash. From what I read the transport are designed to be used as separate thread for performance purpose. In Lambda we only have one process. So I was thinking that maybe this transport could be split in a pino destination or a transport. I maybe completely mislead, but I don't see how else we could achieve our Lambda logging... Any advice would be welcome.

jsumners commented 5 years ago

This transport is designed for use in traditional server environments. If you want to send logs from AWS Lambda, you should have another Lambda that receives CloudWatch log events, parses them, and ships them to your destination.

OlivierCuyp commented 5 years ago

@jsumners I understand that transports are not designed to be a Pino destination. For now, we use Winston with the Http transport (which works as destination), but it is obviously not optimal. We thought it could be a better solution to use Pino with a tcp socket destination to ship logs to our Logstash.

I read about using a Lambda to ship logs to Logstash but this can have some side effects as mention in this article:

https://theburningmonk.com/2018/07/centralised-logging-for-aws-lambda-revised-2018/

For the better solution proposed by the article, we have no knowledge in Kinesis...

But from you experience, building/using a Pino destination to ship logs through a tcp socket is not a good idea ?

jsumners commented 5 years ago

I’m saying it’s not a possibility according to the knowledge that I have.

mcollina commented 5 years ago

I would recommend writing a custom stream that does just that, this is possibly a new module. However something as simple as the following could work:

const socket = require('net').connect(12345, 'something.com')
const logger = pino(socket)
logger.info('hello world')

However you'll need to handle reconnection because lambda functions could be suspended at any time.

OlivierCuyp commented 5 years ago

@mcollina thanks, something like that I was seeing. I'll make public repo with my attempt.

OlivierCuyp commented 5 years ago

Since you guys took care of most the reconnection issue and the buffering on the tcp. I just took the 2 libs and exposed a function that instantiate one or the other. This is just to show the idea but it 99% the same code:

https://github.com/OlivierCuyp/pino-socket-destination

I didn't have the time to test it but I believe it should work as a valid destination for pino. I'm still wondering on the default values of reconnect to false and reconnectTries to Infinity in the context of Lambda.

jsumners commented 5 years ago

Maybe submit a PR to this repo that adds an API instead of creating a new module?

I think it is generally a bad idea to handle shipping logs in this manner on AWS, though. Particularly in the context of a Lambda. Lambdas should be simple request processors. They shouldn’t be dealing with the request and log shipping at the same time. I am quite certain you will encounter issues doing so. However, I don’t have a better solution for handling logs in AWS than the one I have already mentioned.

OlivierCuyp commented 5 years ago

@jsumners & @mcollina I'm not sure using the libs of this module is the best idea. So I took some more time to play around with streams ... I tried to wrap a Writable stream that could handle reconnection without loosing message during connection loss.

In the end, I came to this :

const { Writable } = require('stream')

class TcpDestination extends Writable {
  constructor (options = {}) {
    super(options)
    this.host = options.host || 'localhost'
    this.port = options.port || 514 // Default Syslog port
    this.maxRetries = options.maxRetries || 0
    this.retryDelay = 1
    this.retryPreviousDelay = 0
    this.retryCount = 0
    this.socket = new net.Socket()
    this.queue = []
    this.connecting = false
    this.connected = false

    this._connect()
  }

  _write (chunk, encoding, callback) {
    this.queue.push({ chunk, encoding })

    if (this.connected) {
      this._send()
    }

    callback()
  }

  _connect () {
    this.connecting = true
    this.socket.connect(this.port, this.host)
    this._addListeners()
    // Detach from event loop to avoid process hanging
    this.socket.unref()
  }

  _reconnect () {
    const delay = this.retryDelay
    if (++this.retryCount < this.maxRetries) {
      setTimeout(() => {
        this.socket.removeAllListeners()
        this._connect()
      }, delay)
      // Fibonacci sequence
      this.retryDelay += this.retryPreviousDelay
      this.retryPreviousDelay = delay
    }
  }

  _addListeners () {
    this.socket.once('connect', () => {
      this.connecting = false
      this.connected = true
      this.retryCount = 0
      // Empty potential messages queued
      this._send()
    })
    this.socket.once('error', (_err) => {
      this.connected = false
      // Whatever the error is, we'll retry
      this._reconnect()
    })
    this.socket.once('end', () => {
      this._reconnect()
    })
  }

  _send () {
    while (this.queue.length) {
      const { chunk, encoding } = this.queue.shift()
      this.socket.write(chunk, encoding)
    }
  }
}

It is not prefect yet, but I played with it and it does the job. Do you think I should put more effort to make it a module or this is too far from the Pino philosophy ?

jsumners commented 5 years ago

I'm sorry. I do not have time to decode the class syntax into something I can parse and then dig through the logic.

davidmarkclements commented 5 years ago

@OlivierCuyp why not use a module that handles the reconn complexity? (like https://www.npmjs.com/package/reconnect-net)

I'd definitely like to see Pino have a strong opinion on lambda integration, particularly when it comes to transports. Thus far the advice has been a bit conflicting - but I'd encourage a separate module for now as you experiment to find the optimum strategy. I'll be happy to review your code if you ping on this thread, and if it's something we find works as a general solution to pino + lamda (and pending @mcollina and @jsumners inputs) I'd love to then bring your module as an official pino module.

jsumners commented 5 years ago

@davidmarkclements it'd have to be in-process "transports" if there were going to be a way to support them.

davidmarkclements commented 5 years ago

Yeah .. with pino multistream ?

On Fri, 30 Nov 2018 at 00:39, James Sumners notifications@github.com wrote:

@davidmarkclements https://github.com/davidmarkclements it'd have to be in-process "transports" if there were going to be a way to support them.

— You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub https://github.com/pinojs/pino-socket/issues/9#issuecomment-443034939, or mute the thread https://github.com/notifications/unsubscribe-auth/ABIrPIvwhc87vH6blzRgkFXxHZ_E-Eelks5u0HAUgaJpZM4Ytkd5 .

jsumners commented 5 years ago

@davidmarkclements it is my opinion that none of this belongs in an AWS Lambda function. I think that the second you reach for a long lived socket then you have broken the utility of Lambda. Lambdas should receive data, process it, and return a result. That's it.

davidmarkclements commented 5 years ago

Ok makes sense. What recommendation should pino make regarding lamda logging? Can logs simply be collected from stdout using lamda infra?

On Fri, 30 Nov 2018 at 01:14, James Sumners notifications@github.com wrote:

@davidmarkclements https://github.com/davidmarkclements it is my opinion that none of this belongs in an AWS Lambda function. I think that the second you reach for a long lived socket then you have broken the utility of Lambda. Lambdas should receive data, process it, and return a result. That's it.

— You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub https://github.com/pinojs/pino-socket/issues/9#issuecomment-443042499, or mute the thread https://github.com/notifications/unsubscribe-auth/ABIrPL-tkGPAkU8oaNoZ7z9P22e1FFQ9ks5u0HhpgaJpZM4Ytkd5 .

jsumners commented 5 years ago

Not “simply” but the article linked earlier in this thread describes how I deal with the situation (just not using Kenesis yet).

jsumners commented 5 years ago

Now this morning I have learned that AWS has added support for custom runtimes — https://docs.aws.amazon.com/lambda/latest/dg/runtimes-custom.html

In theory, this could be used to run transports in the Lambda environment.

OlivierCuyp commented 5 years ago

@jsumners & @davidmarkclements sorry for the delay. I have been a bit busy these days. Since Pino can take any Writable stream as destination I could use reconnect-net, I just need it to be unref so the process doesn't hang. Anyway, I believe I can do more simple than wrapping a socket stream in a write stream. I'll make some testing and let you know on this thread. For now, feel free to close it.

OlivierCuyp commented 5 years ago

On the implementation, I'm a bit struggling with this issue:

https://github.com/nodejs/help/issues/1645

OlivierCuyp commented 5 years ago

For people reading this issue, we end up using udp. It is easier to manage (since connectionless) and faster than tcp and Logstash handles it.

helper:

const dgram = require('dgram');
const { Writable } = require('stream');

const udpDestination = (address, port) => {
  const socket = dgram.createSocket('udp4');

  return new Writable({
    final () { socket.close(); },
    write (data, encoding, callback) {
      socket.send(data, 0, data.length, port, address);
      callback();
    }
  });
};

module.exports = udpDestination;

usage:

// ...
const destination = udpDestination(somePort, someAddress);
const logger = pino(destination);

@jsumners feel free to close the issue

OlivierCuyp commented 5 years ago

Additional info, I had to add socket.unref() in the helper to avoid process hanging...