bathos / logentries-client

Logentries client (JS)
Other
3 stars 2 forks source link

Bad DNS or Bad API tokens cause silent failure, infinite queue growth #4

Closed markstos closed 2 years ago

markstos commented 9 years ago

I created the below script as "t.js" in the project directory to test what happens with a Bad API token and also what happens when DNS fails or the LogEntries service is down. To simulate the latter, I added an entry to my local /etc/hosts that maps the DNS calls to "127.0.0.1". I noticed some undesirable things happen:

  1. Despite immediately registering an error handler, no error is generated to tell me that my token is bad.
  2. The Bad DNS / failed generate an error either.
  3. Meanwhile, logging appears to be working and internally queued in a JavaScript array. I see no bounds on the array growth in the code, so while the code continues to silently fail, the Node process memory growth would continue to grow without bounds until the process crashed, or the OOM killer decides to kill some other process to save the OS itself from crashing.
  var LogentriesClient = require('./');

  // Here I am, giving you a bad token
  var logger = new LogentriesClient({
      token: 'BOOM'
  });

  // I immediate set up logging
  logger.on('error', function (error) {
    console.log("Oops, there was an error logging. %s",error);
  });

  // Now I'm sending some logs which should never go through
  console.log("before warning");
>>logger.warning('The kittens have become alarmingly adorable.')
>>logger.warning('The kittens have become alarmingly adorable.')
>>logger.warning('The kittens have become alarmingly adorable.')
>>logger.warning('The kittens have become alarmingly adorable.')
>>logger.warning('The kittens have become alarmingly adorable.')
>>logger.warning('The kittens have become alarmingly adorable.')
>>logger.warning('The kittens have become alarmingly adorable.')
>>logger.warning('The kittens have become alarmingly adorable.')
>>logger.warning('The kittens have become alarmingly adorable.')
>>logger.warning('The kittens have become alarmingly adorable.')
  console.log("after warning");

  // To give async tasks the benefit of a doubt, we pause a bit before quitting.
  // By this point, I would expect an error to be emitted.
  setTimeout(function () {
    process.exit();
  },5000);

To check the queue growth, I also patched the code like this to check the queue size:

diff --git a/lib/logger.js b/lib/logger.js
index 7b9cb14..a3db71b 100644
--- a/lib/logger.js
+++ b/lib/logger.js
@@ -152,6 +152,8 @@ Object.defineProperties(Logger.prototype, {
                value: function(log) {
                        this._queue.push(log);

+      console.log("Current queue length is: "+this._queue.length);
+
                        if (this._socket == 'connecting') return;

                        if (this._state == 'connected') return this._processQueue();
bathos commented 9 years ago

Good catch. The logic pattern for the net connection and message queue were lifted from le_node without much examination. Reviewing it, I realize the whole thing ought to be a write stream first of all, rather than reinventing the concept "part way". Then the only queue would be the native internal stream buffer, which will have a high water mark etc.

As for the tcp socket not passing error events to the Client object, that should be an easy fix. One thing I'm unclear on -- I really don't know network traffic -- is whether there's any means for us to confirm that a given chunk was sent successfully. My impression is that there really isn't -- it appears that all you can know is if the connection is open, authorized, failed, or disconnected.

markstos commented 9 years ago

Thanks for the fast response. I'm not sure about how to check for the successful data write either. My plan would always be to log both locally and remotely, so I have two chances to succeed.

bathos commented 9 years ago

I'll begin a more thorough review of the library structure on an upcoming weekend. This is the main area where it's lacking but it shouldn't be tough to improve on -- I just want it to be solid. So my plan is that the fixes for these issues -- and the refactoring necessity that they really hint at -- will bring us to v 1.0.0.

Feel free to open issues for any features or improvements you'd like to see, btw -- I'm in a vacuum over here.

bathos commented 9 years ago

I've begun the overhaul. A few things I've noticed so far:

Connection errors already were emitted on the client; I just didn't remember those handlers get registered in a method on the client and not in logger-connection.js. So I'm not really clear on why your error handler was not invoked. An error event occurs correctly when the DNS is bad for me, so I have to ask -- is it possible that your local port 80 was open? If so, there'd be no way to know it wasn't the correct endpoint. If not, perhaps you can tell me more about the test you did that produced this result.

However there were certainly problems with how the connection was handled. In the event of a timeout, which for some reason I had set to an overlong three minutes, we were retrying but without a proper exponential backoff. Plus, no matter how many times it timed out, it wouldn't have ever given up or emitted an event to indicate that there was a problem. This will be fixed. Timeout is for inactivity on the socket I just remembered, not connection timeout. The three minutes was intentional.

As for invalid tokens, I don't have good news. I've confirmed that there's never any response, not even if the data sent was garbage -- so the only way to discover that your token is bad is to not see logs coming through on Logentries. This isn't really acceptable since it means you can't test whether your logging system is actually working. I'm going to investigate whether there's some other data-returning endpoint that we could exploit to do confirmation of token validity before the first transmission attempt. But I think this might be a case where we need to open a new Trello ticket for the backend guys.

In the meantime I've extended the initialization sanity checks to include a regex pattern test on the token; they have a rather specific format so it's at least possible to rule out nonsense strings.

markstos commented 2 years ago

I presume this project is no longer in use and can be archived.

bathos commented 2 years ago

Yep. Well, at least in this location? I didn't realize it still existed under my name.

LogEntries was my first tech job - I haven't worked there in ~seven years. Context for this repo was that I'd originally made this package for a node project I was working on there myself. I hit various issues with what (at that time) had been the official logentries node client; guessing I was the first person there to try using it internally. To get stuff working despite that, I created this for that project specifically. Some months later it ended up being robust enough (well ... allegedly; I wouldn't bet on it now) to subsume / replace the previous "official" package. It should have been transferred to the org at that time, but I think maybe it got forked or something instead so this one's still sitting here.

I'd be surprised if it hasn't been rewritten (either literally or via sea change) again since that time, maybe more than once. I'm slightly afraid to even peek at the code here given how green I was.