alaingilbert / Turntable-API

Allows you to create bots for turntable.fm
http://alaingilbert.github.com/Turntable-API/
MIT License
318 stars 97 forks source link

Make disconnect detection more robust #225

Closed gizmotronic closed 11 years ago

gizmotronic commented 11 years ago

@DubbyTT's addition of error handling addressed a major reason why it was hard to handle disconnections. These changes address some of the situations that I discovered come up when working on my own similar changes:

What follows is a very simple automatic reconnect example that doesn't require a watchdog.

var Bot    = require('ttapi');
var AUTH   = 'xxxxxxxxxxxxxxxxxxxxxxxx';
var USERID = 'xxxxxxxxxxxxxxxxxxxxxxxx';
var ROOMID = 'xxxxxxxxxxxxxxxxxxxxxxxx';
var bot    = new Bot(AUTH, USERID);
var error  = false;

var connect = function (roomid) {
  error = false;
  console.log('connect');
  bot.roomRegister(roomid, function (data) {
    console.log('registered = ' + (data.success ? 'true' : 'false'));
  });
};

var onError = function () {
  if (!error) {
    error = true;
    console.log('onError');
    setTimeout(function () {
      connect(ROOMID);
    }, 5000);
  }
};

bot
  .on('ready', function (data) {
    console.log('ready');
    connect(ROOMID);
  })
  .on('roomChanged', function (data) {
    console.log('roomChanged', data.room ? data.room.name : 'unknown');
  })
  .on('error', function (data) {
    console.log('error', data);
    onError();
  })
  .on('speak', function (data) {
    if (data.text.match(/^\/hello$/)) {
      bot.speak('Hey! How are you @' + data.name + '?');
    }
  });
ghost commented 11 years ago

DubbyTT's code looked fine the way it was, it didn't have to be changed.

technobly commented 11 years ago

@gizmotronic I think you have presented your changes well. I like some of what you've done, but I don't think this change is necessarily better than what we already have. In fact I prefer what we have for several reasons.

gizmotronic commented 11 years ago

Greetings,

Regardless of all of this, there are several things addressed here that I feel are quite important.

  1. The interval timer must be cleared in roomRegister. If you don't, you will create a new interval timer each time you register in another room.
  2. The error handler should be cleared before closing an existing websocket in roomRegister. If you don't, you'll emit a spurious error event in a variety of common circumstances.
  3. An error should be emitted when the server responds with something that JSON.parse() can't interpret. This frequently happens during both planned and unplanned Turntable outages.
  4. The whichServer http error handling as it currently stands just logs a message instead of relaying the Error object up the chain. It's useful to be able to get a backtrace when debugging issues. This also means that you don't need to wait for a watchdog timer to fire if you want to handle it right away.
  5. For advanced users, it's handy to be able to configure the presence ping time and, provided this change is accepted, to configure how long before a "no response" error event is emitted.

Finally, you should understand that I've had a watchdog-based implementation to do this for a very long time, and didn't require a single change to TTAPI to accomplish that feat. All you need to do in your watchdog routine is to check bot.lastActivity and make sure it hasn't been excessively far in the past. Since the data is readily available - just not documented - there's no need for an event to be emitted or responded to. (But, note that I didn't touch the 'alive' event that you added.)

technobly commented 11 years ago

@gizmotronic I'm pretty sure nobody besides you and I know much of what the hell we are talking about here... and I will add that I'm pretty sure you don't understand fully how my code works... AND I have some more comments about your example code... BUUUUT since I don't feel like typing up something to address your lengthy response... you win! :tongue:

gizmotronic commented 11 years ago

Actually, I was gonna say, I see that you're handling the wserror event, which I re-emitted as an error event. If I just changed on('wserror') to on('error') in your existing example, we'd both be pretty happy. Wouldn't we?

samuri51 commented 11 years ago

i think bot.('wserror') is more descriptive of what's happening than bot.on('error'), that could literally mean anything given the context. in the websocket api you have 'error', but there you know your dealing specifically with websocket's not turntable bot's... didn't see anything wrong with current way its being handled. i think the option for it to be handled programatically rather than directly by the bot is fine. plus it allows the programmer to handle it the way they want to. rather be restricted by a single implementation

gizmotronic commented 11 years ago

The errors are not all websocket errors. You can inspect the Error object to see the specific source of the error, if that's important to you. Also, I deliberately chose the 'error' event because it has special handling.

technobly commented 11 years ago

After looking at this more, the benefit of these tweaks to the setInterval don't seem needed. The only way you are going to lag out after 120 seconds is if you are not connected to turntable.fm, and if that happened it threw some websocket errors a long time ago. You might as well start trying to reconnect, and stop trying when you are connected again. The current implementation does this. What you have submitted has compiler issues, possibly due to the old coffee-script version you are using... and with the constant changes and lack of testing I'm going to close this for now unless anyone besides @gizmotronic has tested this and feels it's working and provides value.

gizmotronic commented 11 years ago

You're wrong; a transparent proxy issue can easily create a situation where the websocket doesn't throw an error, and yet you also don't get responses. I wrote the fail-safe code because I don't live in an ideal world, apparently unlike you. Also, the version of CoffeeScript I'm using happens to be exactly the same one that Alain has been using; I would prefer to use a newer version, but I don't feel it's my call to make.

technobly commented 11 years ago

Orly, http://i.imgur.com/mueoLce.png

gizmotronic commented 11 years ago

rly

If he's updated, I'll be more than happy to. I downgraded in order to submit pull requests #164 and #173. But, I'd love for you to point out the compilation problem since I'm clearly too thick to see any functional difference.

gizmotronic commented 11 years ago

You know what - part of this is my fault for assuming that everyone sees the wisdom in splitting functionally distinct changes into separate commits. Another place I failed was putting needed bug fixes into the same pull request as an enhancement. I'm splitting out the bug fixes - one of which was pre-existing but exacerbated by the reconnect change (the interval timer issue), and the other of which was introduced by the same - and committing them directly. I'll refactor and re-submit the enhancement as a new pull request.

By the way, this does not nullify my previous request to be shown the compilation problem. I'd really like to know what you see (you've helped me before, so I don't doubt your skill).

technobly commented 11 years ago

It was just compiling differently... just look at the diffs. Stuff you didn't change was being changed. There's no reason not to compile with v1.6.2 since that's what was used on the last two commits of bot.js.

I would agree, start with your bug fixes.

Then move on to your feature proposals.