Closed davedoesdev closed 11 years ago
@davedoesdev good catch.
I've a couple of your issues to walk trough now. Do you fancy a pull-request with an added test?
@davedoesdev however a good node practice is to NEVER throw an exception.
So to speak, if the emit
calls throws, then the exception should be routed inside the error
event of the Connection
.
@davedoesdev Any chance you have a script to reproduce this?
It's pretty clear what the problem is though - the parsing code hasn't been updated properly to include that ${property}_and_length stuff. It's currently:
/* Parse protocol id */
protocol_and_len = parse_string(buf, len, pos);
packet.protocolId = protocol_and_len[0];
if(packet.protocolId === null) return null;
pos += protocol_and_len[1] + 2;
and it should be:
protocol_and_len = parse_string(buf, len, pos);
if (!protocol_and_len) return new Error('Can't parse protocol!');
packet.protocolId = protocol_and_len[0];
pos += protocol_and_len[1] + 2;
Bad bad bad.
I can still get this to happen in a mocha test but only if I stop mocha's uncaughtException handler and instead ignore the exception. Then the packet never gets reset and the puback gets interpreted as a publish. I'm not sure this is valid anyway - I think an uncaughtException should cause process exit.
@davedoesdev an uncaughtException should not cause the process to exit. You will lose all your concurrent connected clients.
Could you please try https://github.com/adamvr/MQTT.js/commit/814202301851842745df7765e2b328610edd83c1 and confirm it solves the problem?
In this way, if everything happen in the publish
event handler, it's jailed and it can't mess up the MQTTConnection.
That looks good because it will emit an "error" so the process will exit unless handled, right?
What part of parse#publish
is still raising an exception? I would have thought all of those moles would have been whacked by #94, or is it a different problem?
I just threw an exception in my handler
Yep, https://github.com/adamvr/MQTT.js/commit/814202301851842745df7765e2b328610edd83c1 fixed it. The test in there is basically the same as mine.
Reopening, as the commit still needs to be merged in master and released.
Try-catch? In the most called function in the library? :scream:
So the solutions as I see them are (in order of my personal preference):
process.on('uncaughtException', function() { if (client.stream.readable) client.parse(); } )
?I'm keen to experiment with these domain things. What do you guys reckon?
I think domains are the way to go :+1: , I'm using them almost everywhere and they are fine.
Not used domains yet. Have you instrumented the effect of try-catch?
Seems it makes a big difference so yeah I think domains look interesting. I wouldn't mess with uncaughtExceptions, seems a bit hacky.
No votes for 'encourage proper exception handling by breaking violently'? Okay, I'll have poke around domains.
Okay, here are the situations when this might come up:
Handler throws its own exceptions and deals with them itself - not our problem, doesn't cause us problems
client.on('publish', function(packet) {
try {
throw 'error!';
catch(e) {
//
}
});
Handler throws exception and it bubbles up to uncaughtException
- not (really) our problem, causes us to stop parsing until the next packet comes (or stream becomes readable again, somehow)
client.on('publish', function(packet) {
throw 'error!';
});
process.on('uncaughtException', function() { console.log('proceed!'); });
So as I see it, user thrown exceptions aren't our responsibility. If it hits the uncaughtException handler other parts of the app are going to be invalid states anyway and as the docs say
Don't use it, use domains instead. If you do use it, restart your application after every unhandled exception!
Am I missing anything here?
Naturally, it's not the same case with MqttConnection
s, since there's more to do there (restart timers, reconnect, etc.) but it seems to me that thrown exceptions should be handled as close as possible to where they were thrown from.
User-thrown responsibility are not our problem, but we have to provide a way for a user to specify the domain which should be attached to that connection/client. Supporting domains might just be an example were domains are used, to show how to do it.
Yes, let the exception go and let the user deal with it. I'm in the 'stop the process (or domain I guess) and restart it' camp but that's my choice.
Well, the real thing in node.js error handling is that you should never ever have to stop process, otherwise you are closing all the 10k connections :)
For servers yes I'm striving for that, but on client I like to use a supervisor. Even on the server, if something unrecoverable happens (like you have some state that's difficult to recover cleanly) then those 10k connections will just have to reconnect. Ideally the clients should be tolerant of frequent connection problems.
With regards to connection problems - they're swallowed now (since the whole automatic reconnection thing results in a million econnrefused and epipes) so they should be fairly tolerant now.
This means the only errors that can be produced by mqttclients are now only user errors (e.g. bad arguments) or bad server errors (e.g. can't parse messages).
If you think you're going to do either of those things, you can always wrap your mqttclients in domains:
var d = require('domain').create()
, client = require('mqtt').createClient();
d.add(client);
d.on('error', console.dir);
client.on('connect', ...);
Oh, and if you're throwing exceptions in event handlers (which you should only really be doing if you want to crash the app, in event of an unrecoverable error or something) it's on you to handle that as well, either through domains or wrapping the parts of the handler that throw in a try-catch block.
Nice discussion though! Thanks @davedoesdev @mcollina!
Seems a good conclusion. I do use exceptions to intentionally crash the app but leaving it up to the user is a good thing.
Having trouble coming up with a test because it needs a failure to expose this. What I did was:
I then got this exception:
Basically, the subsequent puback still had its cmd set to 'publish'. This is because the packet isn't set back to {} until after the handler returns (and because mine threw an exception, it never returned).
Changing these lines in Connection.prototype.parse:
to:
fixed this.