hellozedan / node-xmpp-bosh

Automatically exported from code.google.com/p/node-xmpp-bosh
node-xmpp-bosh crash #5

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
After receiving a lot of requests from a BOSH client within ~2 seconds (~10 
requests), node-xmpp-bosh crashed.

It was running behind an nginx proxy.

Here are the logs:

node.js:116
        throw e; // process.nextTick error, or 'error' event on first tick
        ^
Error: ENOTCONN, Transport endpoint is not connected
    at Socket._shutdownImpl (net.js:147:18)
    at Socket._shutdown (net.js:816:14)
    at Socket.flush (net.js:498:12)
    at Socket.end (net.js:835:14)
    at EncryptedStream.onend (stream.js:34:12)
    at EncryptedStream.emit (events.js:39:17)
    at Array.<anonymous> (tls.js:560:22)
    at EventEmitter._tickCallback (node.js:108:26)

Original issue reported on code.google.com by vanaryon on 13 May 2011 at 6:28

GoogleCodeExporter commented 8 years ago
Hello Vanaryon, Is there any way I can reproduce this at my end? Like a small 
test file (or driver)?

Original comment by dhruvb...@gmail.com on 13 May 2011 at 7:29

GoogleCodeExporter commented 8 years ago
Mhh, I did it using Jappix, manually adding lots of comments at the same time.
You can get it here: https://project.jappix.com/ and use the jappix.com XMPP 
server (which has Pubsub support for comments), then log in, post a microblog 
entry and send a lot of comments to overload the network.

Then, it crashed. But this might be the reason for the crash: node-xmpp-bosh was 
running on a 450 MHz server :)

Original comment by vanaryon on 13 May 2011 at 7:56

GoogleCodeExporter commented 8 years ago
Could you please try commenting out this line:
    this._sock.removeAllListeners('error');

on line 150 of xmpp-proxy.js and check if you are still able to crash it?

Original comment by dhruvb...@gmail.com on 13 May 2011 at 8:02

GoogleCodeExporter commented 8 years ago
Mhh, I could not reproduce it once that line was removed.

It might be due to that, but I cannot be sure the bug was due to a large number 
of requests. This time I sent even more: one packet every 5 ms!

And all worked fine :)

And, just to let you know, I think I will replace Punjab with your great BOSH 
server soon for the Jappix.com service ;)

Original comment by vanaryon on 14 May 2011 at 2:43

GoogleCodeExporter commented 8 years ago
Great! I've committed that change (rev. #202). I think that line was left behind 
when I refactored the code: I forgot to keep handling the 'error' event that I 
was using earlier. I am now using the 'close' event, which is guaranteed to be 
called once no matter what.

The bug was basically because the server was closing the connection (you seem 
to be using TLS) and the BOSH server wasn't handling that case very gracefully. 
Though I am surprised that it wasn't caught in my deployment (about 100 users 
online).
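
To illustrate the failure mode described above, here is a minimal sketch using a plain `EventEmitter` as a stand-in for the upstream socket. It is not the code from xmpp-proxy.js; `makeSock` and the variable names are made up for the example. It only shows that emitting 'error' on an emitter with no 'error' listener throws, while keeping a listener and cleaning up on 'close' does not.

```javascript
const { EventEmitter } = require('events');

// Stand-in for the upstream socket (hypothetical helper). A real net.Socket
// is also an EventEmitter, so 'error' events follow the same rule.
function makeSock() {
  return new EventEmitter();
}

// Case 1: the 'error' listeners have been stripped, which is what a call to
// removeAllListeners('error') does.
const bare = makeSock();
bare.on('error', () => {});
bare.removeAllListeners('error');

let crashed = false;
try {
  bare.emit('error', new Error('ENOTCONN'));
} catch (e) {
  // In a running server the emit happens on a later tick, outside any
  // try/catch, so the exception would end the process instead.
  crashed = true;
}

// Case 2: an 'error' listener stays attached and cleanup is done on 'close'.
const handled = makeSock();
const seen = [];
handled.on('error', (e) => seen.push('error: ' + e.message));
handled.on('close', () => seen.push('close'));
handled.emit('error', new Error('ENOTCONN'));
handled.emit('close');

console.log(crashed, seen);
```

The try/catch is there only so the sketch can report the throw; it is not a suggested fix.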

Great to hear that you are thinking of using NXB for your service! I would be 
interested in why you want to make the switch - since I've heard a lot of good 
things about Punjab and I think it is the de-facto out-of-process BOSH server 
most people use these days. NXB was *mainly* written to support:
* Request & Response ACKs and
* Multiple Streams

Closing issue. Do re-open if you see the bug again.

Original comment by dhruvb...@gmail.com on 14 May 2011 at 4:40

GoogleCodeExporter commented 8 years ago
Yep, I was using TLS for my connection when the crash occurred.

Well, I want to switch mainly because Punjab does not allow Facebook Chat 
server connections and Openfire server connections, strangely. And when it 
worked, newly created Openfire accounts were unusable (unable to login through 
BOSH!). It was a key issue, or something like that.

Before using Punjab I was using my own JHB fork, Palladium. But it did not 
allow TLS connections with invalid certificates. I fixed this but there was an 
issue with sticky TLS connections. Anyway, it is Java-powered and I did not 
like it much :)

And Link Mauve (a JavaScript guy!) told me about node-xmpp-bosh. I heard it was 
lightweight (Punjab consumes 80 MiB of RAM for 50 sessions). I also checked 
whether it worked with Facebook & Openfire, and happily it does!

Finally, it is JS-powered, and I am also a JS guy :)

Thanks a lot for this work :)

By the way, I have a little question: does node-xmpp-bosh provide session 
pausing support? We would need it for Jappix Mini (https://mini.jappix.com/), 
because we are using ejabberd's BOSH for it and it makes the connection domain 
restricted to ours. Some users expressed their need to use their own server 
through our service ;)

Original comment by vanaryon on 14 May 2011 at 5:30

GoogleCodeExporter commented 8 years ago
Great! I hope you find NXB easy to work with!

As far as session pausing ( http://xmpp.org/extensions/xep-0124.html#inactive ) 
is concerned, NXB doesn't support it since, from my reading of the XEP, it's 
just a way to temporarily increase the inactivity period. NXB, on the other 
hand, does support the client setting the 'inactivity' period, so you could set 
it in the session creation request. You can control the maximum value of the 
inactivity period from the config file. However, I personally think it's a bad 
idea to set this value above 3 min (180-200 sec), since that would mean that 
memory at the BOSH server would be used to buffer up the user's packets.
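
As a rough sketch of what such a session creation request could look like, the snippet below builds a BOSH `<body/>` element with an 'inactivity' attribute. The xmlns, to, rid, wait, hold, ver and xml:lang attributes follow XEP-0124; sending 'inactivity' from the client is based only on the description above, and the helper names, the domain and the values are made up for illustration.

```javascript
// Escape a value for use inside a double-quoted XML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build a BOSH session creation <body/> (hypothetical helper).
// 'inactivity' is included only because this thread says NXB accepts it from
// the client; other BOSH servers may ignore it.
function sessionCreationBody({ to, rid, wait, hold, inactivity }) {
  const attrs = {
    xmlns: 'http://jabber.org/protocol/httpbind',
    to,
    rid,
    wait,
    hold,
    inactivity,
    ver: '1.6',
    'xml:lang': 'en',
  };
  const rendered = Object.keys(attrs)
    .filter((name) => attrs[name] !== undefined) // skip attributes not supplied
    .map((name) => `${name}="${escapeAttr(attrs[name])}"`)
    .join(' ');
  return `<body ${rendered}/>`;
}

// Example values only: ask for a 180 second inactivity period.
const body = sessionCreationBody({
  to: 'example.com',
  rid: 1573741820,
  wait: 60,
  hold: 1,
  inactivity: 180,
});
console.log(body);
```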

Original comment by dhruvb...@gmail.com on 14 May 2011 at 6:22

GoogleCodeExporter commented 8 years ago
Yep, I use it in Jappix Mini in some cases, but session pausing is really 
great, because when you change the page you're on, without any pause you can 
lose packets (like messages on a MUC).

Session pausing makes the pending packets arrive when the session is resumed, 
which is safer.

Original comment by vanaryon on 14 May 2011 at 6:42

GoogleCodeExporter commented 8 years ago
I see your point. However, setting a higher inactivity period (say 3min) will 
give you the same effect. You can navigate across pages, and as long as you 
come back (send the correct SID/RID pair) within 3 mins (or inactivity period), 
you will get all the buffered packets back. Does this solve your problem?
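
A minimal client-side sketch of that idea, assuming the page stores the SID, the last RID it used and a timestamp before navigating away. The function names and the record shape are hypothetical and not part of Jappix or NXB.

```javascript
// Hypothetical record saved before leaving a page.
// sid and rid come from the BOSH session; now is a timestamp in milliseconds.
function saveSession(sid, rid, now) {
  return { sid, rid, lastRequestAt: now };
}

// The session can be picked up only while the server still holds it, that is,
// while less than the inactivity period has passed since the last request.
function canResume(saved, now, inactivitySec) {
  return now - saved.lastRequestAt < inactivitySec * 1000;
}

// The next request carries the same sid and the next rid in sequence.
function nextRequestIds(saved) {
  return { sid: saved.sid, rid: saved.rid + 1 };
}

// Example values only: a 3 minute inactivity period.
const saved = saveSession('abc123', 1000, 0);
const afterTwoMinutes = canResume(saved, 2 * 60 * 1000, 180);
const afterFourMinutes = canResume(saved, 4 * 60 * 1000, 180);
console.log(afterTwoMinutes, afterFourMinutes, nextRequestIds(saved));
```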

Original comment by dhruvb...@gmail.com on 14 May 2011 at 6:46

GoogleCodeExporter commented 8 years ago
Oh, right. So I don't need to edit code at the client side, that's perfect :)

Original comment by vanaryon on 14 May 2011 at 6:50

GoogleCodeExporter commented 8 years ago
Exactly, you essentially get a higher inactivity for free - check the config 
file for "  default_inactivity_sec: 70". The default is 70 sec. You could bump 
it up for your deployment.
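
One way to picture how the default, the client's requested value and the configured maximum could interact is sketched below. This is not NXB's actual code: only `default_inactivity_sec: 70` is quoted in this thread, while `max_inactivity_sec` and `effectiveInactivity` are made-up names standing in for the configurable maximum mentioned earlier.

```javascript
// Only default_inactivity_sec (70) comes from the thread; max_inactivity_sec
// is a hypothetical key name for the configurable maximum.
const config = {
  default_inactivity_sec: 70,
  max_inactivity_sec: 180,
};

// Pick the inactivity period for a session: use the client's requested value
// when it is a positive number, otherwise the default, and never exceed the
// configured maximum.
function effectiveInactivity(requested, cfg) {
  const usable = Number.isFinite(requested) && requested > 0;
  const value = usable ? requested : cfg.default_inactivity_sec;
  return Math.min(value, cfg.max_inactivity_sec);
}

const results = [
  effectiveInactivity(undefined, config), // no value requested
  effectiveInactivity(120, config),       // within the maximum
  effectiveInactivity(600, config),       // above the maximum
];
console.log(results);
```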

Original comment by dhruvb...@gmail.com on 14 May 2011 at 7:04