wandenberg / nginx-push-stream-module

A pure stream http push technology for your Nginx setup. Comet made easy and really scalable.

channels-stats number of subscribers #64

Closed comboy closed 11 years ago

comboy commented 11 years ago

When querying channels-stats, I noticed that the number of subscribers sometimes drops almost to zero, only to return to its "normal" value within a few seconds. Is this normal? What could be causing this? I did not notice any disconnects in my browser, but I'm not sure about other users. I'm using "eventsource|websocket". The "normal" number of subscribers is ~2.5k.

PS. This module is awesome, great features, implementation and documentation. Memory footprint is amazing.

comboy commented 11 years ago

This is what it looks like:

2013-03-27 21:08:30 +0000 subscribers: 2449
2013-03-27 21:08:31 +0000 subscribers: 2448
2013-03-27 21:08:32 +0000 subscribers: 2453
2013-03-27 21:08:33 +0000 subscribers: 2447
2013-03-27 21:08:34 +0000 subscribers: 2450
2013-03-27 21:08:35 +0000 subscribers: 2449
2013-03-27 21:08:36 +0000 subscribers: 2454
2013-03-27 21:08:37 +0000 subscribers: 2454
2013-03-27 21:08:38 +0000 subscribers: 2455
2013-03-27 21:08:39 +0000 subscribers: 2456
2013-03-27 21:08:40 +0000 subscribers: 600
2013-03-27 21:08:41 +0000 subscribers: 603
2013-03-27 21:08:42 +0000 subscribers: 606
2013-03-27 21:08:43 +0000 subscribers: 609
2013-03-27 21:08:44 +0000 subscribers: 709
2013-03-27 21:08:45 +0000 subscribers: 1015
2013-03-27 21:08:46 +0000 subscribers: 1312
2013-03-27 21:08:47 +0000 subscribers: 1737
2013-03-27 21:08:48 +0000 subscribers: 2002
2013-03-27 21:08:49 +0000 subscribers: 2164
2013-03-27 21:08:50 +0000 subscribers: 2286
2013-03-27 21:08:51 +0000 subscribers: 2347
2013-03-27 21:08:52 +0000 subscribers: 2364
2013-03-27 21:08:53 +0000 subscribers: 2377

It happens approximately every 40 seconds. In the browser I see a new request to /ev/ roughly every 40s, and on the server I get a ton of requests every 40s (probably grouped together after the nginx restart?). Sometimes the connection seems to be kept as it should be, and judging by the numbers there are about 500 such lucky users.
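For anyone who wants to spot these dips automatically, here is a minimal Python sketch that scans samples in the format shown above and reports sudden drops. The function names and the 50% threshold are assumptions made for illustration; they are not part of the module or of the original report.

```python
from datetime import datetime

# A few of the samples posted above.
SAMPLE = """\
2013-03-27 21:08:38 +0000 subscribers: 2455
2013-03-27 21:08:39 +0000 subscribers: 2456
2013-03-27 21:08:40 +0000 subscribers: 600
2013-03-27 21:08:41 +0000 subscribers: 603
"""


def parse_line(line):
    """Split '<timestamp> subscribers: <n>' into (datetime, int)."""
    stamp, _, count = line.partition(" subscribers: ")
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S %z"), int(count)


def find_drops(lines, threshold=0.5):
    """Yield (time, before, after) whenever the count falls by more than
    `threshold` (a fraction of the previous sample) between two samples.
    The 0.5 default is an arbitrary choice for this sketch."""
    previous = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        when, count = parse_line(line)
        if previous is not None and count < previous * (1 - threshold):
            yield when, previous, count
        previous = count


drops = list(find_drops(SAMPLE.splitlines()))
for when, before, after in drops:
    print(f"{when:%H:%M:%S} dropped from {before} to {after}")
```

Feeding it the full sample and comparing the timestamps of consecutive drops would be one way to check the "roughly every 40 seconds" observation.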

wandenberg commented 11 years ago

Please, send your configuration file. Are you using the subscriber connection ttl directive? Which value?

comboy commented 11 years ago

Here's the config https://gist.github.com/comboy/a988e4e38f6be28767f6

I've tried modifying almost all of the directives visible in it ;) I did not know about the subscriber connection ttl option (it does not seem to be listed).

Also I tried both nginx 1.2.6-r1 and 1.3.11

(You will notice my message template is different, I'm pushing JSONs there, but I doubt it's relevant)

comboy commented 11 years ago

some more data in case it could help by any chance, I'm clueless :/ (last nginx restart was ~15h ago, but many connected people leave browsers opened like forever)

comboy commented 11 years ago

One more hint: judging from the incoming data in the browser, the connection seems to get broken when a bigger amount of data is pushed (so this 40s interval may just be when the application emits a bigger chunk). The amount of data is ~130kB (it seems to vary sometimes). I tried turning off gzip and proxy_buffering, but no luck.

edit: this is definitely associated with a bigger data push; however, when I limited the amount of data pushed at once from 130kB to around 50kB, the same number of people still got disconnected.

wandenberg commented 11 years ago

Once the module receives a return code other than NGX_OK when writing to a socket, it closes the connection. This is probably happening on your server because of a small socket write buffer. Try increasing the write/read socket buffers to a size appropriate for your application (size/frequency of messages).
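To illustrate the kernel-side effect described above, here is a small Python sketch (not the module's code) showing that a non-blocking write larger than the socket send buffer is only partially accepted when the peer is not reading. It assumes a local `socket.socketpair()` as a stand-in for a subscriber connection; the 4096-byte buffer size is arbitrary, and real TCP connections account for buffers differently.

```python
import socket


def bytes_accepted(sndbuf_bytes, payload_size):
    """Return how many bytes one non-blocking send() accepts when the peer
    is not reading and the send buffer is capped near `sndbuf_bytes`."""
    writer, reader = socket.socketpair()
    try:
        # The kernel may round or double this value; it is only a hint.
        writer.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, sndbuf_bytes)
        writer.setblocking(False)
        try:
            return writer.send(b"x" * payload_size)
        except BlockingIOError:
            # Nothing fit at all: the write would have blocked.
            return 0
    finally:
        writer.close()
        reader.close()


payload = 130 * 1024  # roughly the message size mentioned in this thread
accepted = bytes_accepted(4096, payload)
print(f"accepted {accepted} of {payload} bytes")
```

A writer that treats anything short of a complete write as an error would drop the connection at this point, which would be consistent with subscribers disappearing right after a large message.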

comboy commented 11 years ago

You, Sir, are a wizard. I set sndbuf=32K and it works great! http://bitcoinity.org/markets is now working smoothly thanks to you. Maybe, apart from the PayPal donation button, you could add your bitcoin address to the readme? I'll gladly be the first one to test it :)

comboy commented 11 years ago

Unfortunately it still seems to be happening. With no sndbuf setting, the count drops to about 1k subscribers from the current ~7k. But even with it set (by the way, I have no idea how to estimate the right value; I even tried arbitrary values like 2048M), it still keeps dropping from 7k-something to 6k-something (a sudden drop, then they slowly reconnect).

Any hints on where to look for possible causes?

wandenberg commented 11 years ago

You will certainly have to tune your OS to support your application. You can get some clues here: https://gist.github.com/dctrwatson/0b3b52050254e273ff11. This is the configuration of another user of the module, which supports a large number of subscribers, channels and published messages.

comboy commented 11 years ago

Thanks a lot for these configs. Unfortunately even with these, as soon as some big message is pushed to everyone, some percentage of connections is lost. Until big message arrives everything is fine.

wandenberg commented 11 years ago

What parameters have you changed on this last test?


comboy commented 11 years ago

Sorry for responding so late. I tried basically every single parameter from these configs, adding them one by one, and removing those in my config that were not present in yours. It did not help. I was, however, in an OpenVZ container. I did monitor all the limits, and also modified sysctl on the host. No luck; I have no idea whether it's something in the OpenVZ networking stack or some screw-up on my side.

Anyway, I have now switched to a dedicated box for this, and cannot reproduce it anymore (yay!). It's a really amazing piece of software and outstanding work you have done. I've already recommended it to a few people. Thanks.

comboy commented 10 years ago

Hi,

I'm having some problems with the newest version of the module and a new nginx. With some older versions it works fine on another server (it needs a restart sometimes, but other than that it works great).

2013/12/13 11:16:25 [alert] 26188#0: worker process 26239 exited on signal 11
2013/12/13 11:16:25 [alert] 26188#0: shared memory zone "push_stream_module" was locked by 26239
2013/12/13 11:16:25 [notice] 26188#0: start worker process 26367
2013/12/13 11:16:25 [notice] 26188#0: signal 29 (SIGIO) received
2013/12/13 11:16:31 [notice] 26188#0: signal 17 (SIGCHLD) received
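As a rough aid for watching for this kind of crash, the Python sketch below pulls worker exits out of error-log text like the lines above and maps the signal number to its name. The regular expression and the function name are assumptions made for this example, and the number-to-name mapping depends on the platform the script runs on.

```python
import re
import signal

# Matches the "worker process <pid> exited on signal <n>" alert.
WORKER_EXIT = re.compile(r"worker process (\d+) exited on signal (\d+)")

LOG = """\
2013/12/13 11:16:25 [alert] 26188#0: worker process 26239 exited on signal 11
2013/12/13 11:16:25 [alert] 26188#0: shared memory zone "push_stream_module" was locked by 26239
2013/12/13 11:16:25 [notice] 26188#0: start worker process 26367
"""


def crashed_workers(text):
    """Return (pid, signal name) for every worker that died on a signal."""
    return [
        (int(pid), signal.Signals(int(sig)).name)
        for pid, sig in WORKER_EXIT.findall(text)
    ]


crashes = crashed_workers(LOG)
for pid, name in crashes:
    print(f"worker {pid} died on {name}")
```

On Linux, signal 11 is SIGSEGV, a segmentation fault in the worker process.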

I've been trying to change some config parameters but it's just turning random knobs for me since I don't know any nginx internals. I imagine I have a lot of very stupid settings in there already.

Is there any chance you could help me with that? I'd gladly pay for this help since it's very specific to my setup. Please just tell me your hourly rate or how we could work together. I can provide you with the config and full access to the server. I'd appreciate your help.

best regards, Kacper


wandenberg commented 10 years ago

Hi Kacper,

Of course I can try to help you. Try to generate a core dump when the problem happens (see http://wiki.nginx.org/Debugging for some instructions). Send me your configuration and the list of modules compiled into your nginx; "nginx -V" usually lists everything. If you want, send me access to your server to make it easier to help you.

About payment: if we solve the problem, you can make a donation to the project ;)

Regards, Wandenberg


comboy commented 10 years ago

I've sent you an e-mail, I'm not sure I got the proper one, if not please ping me at kacper.ciesla at gmail.com

thanks a lot