twitter / twemproxy

A fast, light-weight proxy for memcached and redis
Apache License 2.0

Why are some listening ports gone? #227

Open shibo0305 opened 10 years ago

shibo0305 commented 10 years ago

Hello, manjuraj: we use nutcracker as a Redis proxy. However, we sometimes find that several of the ports nutcracker is listening on are suddenly gone. It seems that nutcracker is no longer listening on those ports, and we have to restart it. Could you tell me the possible cause of this problem and how to track it down and solve it? Thanks very much!

sax commented 10 years ago

Are you using the latest version? We saw a similar issue, but it was fixed in January.

idning commented 10 years ago

this never happened for us.

is the process still alive? what's netstat output?

shibo0305 commented 10 years ago

@sax we are using version 0.2.4, and we plan to move to 0.3.0. Which version are you using? @idning yes, the process is still alive, but the netstat output shows that several ports are gone while the other ports are still listening.

sax commented 10 years ago

@shibo0305 0.2.4 definitely includes the fix for the problem we saw. Are you using any proxy software in front of twemproxy? Can you increase the log level of twemproxy and view any information in the logs when it closes down the affected sockets?

shibo0305 commented 10 years ago

@sax thx for the reply! We are not using any other proxy in front of nutcracker; we just use a Go client library called "redigo". The problem usually occurs suddenly and we haven't found a way to reproduce it. Have you found a way to reproduce it? We didn't configure logging before, so could you give me some advice, such as which log level is appropriate?

sax commented 10 years ago

@shibo0305 You can see here https://github.com/twitter/twemproxy/blob/master/src/nc_signal.c#L92-L95 that SIGTTIN ups the log level. https://github.com/twitter/twemproxy/blob/master/src/nc_log.h#L28-L39 is where you can see what the log levels do. It's been long enough that I don't remember what log level started demonstrating my problem, but it definitely showed up in 11. You might need to include the -o command line option if you're not already doing so or capturing stderr somewhere.
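For illustration, the signal can also be sent from a script instead of from a shell. This is only a minimal sketch: the helper name `raise_log_level` and its `steps` and `delay` parameters are made up for this example, and it relies on the behavior described above (each SIGTTIN raises the log level by one).

```python
import os
import signal
import time


def raise_log_level(pid, steps=1, delay=0.05):
    """Send SIGTTIN to `pid` `steps` times (hypothetical helper).

    Assumption from the thread: the proxy raises its log level
    by one each time it receives SIGTTIN.
    """
    for _ in range(steps):
        os.kill(pid, signal.SIGTTIN)
        # Pause briefly so consecutive signals are not merged into one.
        time.sleep(delay)
    return steps
```

Calling `raise_log_level(pid, steps=3)` with the nutcracker process ID would raise the level by three, assuming the process was started below the maximum level of 11.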

shibo0305 commented 10 years ago

@sax the log shows that "open too many files", and then the port is gone. I think this is a bug, isn't it?

sax commented 10 years ago

I have not had this problem with twemproxy, but it sounds like what happens when you reach your max file descriptor limit. How many connections do you have to twemproxy, and what is your max file descriptor limit (ulimit -n)? The default in many operating systems is fairly low.

I think it would be interesting to get more information. For instance, do you see many connections in TIME_WAIT when you run netstat -an? Is this an issue with your clients, a setting in your operating system, or a bug in twemproxy? http://stackoverflow.com/questions/880557/socket-accept-too-many-open-files
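One way to compare the two numbers asked about above is to read them for the running process. The sketch below is Linux-only and assumes the usual `/proc/<pid>/fd` and `/proc/<pid>/limits` layout; the function name `fd_usage` is hypothetical.

```python
import os


def fd_usage(pid):
    """Return (open_fds, soft_limit) for a process, read from Linux /proc.

    soft_limit is None when the limit is reported as "unlimited"
    or when the "Max open files" row is not found.
    """
    # Each entry in /proc/<pid>/fd is one open file descriptor.
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))

    soft_limit = None
    with open(f"/proc/{pid}/limits") as limits:
        for line in limits:
            if line.startswith("Max open files"):
                # Fields: "Max" "open" "files" <soft> <hard> <units>
                soft = line.split()[3]
                soft_limit = None if soft == "unlimited" else int(soft)
                break
    return open_fds, soft_limit
```

If `open_fds` is close to `soft_limit` for the nutcracker process, the file descriptor limit is the likely cause. Reading another user's process this way usually requires the same user or root.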

allenlz commented 10 years ago

As far as I know, when the max file descriptor limit is reached, twemproxy closes the proxy's listening port after the accept failure. See https://github.com/twitter/twemproxy/issues/97

PR: https://github.com/twitter/twemproxy/pull/232
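To make the failure mode concrete, here is a generic sketch of an accept step that treats running out of file descriptors as temporary and leaves the listening socket open. It is not the code from the PR above, which is written in C and may take a different approach. The names `try_accept` and `FD_EXHAUSTED` are invented for this example.

```python
import errno

# errno values that mean "out of file descriptors", not "listener is broken".
FD_EXHAUSTED = {errno.EMFILE, errno.ENFILE}


def try_accept(listener):
    """Accept one connection from `listener`.

    Returns the new connection, or None when the caller should simply
    retry later. The listener itself is never closed here.
    """
    try:
        conn, _addr = listener.accept()
        return conn
    except BlockingIOError:
        # Non-blocking listener with nothing pending.
        return None
    except OSError as exc:
        if exc.errno in FD_EXHAUSTED:
            # Too many open files: keep listening and try again once
            # some descriptors have been released.
            return None
        raise
```

The point of the design is that only errors which really invalidate the listener are raised. Descriptor exhaustion just skips this accept attempt, so the port stays open.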

bitthegeek commented 10 years ago

Are there any other apps running on your twemproxy server? We resolved a similar problem by isolating a rogue app (it was not closing its sockets properly) on a different server.

allenlz commented 10 years ago

Our twemproxy cluster is shared by several apps. Some apps maintain long-lived connections with the server, and we can't guarantee that those clients are bug-free. So we chose to fix the bug in twemproxy.