twitter / twemproxy

A fast, light-weight proxy for memcached and redis
Apache License 2.0

Twemproxy always has 'ESTABLISHED' TCP connections that in fact don't exist. #329

Open deep011 opened 9 years ago

deep011 commented 9 years ago

When I run "netstat" on the twemproxy machine, I see many 'ESTABLISHED' TCP connections, some of which cannot be found on the client machine. That means the client has closed those TCP connections, but the matching connections on the twemproxy side are still alive and in the 'ESTABLISHED' state. How can I deal with this problem?

manjuraj commented 9 years ago

Twemproxy maintains persistent server connections. I believe all those established connections are server connections. Try: "ss -ta"

deep011 commented 9 years ago

No, they are not server connections. On the client machine there are only several hundred TCP connections to twemproxy, but the twemproxy machine has thousands of 'ESTABLISHED' TCP connections from the client. In front of the twemproxy instances, we use lvs for load balancing and high availability.

charsyam commented 9 years ago

@deep011 how about modifying net.ipv4.tcp_keepalive_time?

deep011 commented 9 years ago

@charsyam Does twemproxy enable tcp_keepalive? In the twemproxy source code, I couldn't find where this socket option is set.

charsyam commented 9 years ago

@deep011 twemproxy doesn't have a tcp_keepalive option, but maybe you can try this: http://libkeepalive.sourceforge.net/

charsyam commented 9 years ago

@deep011 I will make a patch for this soon.

deep011 commented 9 years ago

@charsyam Will it add setsockopt(tcp_keepalive) for the client socket?
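For readers unfamiliar with the socket option being discussed, here is a minimal sketch of turning keepalive on for a single socket. twemproxy itself is written in C; the Python below only illustrates the SO_KEEPALIVE option, and the function names are hypothetical, not taken from twemproxy's code or from the patch.

```python
import socket


def enable_keepalive(sock: socket.socket) -> None:
    """Ask the kernel to send keepalive probes on this socket when it is idle."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)


def keepalive_enabled(sock: socket.socket) -> bool:
    """Report whether SO_KEEPALIVE is currently set on the socket."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
```

In a proxy, a call like enable_keepalive(conn) would typically be made right after accept() returns the client socket. With only this option set, the probe timing comes from the system-wide defaults such as net.ipv4.tcp_keepalive_time.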

charsyam commented 9 years ago

@deep011 Could you test this version? https://github.com/charsyam/twemproxy/tree/feature/KEEPALIVE

And add the "tcpkeepalive: true" option to your conf file (the yml file). Thank you.

manjuraj commented 9 years ago

TCP keep-alive has nothing to do with connections being in established state. - https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=what+is+tcp+keepalive

@deep011 do the following:

If you see a lot of connections in TIME_WAIT state, then: https://github.com/twitter/twemproxy/blob/master/notes/socket.txt#L61

deep011 commented 9 years ago

@manjuraj I ran "ss -ta" and found the same as before. Then I suspected the problem came from lvs, because if I let the client connect to twemproxy directly, the number of 'ESTAB' TCP connections on the client machine and the number on the twemproxy machine are always equal. So I upgraded the lvs machine from centos5.8 to centos6.6, and the problem was solved.

idning commented 9 years ago

A friend of mine encountered the same problem.

The reason may be this:

You will see the reason when you run netstat -s on the LVS machine.

This problem only happens when the client side uses short connections.

We could add an idle_timeout config to twemproxy and close a connection when it has been idle.
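The idle_timeout idea can be sketched as a small bookkeeping structure: record when each connection was last active and periodically close the ones that have been silent too long. This is only an illustration of the proposal. The class name, method names, and the injectable clock are assumptions made for the example, and nothing here reflects an existing twemproxy option.

```python
import time
from typing import Callable, Dict, Hashable, List


class IdleReaper:
    """Tracks last-activity times and reports connections idle past a timeout."""

    def __init__(self, idle_timeout_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.idle_timeout_s = idle_timeout_s
        self._clock = clock  # injectable so the logic can be tested without sleeping
        self._last_seen: Dict[Hashable, float] = {}

    def touch(self, conn_id: Hashable) -> None:
        """Call whenever data is read from or written to the connection."""
        self._last_seen[conn_id] = self._clock()

    def forget(self, conn_id: Hashable) -> None:
        """Call when a connection is closed for any reason."""
        self._last_seen.pop(conn_id, None)

    def expired(self) -> List[Hashable]:
        """Return the connections whose idle time has reached the timeout."""
        now = self._clock()
        return [c for c, seen in self._last_seen.items()
                if now - seen >= self.idle_timeout_s]

    def reap(self, close: Callable[[Hashable], None]) -> List[Hashable]:
        """Close every expired connection via the supplied callback."""
        victims = self.expired()
        for conn_id in victims:
            close(conn_id)
            self.forget(conn_id)
        return victims
```

An event loop would call touch() on every read or write and reap() from a periodic timer. Unlike keepalive, this closes idle connections even when the peer is still alive, so clients using long-lived connections would need to reconnect.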

manjuraj commented 9 years ago

@idning agree with idle_timeout config

idning commented 9 years ago

in this article http://tldp.org/HOWTO/TCP-Keepalive-HOWTO/overview.html, keep-alive can be used to

In this LVS case, keep-alive will help with both of these targets:

  1. If the connection is closed by LVS, twemproxy can detect this via keepalive.
  2. Periodically sending an ACK between proxy and client (over LVS) may make LVS think that it's an active connection, so LVS will not close it.

So we need to add tcp-keepalive.
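Both targets depend on how often probes are sent. A hedged sketch of per-socket tuning on Linux follows, in Python for illustration only. The function names and default values are assumptions for the example and are not what twemproxy uses; the TCP_KEEPIDLE, TCP_KEEPINTVL, and TCP_KEEPCNT options are Linux-specific, so the code checks for them before use.

```python
import socket


def tune_keepalive(sock: socket.socket, idle_s: int = 60,
                   interval_s: int = 10, probes: int = 3) -> None:
    """Enable keepalive and, where the platform allows, override its timers."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Seconds of idleness before the first probe is sent.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    # Seconds between successive unanswered probes.
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    # Number of unanswered probes before the kernel drops the connection.
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)


def dead_peer_detection_seconds(idle_s: int, interval_s: int, probes: int) -> int:
    """Rough upper bound on how long a vanished peer goes unnoticed."""
    return idle_s + interval_s * probes
```

For the second target, the idle time would need to be shorter than whatever idle expiry the load balancer applies; that expiry value is deployment-specific and is not stated in this thread.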

charsyam commented 9 years ago

@idning see this https://github.com/twitter/twemproxy/pull/330

deep011 commented 9 years ago

Thank you, everyone. ;)

soarpenguin commented 9 years ago

@deep011 Did changing the lvs system version solve your problem?

I have the same problem.

We use lvs (DR mode) for load balancing and high availability. On the client (10.105.28.193; 10.103.188.109 is the VIP):
$ ss -tan | grep 10.103.188.109
ESTAB 0 0 10.105.28.193:7861 10.103.188.109:6378

On the nutcracker (10.103.188.109:6378), all 130 connections are ESTABLISHED:
$ ss -tan | grep ESTAB | awk '{print $NF}' | awk -F: '{print $1}' | sort | uniq -c | grep 10.105.28.193
130 10.105.28.193

@charsyam https://github.com/charsyam/twemproxy/tree/feature/KEEPALIVE does not work.

soarpenguin commented 9 years ago

in the client (10.105.28.193): (10.103.188.109 is VIP)
$ ss -tan | grep 10.105.28.193
ESTAB 0 0 10.103.188.109:6378 10.105.28.193:22348
ESTAB 0 0 10.103.188.109:6378 10.105.28.193:22905
ESTAB 0 0 10.103.188.109:6378 10.105.28.193:4266
ESTAB 0 0 10.103.188.109:6378 10.105.28.193:39525
ESTAB 0 0 10.103.188.109:6378 10.105.28.193:2136
ESTAB 0 0 10.103.188.109:6378 10.105.28.193:39964
ESTAB 0 0 10.103.188.109:6378 10.105.28.193:50179
ESTAB 0 0 10.103.188.109:6378 10.105.28.193:61623
ESTAB 0 0 10.103.188.109:6378 10.105.28.193:58761
ESTAB 0 0 10.103.188.109:6378 10.105.28.193:29345
ESTAB 0 0 10.103.188.109:6378 10.105.28.193:6215
ESTAB 0 0 10.103.188.109:6378 10.105.28.193:22085
ESTAB 0 0 10.103.188.109:6378 10.105.28.193:54271
ESTAB 0 0 10.103.188.109:6378 10.105.28.193:63394
ESTAB 0 0 10.103.188.109:6378 10.105.28.193:43591
ESTAB 0 0 10.103.188.109:6378 10.105.28.193:53810
ESTAB 0 0 10.103.188.109:6378 10.105.28.193:4045
ESTAB 0 0 10.103.188.109:6378 10.105.28.193:64603

The client program only opens one connection at a time. These connections do not in fact exist; the client has already closed.
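To compare the two sides repeatedly, the per-peer counting done with the awk pipeline earlier in this thread can also be scripted. The sketch below is a hypothetical helper, not part of twemproxy. It assumes "ss -tan" style text where the state is the first column and the peer address:port is the last, and it assumes IPv4 addresses.

```python
from collections import Counter


def count_established_by_peer(ss_output: str) -> Counter:
    """Count ESTAB lines per peer host in `ss -tan` style output."""
    counts: Counter = Counter()
    for line in ss_output.splitlines():
        fields = line.split()
        # Skip the header, blank lines, and sockets in any other state.
        if len(fields) < 5 or fields[0] != "ESTAB":
            continue
        peer = fields[-1]                # last column is peer address:port
        host = peer.rsplit(":", 1)[0]    # drop the port
        counts[host] += 1
    return counts
```

The input text could come from subprocess.run(["ss", "-tan"], capture_output=True, text=True).stdout. Running the same count on the client, keyed on the VIP, gives the number to compare against.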

deep011 commented 9 years ago

@soarpenguin Add my qq : 530166298

nanzhushan commented 9 years ago

I have met the same problem.

deep011 commented 9 years ago

Hi all. To summarize this issue: if you use an "lvs + twemproxy" setup and twemproxy does not have the tcpkeepalive option enabled, twemproxy will probably hold 'ESTABLISHED' TCP connections that in fact no longer exist. This is caused by the lvs "expire" mechanism. So, if you run lvs in front of twemproxy, enable twemproxy's tcpkeepalive option. This option has already been added in the newest master branch.

manjuraj commented 9 years ago

Thanks @deep011 for summarizing this issue

charsyam commented 9 years ago

@deep011 Thanks for your summary. I have a question about it.

In an earlier comment, you said "@charsyam https://github.com/charsyam/twemproxy/tree/feature/KEEPALIVE is not works."

But now you said "enable the twemproxy tcpkeepalive option".

Could you explain more? I think the code in that branch and the master version are almost the same :)

deep011 commented 9 years ago

@manjuraj This is what I should do.

@charsyam That's not what I said. I added the tcpkeepalive option to my twemproxy, and it works. ^-^

nanzhushan commented 9 years ago

Thanks, all. I am testing...

manjuraj commented 9 years ago

Should I close this issue, now that the tcpkeepalive option is available?

Also, would someone like to send a pull request to update https://github.com/twitter/twemproxy/blob/master/notes/recommendation.md with this recommendation for setting tcpkeepalive? Thanks!

deep011 commented 9 years ago

@manjuraj OK, I will send a pull request; please review it.

nanzhushan commented 9 years ago

@manjuraj @deep011 @all I have tested tcpkeepalive and it works, but I have met another problem. I use webbench to test performance. If I use "webbench -c 500 -t 30 http://192.168.2.95/set.php" to connect through twemproxy, it reports "29 failed". If I use "webbench -c 800 -t 30 http://192.168.2.95/set1.php" to connect to redis directly, all requests succeed. twemproxy proxies only one redis. (set.php connects to twemproxy and sets a key; set1.php connects to redis and sets a key.) This means performance drops when going through twemproxy. Why is that?

Marcus366 commented 9 years ago

@knight-zhou I think it is better to open a new issue to discuss the performance problem. Also, could you describe your test environment in more detail, such as the twemproxy config file? I ran a simple test like yours with all servers on localhost, and it turned out there was no apparent performance reduction.

edennis-sge commented 9 years ago

I am not using lvs, but I am having this same problem. In my case, it looks like connections made by clients that run in docker containers on CoreOS are ending up in the state described here: connections are shown as ESTABLISHED on the twemproxy host side, with no corresponding connections listed on the client side.

@idning your idea of adding idle_timeout seems like a good one.

alswl commented 8 years ago

Met it on a physical machine, CentOS 6.

The server has 8k established connections, but the client (one of 20 instances) only keeps 20+ connections.

shnwang commented 8 years ago

I also have the same problem.

deep011 commented 8 years ago

@wshn13 try twemproxies :)

yongman commented 7 years ago

Thanks everyone!