Proposal, SO_REUSEADDR before connect

GoogleCodeExporter commented 9 years ago

I have proxies with more than 10-20k outbound connections. At peak times I
see errors like:
2010/02/03 21:24:49| commBind: Cannot bind socket FD 3892 family 2 to
0.0.0.0 port 0: (98) Address already in use

According to:
http://www.ibm.com/developerworks/linux/library/l-sockpit/

and according to the Linux man page:
"Linux will only allow port re-use with the SO_REUSEADDR option when this
option was set both in the previous program that performed a bind(2) to the
port and in the program that wants to re-use the port. This differs from some
implementations (e.g., FreeBSD) where only the later program needs to set the
SO_REUSEADDR option. Typically this difference is invisible, since, for
example, a server program is designed to always set this option."

There is also a system-wide solution (sometimes required in addition to the
socket option mentioned above):
net.ipv4.tcp_tw_recycle
net.ipv4.tcp_tw_reuse

Both of them help. It is probably a good idea to implement this in the code,
under an ifdef for Linux?

I guess in
comm_fdopen6(int new_socket,

the call
commSetReuseAddr(new_socket);
could be moved out so that it is always applied (by default it is set, I
guess, only on listening ports; I hope it is harmless in the case of UDP).
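
To make the proposal concrete, here is a minimal sketch of the call order being suggested, written against the plain sockets API rather than Lusca's own wrappers. The helper name open_outbound_socket is hypothetical and is not part of Lusca; the sketch only assumes an IPv4 TCP socket bound to 0.0.0.0 port 0, as in the log line above.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper, not Lusca code: create a TCP socket for an outgoing
 * connection, set SO_REUSEADDR, then bind to an ephemeral port.
 * Returns the descriptor, or -1 on failure. */
int open_outbound_socket(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Set the option before bind() so it is in effect when the kernel
     * picks the local port. */
    int on = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0) {
        close(fd);
        return -1;
    }

    struct sockaddr_in local;
    memset(&local, 0, sizeof(local));
    local.sin_family = AF_INET;
    local.sin_addr.s_addr = htonl(INADDR_ANY); /* 0.0.0.0, as in the log */
    local.sin_port = htons(0);                 /* let the kernel pick a port */

    if (bind(fd, (struct sockaddr *)&local, sizeof(local)) < 0) {
        close(fd);
        return -1;
    }
    return fd; /* the caller would connect() next */
}
```

Whether this actually avoids the "Address already in use" error depends on the kernel behaviour discussed later in the thread.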

Original issue reported on code.google.com by nuclear...@gmail.com on 3 Mar 2010 at 9:47

GoogleCodeExporter commented 9 years ago

Hm. Wait, so what you're saying is that Linux will benefit from having
SO_REUSEADDR set on -all- incoming and outbound connections?

Original comment by adrian.c...@gmail.com on 20 Mar 2010 at 8:53

GoogleCodeExporter commented 9 years ago

Yes, because on Linux a socket stays in TIME_WAIT for some time even for an
outgoing connection. This means the port cannot be reused during that time.
Maybe a knob in the config? As I understand it, enabling this option can in
some cases have negative security implications.
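
As a rough illustration of why TIME_WAIT matters at this scale, the arithmetic can be sketched as below. The numbers are assumptions, not measurements from this setup: a local port range of 32768-61000 (the usual Linux default around that time), a 60 second TIME_WAIT, and the worst case where every outbound socket is bound to its own local port and closed by the proxy side. The function name max_sustained_rate is hypothetical.

```c
/* Upper bound on new outbound connections per second when each connection
 * occupies one local port for hold_seconds after it is closed.
 * Returns -1 on invalid input. */
long max_sustained_rate(long port_lo, long port_hi, long hold_seconds)
{
    if (port_hi < port_lo || hold_seconds <= 0)
        return -1;
    long ports = port_hi - port_lo + 1; /* the range is inclusive */
    return ports / hold_seconds;        /* integer connections per second */
}
```

Under those assumptions, max_sustained_rate(32768, 61000, 60) gives 28233 ports / 60 s, which is about 470 new connections per second before bind() starts running out of ports.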

I guess it would also be good to mention in the manual or somewhere that
anyone who sees this error message should also take a look at:
net.ipv4.tcp_tw_recycle
net.ipv4.tcp_tw_reuse

Original comment by nuclear...@gmail.com on 20 Mar 2010 at 2:12

GoogleCodeExporter commented 9 years ago

SO_REUSEADDR is generally only useful where something bind()'s to a port for 
listening, and one needs to rebind quickly after a process restart or crash.

I have no evidence that this is at all useful for outbound connections. In 
fact, the only way I know to control that behaviour is to use the tw_recycle 
bits, or tweak the tcp fin/ack/syn timeouts.

Original comment by roelf.di...@gmail.com on 17 Oct 2010 at 5:58

GoogleCodeExporter commented 9 years ago

I can confirm that the net.ipv4.tcp_tw_recycle=1 setting is a general cure for 
this.

Original comment by roelf.di...@gmail.com on 4 Nov 2010 at 10:47

GoogleCodeExporter commented 9 years ago

tw_recycle is harmful with NAT. I have seen many situations where, once this
knob is set, clients behind NAT cannot work normally with the proxy (stalled
connections, or TCP connections failing to establish; I don't remember which,
I tried that recently).

My solution seems to have been a different parameter(?).
net.ipv4.tcp_orphan_retries was 0 by default; when I set it to 1, my problem
disappeared.

Original comment by nuclear...@gmail.com on 10 Nov 2010 at 2:15

GoogleCodeExporter commented 9 years ago

tcp_orphan_retries didn't work for me.

Though /proc/sys/net/ipv4/tcp_tw_reuse works, although I get a lot of:

TCP: time wait bucket table overflow
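
When comparing setups like these, it can help to read the current values of the knobs directly. The following is a minimal sketch of a helper that reads one integer from a /proc/sys file; the name read_sysctl_long is hypothetical. As far as I know, the overflow message above is tied to the net.ipv4.tcp_max_tw_buckets limit, but treat that as an assumption to verify against your kernel.

```c
#include <stdio.h>

/* Hypothetical helper: read one integer from a /proc/sys file.
 * Returns 0 and stores the value in *out on success, -1 otherwise. */
int read_sysctl_long(const char *path, long *out)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    int ok = (fscanf(fp, "%ld", out) == 1); /* first integer in the file */
    fclose(fp);
    return ok ? 0 : -1;
}
```

For example, read_sysctl_long("/proc/sys/net/ipv4/tcp_tw_reuse", &v) would store the current tcp_tw_reuse setting in v on a Linux host.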

Original comment by robertpi...@gmail.com on 10 Nov 2010 at 9:00

GoogleCodeExporter commented 9 years ago

By the way, I don't have that many connections, mgr:info shows this:

Squid Object Cache: Version LUSCA_HEAD-r14756
Start Time: Wed, 10 Nov 2010 19:41:29 GMT
Current Time:   Wed, 10 Nov 2010 21:06:06 GMT
Connection information for Squid:
    Number of clients accessing cache:  4465
    Number of HTTP requests received:   3892521
    Number of ICP messages received:    0
    Number of ICP messages sent:    0
    Number of queued ICP replies:   0
    Request failure ratio:   0.03
    Average HTTP requests per minute since start:   45995.1
    Average ICP messages per minute since start:    0.0
    Select loop called: 11358493 times, 0.447 ms avg
Cache information for Squid:
    Request Hit Ratios: 5min: 0.0%, 60min: 0.0%
    Byte Hit Ratios:    5min: -3.2%, 60min: -2.6%
    Request Memory Hit Ratios:  5min: 0.0%, 60min: 0.0%
    Request Disk Hit Ratios:    5min: 0.0%, 60min: 0.0%
    Storage Swap size:  0 KB
    Storage Mem size:   179180 KB
    Mean Object Size:   0.00 KB
    Requests given to unlinkd:  0

Is there something I can try to fix this?

"commBind: Cannot bind socket FD 17164 family 2 to 0.0.0.0 port 0: (98) Address 
already in use"

I'm getting these around 232 times per second in cache.log

Original comment by robertpi...@gmail.com on 10 Nov 2010 at 9:10

GoogleCodeExporter commented 9 years ago

half_closed_clients off
?

Original comment by nuclear...@gmail.com on 15 Nov 2010 at 3:49

GoogleCodeExporter commented 9 years ago

I agree with nuclearcat regarding net.ipv4.tcp_tw_recycle=1 being harmful. In
my tproxy setup I got exactly that behaviour: stalling, connect failures, etc.

nuclearcat: Are you running a tproxy setup ?

Original comment by ro...@neology.co.za on 16 Nov 2010 at 12:01

GoogleCodeExporter commented 9 years ago

Er, never mind. If you're getting commBind errors, then likely not.

Original comment by ro...@neology.co.za on 16 Nov 2010 at 12:04

GoogleCodeExporter commented 9 years ago

what does mgr:curcounters show ?

Original comment by ro...@neology.co.za on 16 Nov 2010 at 12:10

GoogleCodeExporter commented 9 years ago

Just furthering research on the topic in the linux kernel: 

From: inet_bind() in http://lxr.linux.no/linux+v2.6.36/net/ipv4/af_inet.c#L511

EADDRINUSE will be returned if inet_csk_get_port() from
http://lxr.linux.no/linux+v2.6.36/net/ipv4/inet_connection_sock.c#L120
was unable to find a free port to bind() to from the bind bucket.

There are some checks in inet_csk_get_port() to see if sk->sk_reuse has been
set, which is controlled with setsockopt(SO_REUSEADDR), so this request may
actually have some merit.
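
The effect of sk_reuse on bind() can be checked from user space without touching Lusca. The sketch below binds two non-listening TCP sockets to the same loopback address and port, with or without SO_REUSEADDR on both. The helper names bound_socket and try_double_bind are hypothetical, and the "both succeed" outcome with the option set is Linux behaviour; other systems may differ.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP socket, optionally set SO_REUSEADDR, and bind it to
 * 127.0.0.1:port (host byte order; 0 means the kernel picks a port).
 * Returns the descriptor, or -1 with errno set by the failing call. */
static int bound_socket(int reuse, unsigned short port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    int saved;
    if (reuse) {
        int on = 1;
        if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0)
            goto fail;
    }

    struct sockaddr_in a;
    memset(&a, 0, sizeof(a));
    a.sin_family = AF_INET;
    a.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    a.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&a, sizeof(a)) < 0)
        goto fail;
    return fd;

fail:
    saved = errno;  /* close() may overwrite errno */
    close(fd);
    errno = saved;
    return -1;
}

/* Bind two non-listening sockets to the same address and port.
 * Returns 0 if the second bind() succeeds, otherwise its errno. */
int try_double_bind(int reuse)
{
    int first = bound_socket(reuse, 0);
    if (first < 0)
        return errno;

    struct sockaddr_in got;
    socklen_t len = sizeof(got);
    if (getsockname(first, (struct sockaddr *)&got, &len) < 0) {
        int saved = errno;
        close(first);
        return saved;
    }

    /* Second socket asks for exactly the port the first one received. */
    int second = bound_socket(reuse, ntohs(got.sin_port));
    int result = (second < 0) ? errno : 0;
    if (second >= 0)
        close(second);
    close(first);
    return result;
}
```

Without the option, try_double_bind(0) should report EADDRINUSE, the same error as in the commBind log line. Note that this only exercises bind(); it says nothing about what happens to the later connect().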

Can you try it with a patch against comm.c such as:

 } else if (! sqinet_is_noaddr(&F->local_address)) {
+       commSetReuseAddr(new_socket);
        if (commBind(new_socket, &F->local_address) != COMM_OK) {
            comm_close(new_socket);
            return -1;
        }
    }
    F->local_port = sqinet_get_port(a);

Original comment by ro...@neology.co.za on 16 Nov 2010 at 12:35

GoogleCodeExporter commented 9 years ago

OK, having played with the above patch, it didn't really make a major 
difference when apachebenching either a transparent, or tproxied lusca.

What did help immensely was the following:
net.ipv4.tcp_max_orphans = 8192
net.ipv4.tcp_orphan_retries = 1

Original comment by roelf.di...@gmail.com on 18 Nov 2010 at 9:14
