madhuneal / lusca-cache

Automatically exported from code.google.com/p/lusca-cache

TPROXY related bind() failures #34

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
After patching Lusca to fix the bind related reporting, I'm seeing this:

2009/06/30 05:27:50| commBind: Cannot bind socket FD 1239 family 2 to 10.21.254.225 port 0: (49) Can't assign requested address
2009/06/30 05:27:50| commResetFD: bind: (49) Can't assign requested address

The first commBind() fails for whatever reason. commResetFD() then fails
for the same reason.

This may stop some outbound connections from ever occurring.

Original issue reported on code.google.com by adrian.c...@gmail.com on 30 Jun 2009 at 5:00
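
For context on the log lines above: on BSD-derived systems errno 49 is EADDRNOTAVAIL, which bind() returns when the requested source address is not configured on the host and nothing permits a non-local bind. A minimal C sketch of that condition follows. try_bind() is a hypothetical helper, not Lusca code, and the addresses in the usage note are only illustrative.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Try to bind a fresh TCP socket to `ip` with port 0, as in the log above.
 * Returns 0 on success, or the errno value of the failing call.
 * try_bind() is a hypothetical helper, not part of Lusca.
 */
int try_bind(const char *ip)
{
    struct sockaddr_in sa;
    int fd, err = 0;

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;   /* reported as "family 2" in the log */
    sa.sin_port = htons(0);    /* port 0: let the kernel pick a port */
    if (inet_pton(AF_INET, ip, &sa.sin_addr) != 1)
        return EINVAL;         /* not a valid IPv4 literal */

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return errno;
    if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0)
        err = errno;           /* e.g. EADDRNOTAVAIL for a non-local IP */
    close(fd);
    return err;
}
```

On a host that does not own 192.0.2.1 and has no non-local bind setting enabled, try_bind("192.0.2.1") would normally return EADDRNOTAVAIL, while try_bind("0.0.0.0") returns 0.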

GoogleCodeExporter commented 9 years ago
This is all part of the socket connection code, beginning with
commConnectStart().

commResetFD() is simply retrying the connection with the same socket
configuration. This will almost certainly fail a second time for the same
reason(s) the first attempt failed. (I'm still not sure what that may be, e.g.
port exhaustion? Not setting the correct tproxy-related flags on the socket
before setting a non-local IP for the request?)

In any case, this needs to be investigated further.

Original comment by adrian.c...@gmail.com on 30 Jun 2009 at 5:05

GoogleCodeExporter commented 9 years ago
Hi Adrian,

Please merge my ticket into this one.

Original comment by Fraw...@gmail.com on 6 Jul 2009 at 12:27

GoogleCodeExporter commented 9 years ago
The interesting question is why a socket is being created with a non-local
address but without the COMM_TPROXY_REM flag being set.

Seeing commBind() errors means that the socket creation path isn't going via
the tproxy module(s) for whatever reason.

Original comment by adrian.c...@gmail.com on 6 Jul 2009 at 3:26
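
To illustrate why the flag matters: the kernel generally refuses to bind a socket to a non-local address unless a transparent-proxy socket option has been set first. The sketch below assumes that option is IP_TRANSPARENT on Linux or IP_BINDANY on FreeBSD. How COMM_TPROXY_REM maps onto either of them inside Lusca is not stated in this thread, and enable_nonlocal_bind() is a hypothetical name.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Ask the kernel to allow binding `fd` to a non-local IPv4 address.
 * Must be called before bind(). Returns 0 on success, -1 with errno set
 * on failure (ENOPROTOOPT if neither option exists at build time).
 * enable_nonlocal_bind() is a hypothetical name, not a Lusca function.
 */
int enable_nonlocal_bind(int fd)
{
    int on = 1;
#if defined(IP_TRANSPARENT)    /* Linux */
    return setsockopt(fd, IPPROTO_IP, IP_TRANSPARENT, &on, sizeof(on));
#elif defined(IP_BINDANY)      /* FreeBSD */
    return setsockopt(fd, IPPROTO_IP, IP_BINDANY, &on, sizeof(on));
#else
    (void)fd;
    (void)on;
    errno = ENOPROTOOPT;
    return -1;
#endif
}
```

Both options normally require elevated privileges, so an unprivileged process should expect this call to fail with EPERM.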

GoogleCodeExporter commented 9 years ago
The source of the problem here is commResetFD(). It is manually grovelling
around with FDs rather than creating a new one using comm_open().

Original comment by adrian.c...@gmail.com on 6 Jul 2009 at 3:29
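
The general idea of routing a retry through the normal socket-creation path can be sketched as follows. This is not Lusca's code: open_outbound() and reset_outbound() are hypothetical stand-ins for comm_open() and commResetFD(), and the sketch assumes the retry wants to keep the same descriptor number.

```c
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* The single place where outbound sockets are created and configured. */
int open_outbound(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    int flags;

    if (fd < 0)
        return -1;
    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fd);
        return -1;
    }
    /* Transparent-proxy options and bind() would also be applied here. */
    return fd;
}

/*
 * Replace the socket behind `fd` with a freshly opened one while keeping
 * the descriptor number. `fd` must be an open descriptor.
 * Returns 0 on success, -1 on failure.
 */
int reset_outbound(int fd)
{
    int fresh = open_outbound();

    if (fresh < 0)
        return -1;
    if (dup2(fresh, fd) < 0) {   /* fd now refers to the new socket */
        close(fresh);
        return -1;
    }
    close(fresh);
    return 0;
}
```

Because the retry reuses open_outbound(), any option applied at creation time is applied again on the replacement socket, instead of being re-implemented by hand in the reset path.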

GoogleCodeExporter commented 9 years ago

Original comment by adrian.c...@gmail.com on 6 Jul 2009 at 3:29

GoogleCodeExporter commented 9 years ago
I have been using TProxy4 with lusca_head so far and have never seen this
issue.

Anyhow, I shall keep things running until something happens.

So far everything is running smoothly in bridge mode.

Next Monday I am going to add a router mode with tproxy4.

Much regards

Original comment by degreane@gmail.com on 6 Jul 2009 at 7:32

GoogleCodeExporter commented 9 years ago
You'll only see this happen if a connect() to a remote host fails for some 
reason and Lusca retries the connection.

Original comment by adrian.c...@gmail.com on 6 Jul 2009 at 7:39

GoogleCodeExporter commented 9 years ago
Just committed r14138 in an attempt to patch over this until I've dedicated
some more time to properly tidying up the whole comm connect and restarting
code. It truly is a horrible, hacky mess.

Note that it will verbosely log whenever a tproxy'ed connection is retried. I'll
disable this in a future commit when I'm certain this is working correctly.

Original comment by adrian.c...@gmail.com on 6 Jul 2009 at 8:45

GoogleCodeExporter commented 9 years ago
Issue 38 has been merged into this issue.

Original comment by adrian.c...@gmail.com on 8 Jul 2009 at 8:12

GoogleCodeExporter commented 9 years ago
This patch fixes it in my environment.

Kris, does current LUSCA_HEAD fix the commBind/commResetFD issue for you?

Original comment by adrian.c...@gmail.com on 8 Jul 2009 at 8:13

GoogleCodeExporter commented 9 years ago
I've disabled the default logging of commResetFD() in r14282. The problem has
been resolved, if in a rather hacky way.

Original comment by adrian.c...@gmail.com on 13 Aug 2009 at 11:36