just4est / jmxtrans

Automatically exported from code.google.com/p/jmxtrans

Too many TCP connections in TIME_WAIT (tcp6) #6

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
On our Graphite server I see many TCP connections on port 2003.
This server (Ubuntu 10.10, 64-bit) hosts both Graphite and JMXTrans.

tcp        0      0 0.0.0.0:2003            0.0.0.0:*               LISTEN

normal listen port (ipv4)

tcp6       0      0 123.123.123.123:56228   123.123.123.123:2003    TIME_WAIT
tcp6       0      0 123.123.123.123:56231   123.123.123.123:2003    TIME_WAIT
tcp6       0      0 123.123.123.123:56296   123.123.123.123:2003    TIME_WAIT

About 219 connections in TIME_WAIT (ipv6).

I can't locate them with lsof on my system, and they disturb the whole system 
somewhat because we run short of ports that can be opened.

I forced JMXTrans to stick with IPv4 by adding -Djava.net.preferIPv4Stack=true 
to the java command line, and it fixed the problem.

This should be added to the docs or README.
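
For anyone who wants to confirm that the flag actually reached the JVM, here is a minimal sketch. The class name Ipv4Check is hypothetical and not part of jmxtrans; java.net.preferIPv4Stack is the JDK networking property named in the report.

```java
class Ipv4Check {
    /** Returns true when the property value asks the JVM to prefer the IPv4 stack. */
    static boolean prefersIpv4(String propertyValue) {
        // Boolean.parseBoolean treats null (property not set) as false.
        return Boolean.parseBoolean(propertyValue);
    }

    public static void main(String[] args) {
        String value = System.getProperty("java.net.preferIPv4Stack");
        System.out.println("java.net.preferIPv4Stack=" + value
                + " (IPv4 preferred: " + prefersIpv4(value) + ")");
    }
}
```

It could be run as `java -Djava.net.preferIPv4Stack=true Ipv4Check`. Passing the property on the command line, as done in the report, is the safer route, since the JDK is generally understood to read it when the networking classes are first initialised.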

Original issue reported on code.google.com by henri.gomez on 27 May 2011 at 5:59

GoogleCodeExporter commented 8 years ago
Ok, I just committed a fix. Could you please try it?

Original comment by latch...@gmail.com on 27 May 2011 at 4:05

GoogleCodeExporter commented 8 years ago
I tried the fix, but no luck.
After I killed jmxtrans, there were still TIME_WAIT connections (322 exactly) 
for about one minute.

With -Djava.net.preferIPv4Stack=true, the TIME_WAIT connections are on tcp (no 
more tcp6).
In the dialog with GraphiteWriter, is there some sort of end-of-dialog message 
that should be sent?

Note that I'm using JMXTrans with -e -s 60 (continuous run, 60s interval).

If I run JMXTrans in one-shot mode (i.e. without -e -s 60), I get about 272 
connections in TIME_WAIT.

Original comment by henri.gomez on 27 May 2011 at 4:36

GoogleCodeExporter commented 8 years ago
More on this.

When I start JMXTrans from another machine, I don't get these.
So it's something related to JMXTrans and Graphite (Carbon), when hosted on the 
same box.

Original comment by henri.gomez on 27 May 2011 at 4:44

GoogleCodeExporter commented 8 years ago
Graphite doesn't have a dialog. You open a socket, send some data, close the 
socket. That is exactly what I'm doing. It seems something is off with the 
configuration of your box or something. If you have an option that works for 
you with the -D, then I'd say use it. =)
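
The open, send, close cycle described above can be sketched roughly as follows. This is not the GraphiteWriter source; the class name GraphiteOneShot, the host, and the metric name are hypothetical, and the line format assumed is Carbon's plaintext protocol (`<path> <value> <timestamp>` followed by a newline).

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class GraphiteOneShot {
    /** Formats one plaintext-protocol line: "<path> <value> <timestamp>\n". */
    static String formatLine(String path, double value, long epochSeconds) {
        return path + " " + value + " " + epochSeconds + "\n";
    }

    /** Opens a socket, writes a single metric line, and closes the socket. */
    static void send(String host, int port, String path, double value, long epochSeconds)
            throws IOException {
        try (Socket socket = new Socket(host, port);
             OutputStream out = socket.getOutputStream()) {
            out.write(formatLine(path, value, epochSeconds).getBytes(StandardCharsets.UTF_8));
            out.flush();
        }
        // try-with-resources has closed the socket here. If this side closed
        // first, the kernel keeps the connection in TIME_WAIT for a while.
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical host and metric; 2003 is the Carbon port from the report.
        long now = System.currentTimeMillis() / 1000L;
        send("localhost", 2003, "example.jvm.heap_used", 42.0, now);
    }
}
```

With one connection per write, every write leaves one short-lived entry behind. A socket in TIME_WAIT is held by the kernel and no longer belongs to a process, which would explain why lsof does not show it and why the entries outlive jmxtrans.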

Looking on a box I'm running with Ubuntu 10.04, I see a lot of TIME_WAIT's, but 
it appears to be between jmxtrans and the JMX ports on remote servers. 

tcp6       0      0 10.0.5.42%3510517:35740 app03-int:58363         TIME_WAIT  
tcp6       0      0 10.0.5.42%3510517:42101 app11-int:60187         TIME_WAIT  
tcp6       0      0 10.0.5.42%3510517:58072 app12-int:1101          TIME_WAIT  
tcp6       0      0 10.0.5.42%3510517:37927 olp02-int:35339         TIME_WAIT  

I'll have a look at that code again to make sure things are getting closed 
properly there as well.

Original comment by latch...@gmail.com on 27 May 2011 at 9:13

GoogleCodeExporter commented 8 years ago
Maybe the problem is on the Carbon side.
When I stop JMXTrans, I still see the connections via netstat, so they are 
kept by the server side.

Also, I don't understand why I don't get this behaviour when JMXTrans is on 
another box.

Original comment by henri.gomez on 28 May 2011 at 6:39

GoogleCodeExporter commented 8 years ago
Not sure there is much more I can do here. If you find anything else that might 
help, I'd be happy to make code changes around this.

Original comment by latch...@gmail.com on 31 May 2011 at 10:00

GoogleCodeExporter commented 8 years ago
Added socket pooling.
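
The pooling code itself is not shown in this thread. As a rough illustration of the idea only, the sketch below reuses a single socket across writes instead of opening one per metric. The class ReusableGraphiteConnection is hypothetical, is not the jmxtrans implementation, and is a one-socket simplification of a real pool.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class ReusableGraphiteConnection implements Closeable {
    private final String host;
    private final int port;
    private Socket socket;     // opened lazily on first write, then reused
    private int connectCount;  // number of sockets actually opened

    ReusableGraphiteConnection(String host, int port) {
        this.host = host;
        this.port = port;
    }

    /** Writes one plaintext-protocol line, connecting only when no open socket exists. */
    synchronized void write(String path, double value, long epochSeconds) throws IOException {
        if (socket == null || socket.isClosed()) {
            socket = new Socket(host, port);
            connectCount++;
        }
        try {
            OutputStream out = socket.getOutputStream();
            String line = path + " " + value + " " + epochSeconds + "\n";
            out.write(line.getBytes(StandardCharsets.UTF_8));
            out.flush();
        } catch (IOException e) {
            close(); // drop the broken socket so the next write reconnects
            throw e;
        }
    }

    synchronized int connectCount() {
        return connectCount;
    }

    @Override
    public synchronized void close() {
        if (socket != null) {
            try {
                socket.close();
            } catch (IOException ignored) {
                // nothing useful to do if close itself fails
            }
            socket = null;
        }
    }
}
```

Because the socket stays open between writes, far fewer connections are closed, so far fewer end up in TIME_WAIT.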

Original comment by latch...@gmail.com on 24 Jun 2011 at 6:29

GoogleCodeExporter commented 8 years ago
I just noticed that the JmxConnections need to be pooled too... those are 
sitting in TIME_WAIT as well... I'll get to that soon.

Original comment by latch...@gmail.com on 24 Jun 2011 at 6:33