intel-cloud / cosbench

a benchmark tool for cloud object storage service

Controller doesn't close socket #314

Closed dburnazyan closed 8 years ago

dburnazyan commented 8 years ago

The controller doesn't close the socket in the PingDriverRunner class (perhaps), so after a while too many CLOSE_WAIT connections appear and the controller stops working because it can't open new sockets.

root@cosbench:~# status cosbench-controller 
cosbench-controller start/running, process 22856
root@cosbench:~# netstat -tuapn | grep CLOSE_WAIT | grep 22856/java | wc -l
28233

PingDriverRunner

...
    private void pingDrivers(DriverInfo[] driverInfos) {
        for (DriverInfo driver : driverInfos) {
            boolean isAlive = false;

            String ipAddress = getIpAddres(driver.getUrl());
            Integer port = getDriverPort(driver.getUrl());
            try {
                if (!ipAddress.isEmpty()) {
                    try {
                        Socket socket = new Socket();
                        InetSocketAddress reAddress = new InetSocketAddress(ipAddress, port);
                        InetSocketAddress locAddress = new InetSocketAddress("0.0.0.0", 0);
                        socket.bind(locAddress);
                        socket.connect(reAddress, 3000);
                        isAlive = true;
                    } catch (Exception e) {
                        isAlive = false;
                    }
                }
            } finally {
                driver.setAliveState(isAlive);
            }
        }
    }
...

Do we need to call socket.close() after socket.connect()?
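
A minimal sketch of one way to close the probe socket, assuming the check only needs to know whether the TCP connect succeeds. CLOSE_WAIT means the remote side has closed the connection while the local process still holds its socket open, so closing the socket after the connect attempt should keep these entries from piling up. The class `PingSketch` and the method `isReachable` are hypothetical names used for illustration; they are not part of COSBench, and this is not necessarily how the project fixed it.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper for illustration only; not COSBench code.
final class PingSketch {

    // Returns true if a TCP connection to ipAddress:port can be opened
    // within timeoutMillis, false if the attempt fails with an I/O error.
    static boolean isReachable(String ipAddress, int port, int timeoutMillis) {
        // try-with-resources calls socket.close() on every path,
        // whether connect() succeeds, times out, or is refused.
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(ipAddress, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

In the loop above, this would amount to something like `isAlive = PingSketch.isReachable(ipAddress, port, 3000);`. The explicit bind to `0.0.0.0:0` is left out here on the assumption that the default local address and an ephemeral port are acceptable.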

dburnazyan commented 8 years ago

After restarting the controller, the number of connections in the CLOSE_WAIT state starts growing again:

root@cosbench:~# restart cosbench-controller 
cosbench-controller start/running, process 26232
root@cosbench:~# netstat -tuapn | grep CLOSE_WAIT  | wc -l
0
root@cosbench:~# netstat -tuapn | grep CLOSE_WAIT  | wc -l
18
root@cosbench:~# netstat -tuapn | grep CLOSE_WAIT  | wc -l
18
root@cosbench:~# netstat -tuapn | grep CLOSE_WAIT  | wc -l
21
root@cosbench:~# netstat -tuapn | grep CLOSE_WAIT  | wc -l
21
root@cosbench:~# netstat -tuapn | grep CLOSE_WAIT  | wc -l
147

dburnazyan commented 8 years ago

This commit fixes it for me: https://github.com/intel-cloud/cosbench/pull/315/commits/1f1ca1441f822466c12196ab0be5d5cec7ceddf9

dburnazyan commented 8 years ago

Before:

root@cosbench:~# date
Wed May 11 06:48:31 EDT 2016
root@cosbench:~# netstat -tuapn | grep CLOSE_WAIT  | wc -l
39
root@cosbench:~# date
Wed May 11 06:50:17 EDT 2016
root@cosbench:~# netstat -tuapn | grep CLOSE_WAIT  | wc -l
102

After:

root@cosbench:~# date
Wed May 11 06:51:46 EDT 2016
root@cosbench:~# netstat -tuapn | grep CLOSE_WAIT  | wc -l
0
root@cosbench:~# date
Wed May 11 06:55:01 EDT 2016
root@cosbench:~# netstat -tuapn | grep CLOSE_WAIT  | wc -l
0