hierynomus / smbj

Server Message Block (SMB2, SMB3) implementation in Java

Stale connections in SMBClient.connectionTable #796

Open turcsanyip opened 11 months ago

turcsanyip commented 11 months ago

SMBClient caches established connections in connectionTable. Before returning a cached connection, it checks whether the connection is still alive by calling Connection.isConnected(). This method relies on the underlying Socket, but in Java the Socket may not be aware that the other end has disconnected (e.g. after a server restart). Socket.isConnected() and Socket.isClosed() describe the Socket lifecycle as controlled from the local code, which does not necessarily reflect network events. Even if these methods report "connected", the actual state can be "broken pipe", which is only revealed when the client tries to write something on the wire.
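To illustrate the Socket behaviour described above, here is a minimal sketch using only the JDK on the loopback interface. It is not smbj code; the class name StaleSocketDemo and the 200 ms pause are arbitrary choices for the demo. The peer closes its end, yet the client-side Socket still reports itself as connected and not closed.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class, not part of smbj.
class StaleSocketDemo {

    /** Returns {isConnected, isClosed} as seen by the client after the peer has closed. */
    static boolean[] stateAfterPeerClose() throws IOException, InterruptedException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort())) {
            // Accept the connection and immediately close the server side,
            // simulating a server that went away.
            Socket peer = server.accept();
            peer.close();

            // Give the TCP stack a moment to deliver the close to the client side.
            Thread.sleep(200);

            // Both values only track what the local code did with the socket.
            return new boolean[] { client.isConnected(), client.isClosed() };
        }
    }

    public static void main(String[] args) throws Exception {
        boolean[] state = stateAfterPeerClose();
        System.out.println("isConnected=" + state[0] + ", isClosed=" + state[1]);
    }
}
```

The client never closed its own socket, so isConnected() stays true and isClosed() stays false even though the peer is gone. Only an actual read or write would surface the problem.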

So SMBClient can return a stale/dead connection from the cache. This is not an issue when a simple SMB share is used, because the first step is to open a Session on the connection, and that fails immediately. In this case the Connection can be closed and reopened from the application code. With DFS, however, the smbj library opens and maintains connections under the hood, and the application only knows about the connection to the namespace server. This "main" connection can be fine while other connections to the data servers are already dead.

I think it would make sense to extend the connection check with a "wire test", and the SMB Echo message seems like a good fit for this purpose.

What do you think about using something like this?

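    public boolean isConnected() {
        // Cheap local check first: the transport was closed from this side.
        if (!transport.isConnected()) {
            return false;
        }

        try {
            // Wire test: an SMB2 Echo round trip fails if the peer is gone.
            send(new SMB2Echo(getNegotiatedProtocol().getDialect())).get(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            // Preserve the interrupt status for the caller.
            Thread.currentThread().interrupt();
            return false;
        } catch (Exception e) {
            // Send failure or timeout: treat the connection as dead.
            return false;
        }

        return true;
    }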
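For context, this is roughly how I imagine such a check fitting into the cache lookup. It is only a generic sketch, not smbj's actual connectionTable code: ProbedCache, connector and probe are hypothetical names, and the probe stands in for a check like the isConnected() above. A cached entry that fails the probe is evicted and replaced, so only the dead connection is affected.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical helper, not part of smbj: a cache that probes entries before reuse.
class ProbedCache<K, C> {
    private final Map<K, C> table = new HashMap<>();
    private final Function<K, C> connector; // opens a new connection for a key
    private final Predicate<C> probe;       // liveness check, e.g. an echo round trip

    ProbedCache(Function<K, C> connector, Predicate<C> probe) {
        this.connector = connector;
        this.probe = probe;
    }

    /** Returns a cached connection if it passes the probe, otherwise a fresh one. */
    synchronized C get(K key) {
        C cached = table.get(key);
        if (cached != null && probe.test(cached)) {
            return cached;
        }
        // Missing or stale: drop the old entry and reconnect.
        // (Closing the stale connection's resources is omitted in this sketch.)
        table.remove(key);
        C fresh = connector.apply(key);
        table.put(key, fresh);
        return fresh;
    }
}
```

The trade-off is that every lookup pays for one probe, which for an echo means a network round trip (bounded by the timeout when the peer is unreachable).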

The application-side workaround would be to close the whole SMBClient (with all connections in the cache) if a failure happens, because the application code cannot identify or access the problematic background connection. This is overkill: all connections would be closed and potentially live sessions would be aborted.