jborean93 / smbprotocol

Python SMBv2 and v3 Client
MIT License
323 stars 73 forks source link

[DFS server]: Infinite recursion due to lstat for a non-existent file #228

Closed karan-lb closed 1 year ago

karan-lb commented 1 year ago

Hi, we have a 3 node DFS SMB server that we are trying to traverse. The DFS server exposes 4 namespaces that are fully replicated over all nodes. This means that every node holds all the files that are present in the namespace. When we try to lstat a file which exists, the code is working well and we don't see any issue. The DFS referrals are not involved there as all files do exist on each node.

Issue: When we try to lstat a file path that does not exist on the DFS server the smbclient gets into an infinite recursion. The trace for which looks as follows: Notice the frame repeats till we interrupt the process. File "/usr/local/lib/python3.7/site-packages/smbclient/_io.py", line 335, in commit self.commit() [Previous line repeated 38 more times]

  File "download_file.py", line 35, in stat_dfs_file
    smbclient.lstat(path)
  File "/usr/local/lib/python3.7/site-packages/smbclient/_os.py", line 259, in lstat
    return stat(path, follow_symlinks=False, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/smbclient/_os.py", line 592, in stat
    query_info(transaction, FileAttributeTagInformation)
  File "/usr/local/lib/python3.7/site-packages/smbclient/_io.py", line 258, in __exit__
    self.commit()
  File "/usr/local/lib/python3.7/site-packages/smbclient/_io.py", line 335, in commit
    self.commit()
  File "/usr/local/lib/python3.7/site-packages/smbclient/_io.py", line 335, in commit
    self.commit()
  File "/usr/local/lib/python3.7/site-packages/smbclient/_io.py", line 335, in commit
    self.commit()
  [Previous line repeated 38 more times]
  File "/usr/local/lib/python3.7/site-packages/smbclient/_io.py", line 294, in commit
    res = func(requests[idx])
  File "/usr/local/lib/python3.7/site-packages/smbprotocol/open.py", line 1166, in _create_response
    response = self.connection.receive(request)
  File "/usr/local/lib/python3.7/site-packages/smbprotocol/connection.py", line 1008, in receive
    if not request.response_event.wait(timeout=iter_timeout):
  File "/usr/local/lib/python3.7/threading.py", line 552, in wait
    signaled = self._cond.wait(timeout)
  File "/usr/local/lib/python3.7/threading.py", line 296, in wait
    waiter.acquire()
KeyboardInterrupt

We notice that the referrals being returned by the server keep rotating between a couple of nodes and its gets into this infinite loop. Server A refers to Server B and then it refers back and so forth. Since the namespace does indeed exist on all three nodes that does make sense, but for deleted files there is no condition to make the recursion stop. (Ideally it should stop when we get to a node that we have already been referred before?)

Possible solution: We did notice that the ClientConfig does contain an option skip_dfs to avoid processing any referrals, but that option doesn't work for our case. We still see infinite recursion happening.

If the skip_dfs option worked, it would solve the crash for our case as any single node of the DFS server is fully capable to serve all files in the server.

So, we notice in the method, we can return early if the tree connect is not a DFS share. So we plan to toggle the is_dfs_share capability returned by the server to False when ClientConfig has skip_dfs set to True by the client.

def _resolve_dfs(raw_io):
    if not raw_io.fd.tree_connect.is_dfs_share:
        return

We do the above by modifying the following code in _pool.py

def get_smb_tree(
    path, username=None, password=None, port=445, encrypt=None, connection_timeout=60, connection_cache=None
):
    client_config = ClientConfig()
...

    if not tree:
        tree = TreeConnect(session, share_path)
        use_dfs = not client_config.skip_dfs
        try:
            tree.connect(require_secure_negotiate=client_config.require_secure_negotiate, use_dfs=use_dfs)
        except BadNetworkName as err:
            # If the server doesn't mention it supports DFS then don't try to
            # resolve the DFS path.
            if not session.connection.server_capabilities.has_flag(Capabilities.SMB2_GLOBAL_CAP_DFS) or not use_dfs:
                raise

and in tree.py :

class TreeConnect(object):
...
    def connect(self, require_secure_negotiate=True, use_dfs=True):
...
        capabilities = tree_response["capabilities"]
        self.is_dfs_share = capabilities.has_flag(ShareCapabilities.SMB2_SHARE_CAP_DFS) and use_dfs

This does help us in honoring the skip_dfs field, but we wanted to know

  1. Whether this problem is known?
  2. What could a potential solution look like?
  3. Would it help to send a small patch for honoring skip_dfs field?
jborean93 commented 1 year ago

Sorry I'm not ignoring this, just haven't had time to look into it. Unfortunately DFS is a complex topic so might take some time to try and replicate and figure out where the problem is.

karan-lb commented 1 year ago

@jborean93 please let me know if you need any details about the setup. It was easy to repro this in-house using a 2 node setup with a fully replicated DFS namespace across both nodes.

jborean93 commented 1 year ago

Sorry for the delay getting to this issue, the PR https://github.com/jborean93/smbprotocol/pull/239 fixes the recursion problem by ensuring it won't retry a target that has already been tried. I was able to replicate the problem in my environment and I can confirm the changes in that PR ensure the correct exception is raised.