hierynomus / smbj

Server Message Block (SMB2, SMB3) implementation in Java

Handling of referral timeout in DFSPathResolver (Could not find referral cache entry) #342

Open jborza opened 6 years ago

jborza commented 6 years ago

Hi,

I've been getting this exception when accessing DFS shares:

java.lang.IllegalStateException: Could not find referral cache entry for DFSPath{[server.domain.com, dfs]}
    at com.hierynomus.smbj.paths.DFSPathResolver.step9(DFSPathResolver.java:291)
    at com.hierynomus.smbj.paths.DFSPathResolver.step2(DFSPathResolver.java:127)
    at com.hierynomus.smbj.paths.DFSPathResolver.step1(DFSPathResolver.java:107)
    at com.hierynomus.smbj.paths.DFSPathResolver.resolve(DFSPathResolver.java:94)
    at com.hierynomus.smbj.paths.DFSPathResolver.resolve(DFSPathResolver.java:75)
    at com.github.jborza.camel.component.smbj.dfs.DfsResolver.resolve(DfsResolver.java:31)

It's not really easy to reproduce, but it usually occurs in two scenarios:

  1. multiple threads were accessing the share at the same time, and this happened seemingly at random after some time
  2. a single thread was accessing the share many times (recursive listing of folders), and this eventually happened as well

According to the spec ( https://msdn.microsoft.com/en-us/library/cc227028.aspx ), step9 should occur only if step2 finds that the item is expired. Step2 actually checks whether state.path is expired:

        ReferralCache.ReferralCacheEntry lookup = referralCache.lookup(state.path);
        ...
        if (lookup.isExpired()) { // Expired LINK target
            return step9(session, state, lookup); // Resolve Link Referral
        }

and step9 looks up just the root:

        DFSPath rootPath = new DFSPath(state.path.getPathComponents().subList(0, 2));
        ReferralCache.ReferralCacheEntry rootReferralCacheEntry = referralCache.lookup(rootPath);
        if (rootReferralCacheEntry == null) {
            throw new IllegalStateException("Could not find referral cache entry for " + rootPath);
        }

Technically when the lookup exists in the cache, the root should exist as well, as lookup is something like server.domain.com/dfs/folder/subfolder
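To make the suspected failure mode concrete, here is a minimal sketch under a loud assumption: the cache is modelled as a prefix tree in which an entry lives only on the node where it was stored. `ToyReferralTree`, its `Node` class, and the `put`/`lookup` methods are hypothetical stand-ins, not smbj's actual `ReferralCache`. In this model, storing only the link path leaves the two-component root node present but without an entry.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy cache for illustration only; not smbj's ReferralCache.
final class ToyReferralTree {
    // One node per path component; 'entry' stays null unless something was stored exactly here.
    private static final class Node {
        final Map<String, Node> children = new HashMap<>();
        String entry;
    }

    private final Node root = new Node();

    // Stores 'target' on the node reached by walking (and creating) the given components.
    void put(List<String> components, String target) {
        Node node = root;
        for (String c : components) {
            node = node.children.computeIfAbsent(c.toLowerCase(), k -> new Node());
        }
        node.entry = target;
    }

    // Returns the entry of the deepest node matching the path, which may be null.
    String lookup(List<String> components) {
        Node node = root;
        for (String c : components) {
            Node child = node.children.get(c.toLowerCase());
            if (child == null) {
                break;
            }
            node = child;
        }
        return node.entry;
    }

    public static void main(String[] args) {
        ToyReferralTree cache = new ToyReferralTree();
        List<String> link = Arrays.asList("server.domain.com", "dfs", "folder");
        cache.put(link, "\\\\fileserver\\share");

        System.out.println("link entry: " + cache.lookup(link));
        // The root node exists as an intermediate node but holds no entry.
        System.out.println("root entry: " + cache.lookup(link.subList(0, 2)));
    }
}
```

In this toy model the second lookup yields null, which is the condition that would make a step9-style check throw.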

Question 1: Is smbj SMBClient intended to be used from a single thread? At least the DFS cache bits seem thread-safe due to the use of volatile entry and AtomicReferenceFieldUpdater.

Question 2: Is DiskShare intended to be used for just a single operation? Or should I be able to do things like recursive directory synchronization (lots of .list, .folderExists, .fileExists) from a single DiskShare from a single Connection / Session?

If the answer to Q2 is yes, are you aware of any reason why this would blow up?

jborza commented 6 years ago

More information after debugging: the failure seems to occur whenever the DFS root entry expires in the cache and the code goes through step 9 and lookup(rootPath).

Then the state.path from step 2 points to the complete folder server.domain.com/dfs/folder/subfolder, but state.path.getPathComponents().subList(0, 2) is just server.domain.com/dfs, which, in the cache, has a null entry. Only the end

I still suspect that we are supposed to store the root referral cache entry (first two path components, server.domain.com/dfs) so it can be retrieved later at step 9,

as the DFS spec says: "Find the root ReferralCache entry corresponding to the first two path components, noting that this will already be in the cache due to processing that resulted in acquiring the expired link ReferralCache entry."

I'll try to read the DFS spec some more to see when this root referral cache entry should be saved: either at step 6, which sends a root request, or somewhere else. Section 3.1.5.3.3 ( https://msdn.microsoft.com/en-us/library/cc227036.aspx ) may cover this.

hierynomus commented 6 years ago

Hi @jborza,

To answer the first question, yes it is supposed to be threadsafe for most of the API parts that make sense (at least the connection, session, transport, share, file/directory). So any bug you run into there should be something that we need to handle.

The problem with DFS is that the spec is slightly incomprehensible (the MS-SMB2 spec is waaayyy better). And I don't have a good/decent DFS test setup at my disposal. I would love it if we had the ability to run Docker containers with Windows with SMB/DFS on them; unfortunately that's still not possible as far as I'm aware.

jborza commented 6 years ago

Hi @hierynomus , I'm mostly testing on real infrastructure, I also have an Azure cluster with 3 servers and working DFS.

It's also possible to test on Samba, in theory, as it supports DFS, but I suspect the majority of use cases would be with Windows DFS, so this would probably make little sense.

I haven't succeeded with Docker containers. I gave up a couple of months ago, as I wasn't able to install SMB onto a Windows Server container, and that would be where I'd test it.

jborza commented 6 years ago

I did read the relevant parts of the spec (3.1.4.1 steps 5, 6 and 9, and 3.1.5.3.3) back to front and don't see anything clear about when to cache the root entry that step 9 references.

I checked other implementations: jcifs maintains a root entry cache, and cifs in Linux doesn't seem to use a cache at all.

I just assume at this point that if the expiry is to work at all, the root entry should be stored in DFSPathResolver.handleRootOrLinkReferralResponse().

I'll try to test it against a DFS setup and submit a PR.
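The assumption above can be sketched roughly as follows. This is a toy model, not smbj code: `RootCachingSketch`, `handleReferral`, and `findRootForExpiredLink` are hypothetical names, and the cache is a plain map keyed by path components. It only shows that once the root referral is cached under its own two-component prefix, a step9-style lookup has something to find.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names echo the discussion, but this is not smbj code.
final class RootCachingSketch {
    private final Map<List<String>, String> referralCache = new HashMap<>();

    // Stand-in for handling a root or link referral response:
    // the entry is cached under the path prefix it was returned for.
    void handleReferral(List<String> pathPrefix, String target) {
        referralCache.put(new ArrayList<>(pathPrefix), target);
    }

    // Stand-in for step 9: refreshing an expired link needs the root entry,
    // i.e. the entry for the first two path components.
    String findRootForExpiredLink(List<String> linkPath) {
        List<String> rootPath = new ArrayList<>(linkPath.subList(0, 2));
        String rootEntry = referralCache.get(rootPath);
        if (rootEntry == null) {
            throw new IllegalStateException("Could not find referral cache entry for " + rootPath);
        }
        return rootEntry;
    }

    public static void main(String[] args) {
        RootCachingSketch sketch = new RootCachingSketch();
        List<String> link = Arrays.asList("server.domain.com", "dfs", "folder");

        // Cache the root referral first, then the link referral.
        sketch.handleReferral(link.subList(0, 2), "root-target");
        sketch.handleReferral(link, "link-target");

        System.out.println(sketch.findRootForExpiredLink(link));
    }
}
```

If the first `handleReferral` call is left out, `findRootForExpiredLink` throws the same kind of `IllegalStateException` as in the report.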

hierynomus commented 6 years ago

Thanks! I've found a Samba DFS Docker container that I'll use to at least set up some integration tests against.

jborza commented 6 years ago

Note: I updated the title of the issue, as it has nothing to do with thread safety but with timeout handling, or rather with what has to happen after the referral so that the referral timeout is handled as specified in the spec.

hierynomus commented 6 years ago

You could now try to write an integration test against the Samba container for this scenario.

sahanjay commented 6 years ago

@jborza This scenario could happen if you use the same SMBClient for some time.

The first request sets up the ReferralCache and sets the expiry time, but the expires value is not updated on subsequent requests if you use the same SMBClient, even when a new connection or session has been created.

Code: this.expires = System.currentTimeMillis() + this.ttl * 1000L;

I have tested this by creating new connections and sessions, but that doesn't update the expires value, and the API doesn't provide access to this value either.

As a solution, I set up a client for each request and close it afterwards. :) I hope this helps you guys.

Note: if you use Spring, you need to inject the SMBClient as a prototype-scoped bean into the singleton bean :)
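The expiry arithmetic quoted in this comment can be illustrated with a small sketch. `ToyCacheEntry` is a hypothetical class: only the `expires = now + ttl * 1000L` computation comes from the quoted line, while the injected clock value and the `isExpired` comparison are assumptions made so the behaviour can be shown in isolation. The point is that `expires` is fixed when the entry is created, so later requests do not extend it.

```java
// Hypothetical toy entry; field names follow the quoted line, the rest is assumed.
final class ToyCacheEntry {
    private final long ttl;      // time to live, in seconds
    private final long expires;  // absolute expiry time, in milliseconds

    ToyCacheEntry(long ttlSeconds, long nowMillis) {
        this.ttl = ttlSeconds;
        // Same arithmetic as the quoted line, with the clock passed in instead of read directly.
        this.expires = nowMillis + this.ttl * 1000L;
    }

    // Assumed check: the entry counts as expired once the clock is past 'expires'.
    boolean isExpired(long nowMillis) {
        return nowMillis > expires;
    }

    public static void main(String[] args) {
        long created = System.currentTimeMillis();
        ToyCacheEntry entry = new ToyCacheEntry(300, created);

        // One second before and one second after the 300-second TTL elapses.
        System.out.println(entry.isExpired(created + 299_000L));
        System.out.println(entry.isExpired(created + 301_000L));
    }
}
```

Under these assumptions, reusing a long-lived client means every entry eventually crosses its fixed expiry, which is when the step 9 path would be taken.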

cliviu commented 5 years ago

Hi, I have the same problem with version 0.9.1; it has happened several times, non-deterministically.

Caused by: java.lang.IllegalStateException: Could not find referral cache entry for DFSPath{[....com, ....]}
    at com.hierynomus.smbj.paths.DFSPathResolver.step9(DFSPathResolver.java:304)
    at com.hierynomus.smbj.paths.DFSPathResolver.step2(DFSPathResolver.java:140)
    at com.hierynomus.smbj.paths.DFSPathResolver.step1(DFSPathResolver.java:120)
    at com.hierynomus.smbj.paths.DFSPathResolver.resolve(DFSPathResolver.java:107)
    at com.hierynomus.smbj.paths.DFSPathResolver.resolve(DFSPathResolver.java:95)
    at com.hierynomus.smbj.share.DiskShare.resolveAndCreateFile(DiskShare.java:77)
    at com.hierynomus.smbj.share.DiskShare.open(DiskShare.java:66)
    at com.hierynomus.smbj.share.DiskShare.exists(DiskShare.java:187)
    at com.hierynomus.smbj.share.DiskShare.folderExists(DiskShare.java:183)

andershermansen commented 5 years ago

Duplicate/same as #310 ?

Disablez commented 5 years ago

Please check whether https://github.com/hierynomus/smbj/pull/474 helps with this issue.