dCache / nfs4j

Pure Java NFSv3 and NFSv4.2 implementation

Directory listing always traverses all elements #61

Closed: aasenov closed this issue 6 years ago

aasenov commented 6 years ago

I'm using nfs4j 17.3, but as far as I can see the issue is present in the latest version as well. The problem is in DirectoryStream initialization:

  1. If I use the constructor that takes a Collection, it is copied into a TreeSet, whose constructor calls addAll and therefore traverses all elements.
  2. If I use the constructor that takes a NavigableSet, the collection is not traversed during initialization, but it is traversed from the PseudoFS list() method, where the set is converted to a collection and passed to a new DirectoryStream instance, which copies it into a TreeSet (the first option again). So no matter which constructor I use, it always comes down to a Collection-to-Set conversion that traverses all elements. This is a problem: with a folder of 10k items I can't return entries on demand, because all of them are loaded into memory for every listing.

Even if I use VFSCache, every listing still processes all elements from the cookie start point to the end.
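
To make the cost concrete, here is a minimal self-contained sketch (hypothetical code, not nfs4j's DirectoryStream) showing that TreeSet's copy constructor drains its source eagerly:

    import java.util.AbstractCollection;
    import java.util.Iterator;
    import java.util.TreeSet;
    import java.util.concurrent.atomic.AtomicInteger;

    // Hypothetical lazy collection that counts how many elements are pulled from it.
    class CountingCollection extends AbstractCollection<Integer> {
        final AtomicInteger pulled = new AtomicInteger();
        final int size;

        CountingCollection(int size) {
            this.size = size;
        }

        @Override
        public int size() {
            return size;
        }

        @Override
        public Iterator<Integer> iterator() {
            return new Iterator<Integer>() {
                private int next = 0;

                @Override
                public boolean hasNext() {
                    return next < size;
                }

                @Override
                public Integer next() {
                    pulled.incrementAndGet();
                    return next++;
                }
            };
        }
    }

    public class TreeSetCopyDemo {
        public static void main(String[] args) {
            CountingCollection dir = new CountingCollection(10_000);
            // TreeSet's copy constructor calls addAll(), which drains the whole
            // source collection before a single entry can be served on demand.
            new TreeSet<>(dir);
            System.out.println("entries materialized eagerly: " + dir.pulled); // prints 10000
        }
    }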

kofemann commented 6 years ago

Good catch! Indeed, even a NavigableSet is turned into a collection. This is obviously not the intended behavior.

kofemann commented 6 years ago

@aasenov check with my test branch:

https://github.com/kofemann/nfs4j/tree/no-list-copy-on-transform

aasenov commented 6 years ago

Yes, it seems OK. I've tried it in my environment and it works as expected. I had some doubts about the stream and spliterator implementation, since it uses some buffering, but it was fine. One thing to mention: after calling transform(), elements retrieved through tail() won't have the mapping function applied, because it is applied only when entries are retrieved through iterator(); in tail(long fromCookie) they are accessed directly from the collection. I don't think that is a problem right now, since this method is used only from VfsCache and the mapping function should be applied afterwards, but please double-check it.

I still wonder whether it wouldn't be simpler, instead of converting to a stream, mapping, and converting back to an iterator, to just implement the Iterator interface so that next() applies the function before returning the element, as this is actually what you want the method to do:

    public Iterator<DirectoryEntry> iterator() {
        return new Iterator<DirectoryEntry>() {
            // Iterate the backing set lazily and map entries on the way out.
            private final Iterator<DirectoryEntry> mInner = entries.iterator();

            @Override
            public DirectoryEntry next() {
                // Apply the transformation only when the entry is consumed.
                return function.apply(mInner.next());
            }

            @Override
            public boolean hasNext() {
                return mInner.hasNext();
            }

            @Override
            public void remove() {
                mInner.remove();
            }
        };
    }
kofemann commented 6 years ago

One thing to mention: after calling transform(), elements retrieved through tail() won't have the mapping function applied, because it is applied only when entries are retrieved through iterator(); in tail(long fromCookie) they are accessed directly from the collection. I don't think that is a problem right now, since this method is used only from VfsCache and the mapping function should be applied afterwards, but please double-check it.

👍

This has to be fixed as well.

kofemann commented 6 years ago

@aasenov I believe both issues are addressed

aasenov commented 6 years ago

The Iterator implementation is OK, but I don't see the point of using ForwardingNavigableSet: if you call tailSet(), it delegates the call and the result is DirectoryStream.this.entries.tailSet(), which won't apply the mapping function either.
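
To illustrate what I mean, a minimal sketch of the pitfall (class and field names are made up, this is not the nfs4j code): overriding iterator() on Guava's ForwardingNavigableSet does not affect the views returned by tailSet(), because those calls forward straight to the delegate:

    import com.google.common.collect.ForwardingNavigableSet;
    import java.util.Iterator;
    import java.util.NavigableSet;
    import java.util.function.Function;

    class MappingSet<E> extends ForwardingNavigableSet<E> {
        private final NavigableSet<E> backing;
        private final Function<E, E> mapper;

        MappingSet(NavigableSet<E> backing, Function<E, E> mapper) {
            this.backing = backing;
            this.mapper = mapper;
        }

        @Override
        protected NavigableSet<E> delegate() {
            return backing;
        }

        // The mapping is applied here...
        @Override
        public Iterator<E> iterator() {
            Iterator<E> inner = backing.iterator();
            return new Iterator<E>() {
                @Override
                public boolean hasNext() {
                    return inner.hasNext();
                }

                @Override
                public E next() {
                    return mapper.apply(inner.next());
                }
            };
        }

        // ...but tailSet(from, inclusive) is inherited unchanged: it returns
        // delegate().tailSet(from, inclusive), a plain view of the backing set
        // whose iterator never goes through the override above.
    }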

aasenov commented 6 years ago

I think the best way would be to add a list of functions to apply as a DirectoryStream field and apply them when requested; a simple change is attached. DirectoryStream.zip

aasenov commented 6 years ago

Yes, I was thinking of the same approach, but there is still a corner case where we can miss applying the function:

  1. If you call transform(), then tailSet(), and then tailSet() again, the last set won't have the function applied, because you can't override all nested instances of tailSet(). That's why I proposed keeping a list of functions to apply: it rules out such corner cases, and I think it's simpler (see the sketch after this list).
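
Roughly, the idea looks like this (an illustrative sketch, not the attached patch): the transformation chain is a field of the stream itself, every derived view carries it along, and it is applied only when entries are actually consumed:

    import java.util.Iterator;
    import java.util.NavigableSet;
    import java.util.function.Function;

    class LazyDirectoryStream<E> implements Iterable<E> {
        private final NavigableSet<E> entries;
        private final Function<E, E> chain;

        LazyDirectoryStream(NavigableSet<E> entries) {
            this(entries, Function.identity());
        }

        private LazyDirectoryStream(NavigableSet<E> entries, Function<E, E> chain) {
            this.entries = entries;
            this.chain = chain;
        }

        // transform() never touches the entries; it only extends the chain.
        LazyDirectoryStream<E> transform(Function<E, E> f) {
            return new LazyDirectoryStream<>(entries, chain.andThen(f));
        }

        // A tail of a tail still carries the full chain, so no view can lose it.
        LazyDirectoryStream<E> tail(E from, boolean inclusive) {
            return new LazyDirectoryStream<>(entries.tailSet(from, inclusive), chain);
        }

        @Override
        public Iterator<E> iterator() {
            Iterator<E> inner = entries.iterator();
            return new Iterator<E>() {
                @Override
                public boolean hasNext() {
                    return inner.hasNext();
                }

                @Override
                public E next() {
                    // The whole chain is applied lazily, per consumed entry.
                    return chain.apply(inner.next());
                }
            };
        }
    }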
kofemann commented 6 years ago

I would prefer to avoid an extra constructor.

aasenov commented 6 years ago

Without an extra constructor: the last commit works in my environment and seems to be OK for all cases (the tests verify that).

aasenov commented 6 years ago

I wanted to ask one more question about directory listing, because it behaves strangely for me. When you list large folders, the list() method is called roughly once for every 100 elements. I keep track of the elements already retrieved from the set and the ones left to retrieve. But every time a subsequent list() is called, the cookie is not the expected one. For example, on the first list() call 116 elements are retrieved from the NavigableSet, but the next list() call comes with cookie = 100 and I have to move my pointer backwards. Is that expected behaviour, and is there a way to make the listing sequential (or is that a client problem)?

kofemann commented 6 years ago

This sounds like a client issue. However, there may be something in the reply that triggers such behavior.

aasenov commented 6 years ago

I've tried playing with mount options, but without luck. I'm using a simple mount command that mounts the server with the following options:

    type nfs4 (rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=**,local_lock=none,addr=)

With org.dcache.nfs.v4.OperationREADDIR logging at debug level, I saw the following:

    Sending 115 entries (32556 bytes from 32680, dircount = 8170) cookie = 0 EOF=false
    Sending 115 entries (32552 bytes from 32768, dircount = 8192) cookie = 100 EOF=false
    Sending 115 entries (32540 bytes from 32768, dircount = 8192) cookie = 202 EOF=false
    Sending 115 entries (32554 bytes from 32768, dircount = 8192) cookie = 304 EOF=false


Every time, the cookie received is not equal to the initial cookie plus the number of entries sent. Could you try it at your end, to check whether it's something on my end or the current behaviour?

This is an issue on my end, as I use a database-like file system: when a directory contains thousands of entries, I use a lazy collection of entries, loading only 1k files in memory at a time. When they're exhausted, I drop them and load the next 1k. I may hit a case where I have moved to the next batch of 1k files, but the client wants to continue listing from an element that was in the previous batch. If you want, I can open a new bug, as this is a different issue.
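
Roughly, what a backend like mine would need looks like this (a hypothetical sketch, names made up): keep the current batch keyed by cookie, so a client that resumes from an earlier cookie can still be served from the cached window:

    import java.util.NavigableMap;
    import java.util.TreeMap;

    // Hypothetical helper: because the client may resume from a cookie earlier
    // than the last entry the server sent, a lazily-loaded listing has to keep
    // its current window keyed by cookie rather than discarding entries as
    // soon as they are handed out.
    class CookieWindow<E> {
        private final NavigableMap<Long, E> window = new TreeMap<>();

        void put(long cookie, E entry) {
            window.put(cookie, entry);
        }

        // Resume strictly after the given cookie; this works even if the
        // client "rewinds" to a cookie still inside the cached window.
        Iterable<E> resumeAfter(long cookie) {
            return window.tailMap(cookie, false).values();
        }
    }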

kofemann commented 6 years ago

I can observe the same behavior as you. I will investigate why this happens.

aasenov commented 6 years ago

Is there any progress with your investigation?

kofemann commented 6 years ago

Not really, I have to debug the kernel to understand the behavior.

kofemann commented 6 years ago

Please open a new issue.

aasenov commented 6 years ago

Ready: https://github.com/dCache/nfs4j/issues/62