Upplication / Amazon-S3-FileSystem-NIO2

An S3 File System Provider for Java 7
MIT License

Improved listing of buckets with large number of items #34

Closed pditommaso closed 9 years ago

pditommaso commented 9 years ago

The current implementation is very inefficient when listing the contents of a bucket that holds a large number of items organised into many subdirectories.

The proposed PR addresses this issue by using the ObjectListing.getCommonPrefixes method, which makes it possible to list and navigate big buckets.
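To illustrate why common prefixes help: when a listing request carries a delimiter (typically `/`), S3 rolls every key below a "subdirectory" up into a single common-prefix entry instead of returning each key. In the AWS SDK for Java those entries are read with `ObjectListing.getCommonPrefixes()`. The sketch below only simulates that grouping locally on an in-memory list of keys; it makes no AWS calls, and the class and method names (`CommonPrefixSketch`, `list`) are hypothetical and not taken from the PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

/** Local simulation of S3 delimiter-based listing; no AWS calls are made. */
final class CommonPrefixSketch {

    /** Result of one simulated listing: direct child keys plus rolled-up "subdirectories". */
    static final class Listing {
        final List<String> keys = new ArrayList<>();
        final SortedSet<String> commonPrefixes = new TreeSet<>();
    }

    /**
     * Groups keys the way a delimiter-based listing does: a key with no further
     * delimiter after the prefix is a direct child; any other key contributes
     * only its next path segment (including the delimiter) as a common prefix.
     */
    static Listing list(List<String> allKeys, String prefix, String delimiter) {
        Listing result = new Listing();
        for (String key : allKeys) {
            if (!key.startsWith(prefix)) {
                continue; // outside the requested "directory"
            }
            int cut = key.indexOf(delimiter, prefix.length());
            if (cut < 0) {
                result.keys.add(key); // direct child object
            } else {
                // everything below this point collapses into one entry
                result.commonPrefixes.add(key.substring(0, cut + delimiter.length()));
            }
        }
        return result;
    }
}
```

With this grouping, listing `a/` in a bucket where `a/b/` holds thousands of objects yields a single `a/b/` entry, so a directory stream can show one level without enumerating everything beneath it.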

Unfortunately, I was unable to fix the tests in the S3Iterator class. The Mockito library is not my cup of tea. I hope a core committer can help make the tests pass.

pditommaso commented 9 years ago

I've further optimized the remote AWS API invocations with this commit @c7abd6c. Mainly it does two things:

These changes greatly reduce the number of requests when traversing a big directory structure with many files.

As a side effect, this also solves the NPE issue #28: it now correctly returns the timestamp data for a bucket object.

As a side note, S3ObjectSummaryLookup now contains a single method. I would suggest moving it into another class and removing the S3ObjectSummaryLookup class.

pditommaso commented 9 years ago

I've realised that the above patch was returning an invalid ObjectSummary when the S3 path specifies the bucket root. The bucket itself does not have any lastAccess/lastModified timestamp information.
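One possible way to handle a path with no timestamp data is sketched below. It is not necessarily what the commit does. It assumes the bucket root shows up as an empty key (or `/`), and it falls back to the epoch, which is the default the `BasicFileAttributes` documentation describes for time stamps a file system cannot supply. The names `BucketRootTimesSketch`, `isBucketRoot` and `lastModifiedTime` are hypothetical.

```java
import java.nio.file.attribute.FileTime;
import java.util.Date;

/** Hypothetical helper: picks a last-modified time even when S3 has none to offer. */
final class BucketRootTimesSketch {

    /** Assumption: an empty key or "/" denotes the bucket itself, not an object. */
    static boolean isBucketRoot(String key) {
        return key == null || key.isEmpty() || key.equals("/");
    }

    /** Returns the object's timestamp, or the epoch when nothing is known. */
    static FileTime lastModifiedTime(String key, Date lastModifiedOrNull) {
        if (isBucketRoot(key) || lastModifiedOrNull == null) {
            return FileTime.fromMillis(0L); // epoch fallback instead of a null timestamp
        }
        return FileTime.fromMillis(lastModifiedOrNull.getTime());
    }
}
```

Returning a non-null `FileTime` in every case keeps callers that read the attributes of the bucket root from hitting a `NullPointerException`.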

I've pushed the commit @9440808 to handle it correctly.

jarnaiz commented 9 years ago

Thanks for everything. This PR is closed by #47.