scality / Arsenal

Common utilities for the open-source Scality S3 project components
Apache License 2.0

Listing Implementation Issue #147

Closed vrancurel closed 7 years ago

vrancurel commented 8 years ago

Versions of the product: all
Affects: bucketfile and bucketclient backends

We should be able to list with both a 'prefix' and a 'marker' while having a delimiter.

Right now, _getStartIndex() masks the prefix if you have a marker.

E.g.

?prefix=X11/&marker=X11%2FResConfigP.h

gives the following file names:

Mon Aug 29 2016 23:53:50 GMT+0200 (CEST)  3.6 kB         ResourceI.h
Mon Aug 29 2016 23:53:50 GMT+0200 (CEST)  2.9 kB         SM/SM.h
Mon Aug 29 2016 23:53:50 GMT+0200 (CEST)  11.0 kB        SM/SMlib.h
Mon Aug 29 2016 23:53:50 GMT+0200 (CEST)  4.7 kB         SM/SMproto.h
Mon Aug 29 2016 23:53:50 GMT+0200 (CEST)  5.1 kB         SelectionI.h
Mon Aug 29 2016 23:53:50 GMT+0200 (CEST)  17.0 kB        Shell.h
Mon Aug 29 2016 23:53:50 GMT+0200 (CEST)  0.2 kB         ShellI.h
Mon Aug 29 2016 23:53:50 GMT+0200 (CEST)  12.4 kB        ShellP.h
Mon Aug 29 2016 23:53:50 GMT+0200 (CEST)  29.7 kB        StringDefs.h
Mon Aug 29 2016 23:53:50 GMT+0200 (CEST)  3.9 kB         Sunkeysym.h
Mon Aug 29 2016 23:53:50 GMT+0200 (CEST)  4.2 kB         ThreadsI.h
Mon Aug 29 2016 23:53:50 GMT+0200 (CEST)  16.8 kB        TranslateI.h
Mon Aug 29 2016 23:53:51 GMT+0200 (CEST)  2.3 kB         VarargsI.h
Mon Aug 29 2016 23:53:51 GMT+0200 (CEST)  2.7 kB         Vendor.h
Mon Aug 29 2016 23:53:51 GMT+0200 (CEST)  3.5 kB         VendorP.h
Mon Aug 29 2016 23:53:51 GMT+0200 (CEST)  19.7 kB        X.h
Mon Aug 29 2016 23:53:51 GMT+0200 (CEST)  13.2 kB        XF86keysym.h
Mon Aug 29 2016 23:53:51 GMT+0200 (CEST)  30.3 kB        XKBlib.h
Mon Aug 29 2016 23:53:51 GMT+0200 (CEST)  3.9 kB         XWDFile.h
Mon Aug 29 2016 23:53:51 GMT+0200 (CEST)  4.5 kB         Xalloca.h
Mon Aug 29 2016 23:53:51 GMT+0200 (CEST)  2.9 kB         Xarch.h
Mon Aug 29 2016 23:53:51 GMT+0200 (CEST)  2.5 kB         Xatom.h
Mon Aug 29 2016 23:53:51 GMT+0200 (CEST)  3.7 kB         Xauth.h
Mon Aug 29 2016 23:53:51 GMT+0200 (CEST)  20.8 kB        Xcms.h
Mon Aug 29 2016 23:53:51 GMT+0200 (CEST)  2.3 kB         Xdefs.h

As we can see, some CommonPrefixes are listed as files.
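One way to keep the prefix from being masked is to start the scan at whichever of the two bounds sorts later, while still filtering on the prefix. The following is a minimal sketch under stated assumptions: keys are held in an array sorted in JavaScript string order, and `startKey`, `listAfter` and the sample keys are hypothetical names chosen for illustration. It is not the code of `_getStartIndex()`.

```javascript
// Hypothetical helpers for illustration; not the Arsenal implementation.
// Keys are assumed to be sorted in ascending JavaScript string order.
function startKey(prefix, marker) {
    const p = prefix || '';
    const m = marker || '';
    // The scan can begin at whichever bound sorts later.
    return m > p ? m : p;
}

function listAfter(sortedKeys, prefix, marker) {
    const p = prefix || '';
    const m = marker || '';
    const from = startKey(p, m);
    return sortedKeys.filter(key =>
        key >= from             // skip everything before the start key
        && key.startsWith(p)    // the prefix still applies when a marker is set
        && key > m);            // the marker itself is excluded
}

// Sample keys made up for this sketch.
const keys = ['Shell.h', 'X11/ResConfigP.h', 'X11/SM/SM.h', 'X11/Xatom.h', 'Xdefs.h'];
console.log(listAfter(keys, 'X11/', 'X11/ResConfigP.h'));
// → [ 'X11/SM/SM.h', 'X11/Xatom.h' ]
```

With both parameters set, `Xdefs.h` sorts after the marker but is still rejected because it does not start with the prefix.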

The bug is located here: https://github.com/scality/Arsenal/blob/master/lib/algos/list/delimiter.js

It seems there are other inconsistencies in the code.

Please provide consistent functional (for bucketfile) and end-to-end (for metadata) test scenarios.

ghost commented 8 years ago

As there is no "delimiter" parameter in the query string, I do not see what's wrong in the answer here. It seems expected that we'd list SM/* since no delimiter was specified. Did we misunderstand something?
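To make the point about the missing delimiter concrete, the query string from the report can be decoded with the standard `URLSearchParams` class. This is only a sketch of how the parameters decode; it says nothing about how the S3 server itself parses requests, and `listingParams` is a name chosen for illustration.

```javascript
// URLSearchParams is available as a global in Node.js and in browsers.
const params = new URLSearchParams('prefix=X11/&marker=X11%2FResConfigP.h');

const listingParams = {
    prefix: params.get('prefix'),        // 'X11/'
    marker: params.get('marker'),        // 'X11/ResConfigP.h' (%2F decodes to '/')
    delimiter: params.get('delimiter'),  // null: not present in the request
};
console.log(listingParams);
```

Since `delimiter` comes back as `null`, no grouping into CommonPrefixes would be requested for this particular query.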

Anyhow, Michael found a sample request in the Amazon documentation that does not seem to work with our implementation. He'll be fixing that first.

ghost commented 8 years ago

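Just to be sure, I'll take an example straight from the documentation:

For this example, we assume that we have the following keys in our bucket:

  • sample.jpg
  • photos/2006/January/sample.jpg
  • photos/2006/February/sample2.jpg
  • photos/2006/February/sample3.jpg
  • photos/2006/February/sample4.jpg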

The following GET request specifies the delimiter parameter with the value /, and the prefix parameter with the value photos/2006/.

GET /?prefix=photos/2006/&delimiter=/ HTTP/1.1
Host: example-bucket.s3.amazonaws.com
Date: Wed, 01 Mar  2006 12:00:00 GMT
Authorization: authorization string

In response, Amazon S3 returns only the keys that start with the specified prefix. Further, it uses the delimiter character to group keys that contain the same substring until the first occurrence of the delimiter character after the specified prefix. For each such key group Amazon S3 returns one <CommonPrefixes> element in the response. The keys grouped under this CommonPrefixes element are not returned elsewhere in the response. The value returned in the CommonPrefixes element is a substring from the beginning of the key to the first occurrence of the specified delimiter after the prefix.

This means we should get the following:

<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Name>example-bucket</Name>
  <Prefix>photos/2006/</Prefix>
  <Marker></Marker>
  <MaxKeys>1000</MaxKeys>
  <Delimiter>/</Delimiter>
  <IsTruncated>false</IsTruncated>

  <CommonPrefixes>
    <Prefix>photos/2006/February/</Prefix>
  </CommonPrefixes>
  <CommonPrefixes>
    <Prefix>photos/2006/January/</Prefix>
  </CommonPrefixes>
</ListBucketResult>

It is not the case at the moment (at least when using the memory backend).
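The grouping rule quoted from the documentation can be sketched in a few lines. This is a hypothetical illustration, not the code in delimiter.js: `listWithDelimiter` is a made-up name, the function ignores MaxKeys and truncation, and it treats the marker as a plain key (it does not skip a whole group when the marker is itself a common prefix).

```javascript
// Hypothetical sketch of the prefix/delimiter grouping rule; not Arsenal code.
function listWithDelimiter(keys, { prefix = '', marker = '', delimiter = '' } = {}) {
    const contents = [];
    const commonPrefixes = [];
    // Sort a copy so that keys sharing a group are adjacent.
    for (const key of [...keys].sort()) {
        if (!key.startsWith(prefix) || key <= marker) {
            continue; // outside the prefix, or not after the marker
        }
        // Look for the delimiter only after the prefix.
        const idx = delimiter ? key.indexOf(delimiter, prefix.length) : -1;
        if (idx === -1) {
            contents.push(key); // no delimiter after the prefix: a plain key
            continue;
        }
        // Group value: from the start of the key through the delimiter.
        const group = key.slice(0, idx + delimiter.length);
        if (commonPrefixes[commonPrefixes.length - 1] !== group) {
            commonPrefixes.push(group);
        }
    }
    return { contents, commonPrefixes };
}

// The five keys from the Amazon example.
const bucketKeys = [
    'sample.jpg',
    'photos/2006/January/sample.jpg',
    'photos/2006/February/sample2.jpg',
    'photos/2006/February/sample3.jpg',
    'photos/2006/February/sample4.jpg',
];
console.log(listWithDelimiter(bucketKeys, { prefix: 'photos/2006/', delimiter: '/' }));
// → { contents: [], commonPrefixes: [ 'photos/2006/February/', 'photos/2006/January/' ] }
```

Without a delimiter, the same call returns every key under the prefix in `contents` and no common prefixes.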

ghost commented 8 years ago

Note that we observed that the S3 memory backend does not seem to make use of the listing algorithms, meaning an increased risk of missing mistakes, and creating out-of-sync situations between S3 and Arsenal.
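One possible guard against that kind of drift is a differential check that feeds the same scenarios to two listing implementations and reports the ones where they disagree. The sketch below is hypothetical: `findMismatches`, `listA` and `listB` are stand-ins written for this example and are not the S3 memory backend or the Arsenal algorithms.

```javascript
// Hypothetical harness: listA and listB stand in for two listing
// implementations that are expected to agree. Neither is real S3 or Arsenal code.
function findMismatches(listA, listB, keys, scenarios) {
    return scenarios.filter(params =>
        JSON.stringify(listA(keys, params)) !== JSON.stringify(listB(keys, params)));
}

// Stand-in A applies both the prefix and the marker.
const listA = (keys, { prefix = '', marker = '' }) =>
    keys.filter(k => k.startsWith(prefix) && k > marker);
// Stand-in B deliberately forgets the prefix when a marker is given.
const listB = (keys, { prefix = '', marker = '' }) =>
    keys.filter(k => (marker ? k > marker : k.startsWith(prefix)));

const sampleKeys = ['a/1', 'a/2', 'b/1'];
const scenarios = [
    { prefix: 'a/' },
    { marker: 'a/1' },
    { prefix: 'a/', marker: 'a/1' },
];
console.log(findMismatches(listA, listB, sampleKeys, scenarios));
// → [ { prefix: 'a/', marker: 'a/1' } ]
```

Only the combined prefix and marker scenario is reported, which is the kind of case a shared scenario list would need to cover.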

On Tue, Aug 30, 2016 at 3:54 PM, Michael Zapata notifications@github.com wrote:

Just to be sure, I'll take an example straight from the documentation: For this example, we assume that we have the following keys in our bucket:

  • sample.jpg
  • photos/2006/January/sample.jpg
  • photos/2006/February/sample2.jpg
  • photos/2006/February/sample3.jpg
  • photos/2006/February/sample4.jpg

The following GET request specifies the delimiter parameter with the value /, and the prefix parameter with the value photos/2006/.

GET /?prefix=photos/2006/&delimiter=/ HTTP/1.1
Host: example-bucket.s3.amazonaws.com
Date: Wed, 01 Mar 2006 12:00:00 GMT
Authorization: authorization string

In response, Amazon S3 returns only the keys that start with the specified prefix. Further, it uses the delimiter character to group keys that contain the same substring until the first occurrence of the delimiter character after the specified prefix. For each such key group Amazon S3 returns one <CommonPrefixes> element in the response. The keys grouped under this CommonPrefixes element are not returned elsewhere in the response. The value returned in the CommonPrefixes element is a substring from the beginning of the key to the first occurrence of the specified delimiter after the prefix.

This means we should get the following:

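<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Name>example-bucket</Name>
  <Prefix>photos/2006/</Prefix>
  <Marker></Marker>
  <MaxKeys>1000</MaxKeys>
  <Delimiter>/</Delimiter>
  <IsTruncated>false</IsTruncated>

  <CommonPrefixes>
    <Prefix>photos/2006/February/</Prefix>
  </CommonPrefixes>
  <CommonPrefixes>
    <Prefix>photos/2006/January/</Prefix>
  </CommonPrefixes>
</ListBucketResult>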

It is not the case at the moment.


David Pineau Scality R&D Engineer

http://bit.ly/2aKbaTu

ghost commented 8 years ago

There is a preliminary PR taking care of one bug. If it does not address the issue at hand, could we have more context about the problem?