andriytk / ceph-old

s3cmd get is not working on versioned buckets #8

Closed andriytk closed 2 years ago

andriytk commented 2 years ago

Steps to reproduce:

  1. Create a bucket and enable versioning on it:
    $ s3cmd --no-ssl mb s3://testbucket8
    Bucket 's3://testbucket8/' created
    $ aws s3api put-bucket-versioning --endpoint-url http://localhost:8000 --bucket testbucket8 --versioning-configuration Status=Enabled
  2. Write an object to it and try to get it back:
    $ s3cmd --no-ssl put /tmp/file1 s3://testbucket8/file1
    upload: '/tmp/file1' -> 's3://testbucket8/file1'  [1 of 1]
    15 of 15   100% in    0s  1974.98 B/s  done
    $ s3cmd --no-ssl get s3://testbucket8/file1 /tmp/file1.check
    download: 's3://testbucket8/file1' -> '/tmp/file1.check'  [1 of 1]
    ERROR: Download of '/tmp/file1.check' failed (Reason: 404 (NoSuchKey))
    ERROR: S3 error: 404 (NoSuchKey)

huanghua78 commented 2 years ago

From the "versioning demo", I see this test has already passed. Is this already fixed?

andriytk commented 2 years ago

No. In the demo we fetched the object data with the aws s3api get-object command, not with s3cmd.

andriytk commented 2 years ago

Currently, if we list all the versioned objects with the aws s3api list-object-versions command, every version has its IsLatest attribute set to true, which is probably the root cause of this issue. Also, if we list the objects with s3cmd, all versions are listed instead of only the latest one (which is what the RGW+RADOS case returns).
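
To make the IsLatest symptom concrete, here is a minimal Python sketch that checks a list-object-versions style result for keys that do not have exactly one latest version. The Key, VersionId and IsLatest field names follow the S3 ListObjectVersions response; the helper name and the sample data are hypothetical and only illustrate the check.

```python
from collections import defaultdict


def keys_with_ambiguous_latest(versions):
    """Return the keys whose number of IsLatest=True entries is not exactly one."""
    latest_count = defaultdict(int)
    for v in versions:
        # Adding 0 still registers the key, so keys with no latest entry are caught too.
        latest_count[v["Key"]] += 1 if v["IsLatest"] else 0
    return sorted(k for k, n in latest_count.items() if n != 1)


# Hypothetical listing resembling the behaviour described above: every version is "latest".
buggy = [
    {"Key": "file1", "VersionId": "v1", "IsLatest": True},
    {"Key": "file1", "VersionId": "v2", "IsLatest": True},
]

# Hypothetical listing with exactly one latest version per key.
expected = [
    {"Key": "file1", "VersionId": "v1", "IsLatest": False},
    {"Key": "file1", "VersionId": "v2", "IsLatest": True},
]

print(keys_with_ambiguous_latest(buggy))
print(keys_with_ambiguous_latest(expected))
```

A client that resolves a plain GET through the "latest" entry has no single answer for any key that this check reports.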

To fix this, we should probably implement this: https://github.com/andriytk/ceph/blob/66e3defe3ddc67df330d02d6c58880b5fce9a5e4/src/rgw/rgw_sal_motr.cc#L1711-L1714

In other words, when we write a new version of an object, we should unset rgw_bucket_dir_entry::FLAG_CURRENT on the previous version.
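
The idea can be sketched as a toy model in Python (not the actual rgw_sal_motr.cc code). The DirEntry class, the put_version helper and the in-memory index are hypothetical, and the flag values are chosen for illustration only; the real definitions live in the RGW headers.

```python
# Illustrative flag values; consult rgw_bucket_dir_entry for the real ones.
FLAG_VER = 0x1
FLAG_CURRENT = 0x2


class DirEntry:
    """Hypothetical stand-in for a bucket index entry of one object version."""

    def __init__(self, key, version_id, flags):
        self.key = key
        self.version_id = version_id
        self.flags = flags

    def is_current(self):
        return bool(self.flags & FLAG_CURRENT)


def put_version(index, key, version_id):
    """Write a new version and demote any older version of the same key.

    index maps a key to its list of DirEntry objects, oldest first.
    """
    entries = index.setdefault(key, [])
    for entry in entries:
        entry.flags &= ~FLAG_CURRENT  # the older version is no longer current
    entries.append(DirEntry(key, version_id, FLAG_VER | FLAG_CURRENT))


index = {}
put_version(index, "file1", "v1")
put_version(index, "file1", "v2")
print([(e.version_id, e.is_current()) for e in index["file1"]])
```

With this invariant, at most one entry per key carries the current flag, so a listing can report IsLatest from it directly.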

siningwuseagate commented 2 years ago

Apart from the issue found above, there are some other issues:

  (1) When listing a versioning-enabled bucket without the list-versions option, only the latest version of each object should be displayed, even if the object has multiple versions (confirmed with the RADOS backend). Our implementation shows all versions.
  (2) According to the AWS specification, deleting an object without specifying a version creates a delete marker. Our implementation does not do this currently.

RGW has a data structure called the Object Logical Head (OLH) to track the current version of an object. We should check the code to see whether we need to use OLH to fix the issues above and, if so, how.
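
The expected behaviour for both points can be sketched with a small in-memory model in Python. This is only an illustration of the S3 semantics described above, not RGW or Motr code; the ToyVersionedBucket class, its methods and the generated version IDs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Version:
    key: str
    version_id: str
    is_current: bool = True
    is_delete_marker: bool = False


class ToyVersionedBucket:
    """Hypothetical in-memory model of a versioning-enabled bucket."""

    def __init__(self):
        self.versions = []
        self._counter = 0

    def _add(self, key, delete_marker):
        # Whatever was current for this key stops being current.
        for v in self.versions:
            if v.key == key:
                v.is_current = False
        self._counter += 1
        version = Version(key, f"v{self._counter}", True, delete_marker)
        self.versions.append(version)
        return version

    def put(self, key):
        return self._add(key, delete_marker=False)

    def delete(self, key):
        # Delete without a version ID: nothing is removed, a delete marker is added.
        return self._add(key, delete_marker=True)

    def list_objects(self):
        # Plain listing: only current versions that are not delete markers.
        return [v.key for v in self.versions
                if v.is_current and not v.is_delete_marker]

    def list_versions(self):
        return [(v.key, v.version_id, v.is_current, v.is_delete_marker)
                for v in self.versions]


bucket = ToyVersionedBucket()
bucket.put("file1")
bucket.put("file1")
print(bucket.list_objects())        # one entry for file1, not two
bucket.delete("file1")
print(bucket.list_objects())        # file1 is hidden behind the delete marker
print(len(bucket.list_versions()))  # both data versions plus the marker remain
```

In this model the single "current" pointer per key plays the role that OLH plays in RGW; whether the Motr backend needs the same structure is the open question above.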

andriytk commented 2 years ago

Fixed by #17.