noobaa / noobaa-core

High-performance S3 application gateway to any backend - file / s3-compatible / multi-clouds / caching / replication ...
https://www.noobaa.io
Apache License 2.0

NSFS | S3 | Versioning: Deleting object versions except the latest breaks ListObjectVersions #8314

Closed hseipp closed 4 days ago

hseipp commented 2 weeks ago

Environment info

Actual behavior

ListObjectVersions does not return the last remaining version (and therefore no VersionId) after all previous versions of a key are deleted. Instead, ListObjectVersions only returns:

{
    "RequestCharged": null
}

Expected behavior

ListObjectVersions should report correct versioning information for the object. Example:

{
    "Versions": [
        {
            "ETag": "\"mtime-d3qrmcqwi1og-ino-105s\"",
            "Size": 9,
            "StorageClass": "STANDARD",
            "Key": "testobj",
            "VersionId": "mtime-d3qrmcqwi1og-ino-105s",
            "IsLatest": true,
            "LastModified": "2024-08-27T14:31:45+00:00",
            "Owner": {
                "DisplayName": "NooBaa",
                "ID": "123"
            }
        }
    ],
    "RequestCharged": null
}
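The difference between the two responses can be checked mechanically. The following is a minimal sketch, not part of NooBaa or s3-tests: `missing_version_ids` is a hypothetical helper name, and the two dictionaries are the responses above trimmed to the fields the check needs.

```python
def missing_version_ids(response, key, expected_version_ids):
    """Return the expected version IDs of `key` that the listing does not report."""
    # The parsed response has no 'Versions' key at all when no versions are
    # returned (see the KeyError in this report), so default to an empty list.
    listed = {
        v["VersionId"]
        for v in response.get("Versions", [])
        if v["Key"] == key
    }
    return [vid for vid in expected_version_ids if vid not in listed]


# Responses from this report, trimmed to the relevant fields.
actual = {"RequestCharged": None}
expected = {
    "Versions": [
        {"Key": "testobj", "VersionId": "mtime-d3qrmcqwi1og-ino-105s", "IsLatest": True}
    ],
    "RequestCharged": None,
}

print(missing_version_ids(actual, "testobj", ["mtime-d3qrmcqwi1og-ino-105s"]))
# prints ['mtime-d3qrmcqwi1og-ino-105s']
print(missing_version_ids(expected, "testobj", ["mtime-d3qrmcqwi1og-ino-105s"]))
# prints []
```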

Steps to reproduce

Execute Ceph s3-tests, function test_versioning_obj_create_read_remove()

S3TEST_CONF=s3tests.conf tox -- s3tests_boto3/functional/test_s3.py::test_versioning_obj_create_read_remove

Alternatively, run the following Python script (adjust the profile and certificate to your needs):

#!/usr/bin/python
import boto3
import time
session = boto3.Session(profile_name='s3user1')
endpoint = 'https://localhost:6443'
cos = session.client('s3', endpoint_url=endpoint, verify='/home/vagrant/aws-cert/tls.crt')
bucket_name = "s3tests-gr7nhf7l22gctexhbn5kw-1"
response = cos.create_bucket(Bucket=bucket_name)
response = cos.put_bucket_versioning(Bucket=bucket_name, VersioningConfiguration={'Status': 'Enabled'})
read_status = None
expected_string = 'Enabled'
for i in range(5):  # wait until versioning is reported as enabled
    try:
        response = cos.get_bucket_versioning(Bucket=bucket_name)
        read_status = response['Status']
    except KeyError:
        read_status = None
    if expected_string == read_status:
        break
    time.sleep(1)
assert expected_string == read_status
version_ids = []
for i in range(5):  # upload five versions of the same key
    data = 'content-{i}'.format(i=i)
    response = cos.put_object(Bucket=bucket_name, Key='testobj', Body=data)
    version_id = response['VersionId']
    version_ids.append(version_id)

for i in range(5):  # delete the oldest version, then list what is left
    rm_version_id = version_ids.pop(0)
    cos.delete_object(Bucket=bucket_name, Key='testobj', VersionId=rm_version_id)
    response = cos.list_object_versions(Bucket=bucket_name)
    print("Versions: {}".format(response['Versions']))

More information - Screenshots / Logs / Other output

Output of the Python script with the error:

Versions: [{'ETag': '"mtime-d3rk3k4prklc-ino-nbe"', 'Size': 9, 'StorageClass': 'STANDARD', 'Key': 'testobj', 'VersionId': 'mtime-d3rk3k4prklc-ino-nbe', 'IsLatest': True, 'LastModified': datetime.datetime(2024, 8, 28, 12, 50, 45, tzinfo=tzlocal()), 'Owner': {'DisplayName': 'NooBaa', 'ID': '123'}}, {'ETag': '"mtime-d3rk3k4fk2yo-ino-nbd"', 'Size': 9, 'StorageClass': 'STANDARD', 'Key': 'testobj', 'VersionId': 'mtime-d3rk3k4fk2yo-ino-nbd', 'IsLatest': True, 'LastModified': datetime.datetime(2024, 8, 28, 12, 50, 45, tzinfo=tzlocal()), 'Owner': {'DisplayName': 'NooBaa', 'ID': '123'}}, {'ETag': '"mtime-d3rk3k3xc6ps-ino-nbc"', 'Size': 9, 'StorageClass': 'STANDARD', 'Key': 'testobj', 'VersionId': 'mtime-d3rk3k3xc6ps-ino-nbc', 'IsLatest': True, 'LastModified': datetime.datetime(2024, 8, 28, 12, 50, 45, tzinfo=tzlocal()), 'Owner': {'DisplayName': 'NooBaa', 'ID': '123'}}, {'ETag': '"mtime-d3rk3k37arcw-ino-nba"', 'Size': 9, 'StorageClass': 'STANDARD', 'Key': 'testobj', 'VersionId': 'mtime-d3rk3k37arcw-ino-nba', 'IsLatest': True, 'LastModified': datetime.datetime(2024, 8, 28, 12, 50, 45, tzinfo=tzlocal()), 'Owner': {'DisplayName': 'NooBaa', 'ID': '123'}}]
Versions: [{'ETag': '"mtime-d3rk3k4prklc-ino-nbe"', 'Size': 9, 'StorageClass': 'STANDARD', 'Key': 'testobj', 'VersionId': 'mtime-d3rk3k4prklc-ino-nbe', 'IsLatest': True, 'LastModified': datetime.datetime(2024, 8, 28, 12, 50, 45, tzinfo=tzlocal()), 'Owner': {'DisplayName': 'NooBaa', 'ID': '123'}}, {'ETag': '"mtime-d3rk3k4fk2yo-ino-nbd"', 'Size': 9, 'StorageClass': 'STANDARD', 'Key': 'testobj', 'VersionId': 'mtime-d3rk3k4fk2yo-ino-nbd', 'IsLatest': True, 'LastModified': datetime.datetime(2024, 8, 28, 12, 50, 45, tzinfo=tzlocal()), 'Owner': {'DisplayName': 'NooBaa', 'ID': '123'}}, {'ETag': '"mtime-d3rk3k3xc6ps-ino-nbc"', 'Size': 9, 'StorageClass': 'STANDARD', 'Key': 'testobj', 'VersionId': 'mtime-d3rk3k3xc6ps-ino-nbc', 'IsLatest': True, 'LastModified': datetime.datetime(2024, 8, 28, 12, 50, 45, tzinfo=tzlocal()), 'Owner': {'DisplayName': 'NooBaa', 'ID': '123'}}]
Versions: [{'ETag': '"mtime-d3rk3k4prklc-ino-nbe"', 'Size': 9, 'StorageClass': 'STANDARD', 'Key': 'testobj', 'VersionId': 'mtime-d3rk3k4prklc-ino-nbe', 'IsLatest': True, 'LastModified': datetime.datetime(2024, 8, 28, 12, 50, 45, tzinfo=tzlocal()), 'Owner': {'DisplayName': 'NooBaa', 'ID': '123'}}, {'ETag': '"mtime-d3rk3k4fk2yo-ino-nbd"', 'Size': 9, 'StorageClass': 'STANDARD', 'Key': 'testobj', 'VersionId': 'mtime-d3rk3k4fk2yo-ino-nbd', 'IsLatest': True, 'LastModified': datetime.datetime(2024, 8, 28, 12, 50, 45, tzinfo=tzlocal()), 'Owner': {'DisplayName': 'NooBaa', 'ID': '123'}}]
Traceback (most recent call last):
  File "create-versioned-bucket-upload-object_multiple.py", line 33, in <module>
    print("Versions: {}".format(response['Versions']))
KeyError: 'Versions'
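To see which step fails without the script aborting on the KeyError, the delete loop can record what the listing reports after each delete. This is only a sketch under assumptions: `delete_oldest_first` and its `keep` parameter are hypothetical names, and `client` is assumed to be a boto3 S3 client like `cos` in the script above.

```python
def delete_oldest_first(client, bucket, key, version_ids, keep=1):
    """Delete versions oldest-first until `keep` remain.

    Returns one (expected_ids, listed_ids) pair per delete, so a step where
    the listing disagrees with the versions that should remain is visible.
    """
    remaining = list(version_ids)
    steps = []
    while len(remaining) > keep:
        client.delete_object(Bucket=bucket, Key=key, VersionId=remaining.pop(0))
        response = client.list_object_versions(Bucket=bucket)
        # 'Versions' is absent from the response when nothing is listed.
        listed = [
            v["VersionId"]
            for v in response.get("Versions", [])
            if v["Key"] == key
        ]
        steps.append((sorted(remaining), sorted(listed)))
    return steps


# Usage with the client and IDs from the script above (not executed here):
# for expected_ids, listed_ids in delete_oldest_first(cos, bucket_name, 'testobj', version_ids):
#     print(expected_ids == listed_ids, expected_ids, listed_ids)
```

With the behavior described in this report, the last pair would show one expected version ID and an empty listing.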

NooBaa log with "all" mode enabled:

noobaa_20240828_1455.log.gz

romayalon commented 2 weeks ago

@nadavMiz can you please take a look?

hseipp commented 2 weeks ago

Please note that the same error occurs when using Ceph s3-tests test_versioning_obj_create_versions_remove_all():

S3TEST_CONF=s3tests.conf tox -- s3tests_boto3/functional/test_s3.py::test_versioning_obj_create_versions_remove_all
...
client = <botocore.client.S3 object at 0x7f29f51579d0>, bucket_name = 's3tests-l07lrjq90dn6imxsfi5dj-1', key = 'testobj', version_ids = ['mtime-d3ro2a9o7f28-ino-1es0']
contents = ['content-9']

    def check_obj_versions(client, bucket_name, key, version_ids, contents):
        # check to see if objects is pointing at correct version

        response = client.list_object_versions(Bucket=bucket_name)
        versions = []
>       versions = response['Versions']
E       KeyError: 'Versions'

hseipp commented 2 weeks ago

And another variant of the same issue - Ceph s3-tests test_versioning_obj_create_versions_remove_special_names():

contents = ['content-9']

    def check_obj_versions(client, bucket_name, key, version_ids, contents):
        # check to see if objects is pointing at correct version

        response = client.list_object_versions(Bucket=bucket_name)
        versions = []
>       versions = response['Versions']
E       KeyError: 'Versions'

Please note that because of the empty ListObjectVersions response, the teardown() of the s3-tests framework also fails: it deletes the objects in the bucket based on the ListObjectVersions output before trying to delete the bucket. In our case the bucket is not emptied, because the response contains nothing except the RequestCharged entry while one version of the object is still present. Thus we get a BucketNotEmpty exception:

s3tests_boto3/functional/__init__.py:303: in teardown
    nuke_prefixed_buckets(prefix=prefix)
s3tests_boto3/functional/__init__.py:159: in nuke_prefixed_buckets
    raise err
s3tests_boto3/functional/__init__.py:150: in nuke_prefixed_buckets
    nuke_bucket(client, bucket_name)
s3tests_boto3/functional/__init__.py:139: in nuke_bucket
    client.delete_bucket(Bucket=bucket)
.tox/py/lib/python3.8/site-packages/botocore/client.py:569: in _api_call
    return self._make_api_call(operation_name, kwargs)

....
        if http.status_code >= 300:
            error_info = parsed_response.get("Error", {})
            error_code = error_info.get("QueryErrorCode") or error_info.get(
                "Code"
            )
            error_class = self.exceptions.from_code(error_code)
>           raise error_class(parsed_response, operation_name)
E           botocore.exceptions.ClientError: An error occurred (BucketNotEmpty) when calling the DeleteBucket operation: The bucket you tried to delete is not empty. You must delete all versions in the bucket.
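The dependency of the cleanup on the listing can be illustrated with a minimal sketch. This is not the actual s3-tests nuke_bucket code; `empty_versioned_bucket` is a hypothetical name, and `client` is assumed to be a boto3 S3 client. It deletes only what ListObjectVersions reports, so a listing without a Versions entry leaves the remaining version in place and a following DeleteBucket fails.

```python
def empty_versioned_bucket(client, bucket):
    """Delete every version and delete marker the listing reports.

    Returns the number of entries deleted. If the listing omits a version
    that still exists, that version is never deleted.
    """
    deleted = 0
    kwargs = {"Bucket": bucket}
    while True:
        response = client.list_object_versions(**kwargs)
        # Both keys are absent from the response when there is nothing to report.
        entries = response.get("Versions", []) + response.get("DeleteMarkers", [])
        for entry in entries:
            client.delete_object(
                Bucket=bucket, Key=entry["Key"], VersionId=entry["VersionId"]
            )
            deleted += 1
        if not response.get("IsTruncated"):
            return deleted
        # Continue with the next page of the listing.
        kwargs["KeyMarker"] = response["NextKeyMarker"]
        kwargs["VersionIdMarker"] = response["NextVersionIdMarker"]


# Usage (not executed here; `cos` and `bucket_name` as in the script above):
# empty_versioned_bucket(cos, bucket_name)
# cos.delete_bucket(Bucket=bucket_name)
```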