apache / opendal

Apache OpenDAL: One Layer, All Storage.
https://opendal.apache.org
Apache License 2.0
3.45k stars 482 forks source link

feat(core): add version(bool) for List operation to include version d… #5106

Closed meteorgan closed 2 months ago

meteorgan commented 2 months ago

Which issue does this PR close?

part of https://github.com/apache/opendal/issues/2611.

Rationale for this change

What changes are included in this PR?

  1. add version(bool) for List operation
  2. support version for s3 List

Are there any user-facing changes?

meteorgan commented 2 months ago

There are a few issues to discuss:

  1. since we can retrieve the bucket versioning status using GetBucketVersioning API, should we use this API to set bucket_versioning_enabled, and avoid requiring users to pass this parameter manually ? The downside is that doing so would require call the API in Builder which might negatively affect users who don't need versioning. Additionally, the GetBucketVersioning API requires bucket owner permissions, while ListObjectVersions only requires READ access to the bucket.
meteorgan commented 2 months ago
  1. Because list and list with version use different APIs in S3, should we add new Github actions for S3, such as aws_s3_with_version ? if we only use a bucket with object versioning enabled to test all cases, we might unintentionally skip the tests for ListObjectsV2
meteorgan commented 2 months ago
  1. Since ListObjectVersions doesn't has start_after parameter, but instead uses key-marker and version-id-marker. should we set key-marker to start_after ? However, according to the document, key-marker: Specifies the key to start with when listing objects in a bucket, it seems to be inclusive. In contrast, the start-after parameter in ListObjectsV2 is exclusive.
meteorgan commented 2 months ago
  1. Since ListObjectVersions doesn't has start_after parameter, but instead uses key-marker and version-id-marker. should we set key-marker to start_after ? However, according to the document, key-marker: Specifies the key to start with when listing objects in a bucket, it seems to be inclusive. In contrast, the start-after parameter in ListObjectsV2 is exclusive.

But I've found some sources suggesting that key-marker is actually exclusive(for example: this discussion). I'll verify it later to confirm.

Xuanwo commented 2 months ago
  1. since we can retrieve the bucket versioning status using GetBucketVersioning API, should we use this API to set bucket_versioning_enabled, and avoid requiring users to pass this parameter manually ? The downside is that doing so would require call the API in Builder which might negatively affect users who don't need versioning. Additionally, the GetBucketVersioning API requires bucket owner permissions, while ListObjectVersions only requires READ access to the bucket.

We are not permitted to call the async API during the build stage, so we can't call the API there. I prefer to have an enable_version option for the user to set. Only after users set enable_version will we enable the related capability. This also addresses testing concerns: we can add a new test action with OPENDA_S3_ENABLE_VERSION.

3. Since ListObjectVersions doesn't has start_after parameter, but instead uses key-marker and version-id-marker. should we set key-marker to start_after ?

Yes, I believe we can use start_with as users key-marker.

meteorgan commented 2 months ago

Ceph differs from S3 in its handling of the key-marker parameter. in Ceph, it's inclusive, which is why this check is failing. How should we address this issue ? @Xuanwo

Xuanwo commented 2 months ago

Ceph differs from S3 in its handling of the key-marker parameter. in Ceph, it's inclusive, which is why this check is failing. How should we address this issue ? @Xuanwo

Would you like to submit an issue for rados? As for now, we can remove the versioning test for rados first.

meteorgan commented 2 months ago

Ceph differs from S3 in its handling of the key-marker parameter. in Ceph, it's inclusive, which is why this check is failing. How should we address this issue ? @Xuanwo

Would you like to submit an issue for rados? As for now, we can remove the versioning test for rados first.

https://tracker.ceph.com/issues/68055 (The Issues of Ceph repository is not enabled)

Xuanwo commented 2 months ago

tracker.ceph.com/issues/68055 (The Issues of Ceph repository is not enabled)

We can create an issue on the OpenDAL side to keep track of it.

meteorgan commented 2 months ago

tracker.ceph.com/issues/68055 (The Issues of Ceph repository is not enabled)

We can create an issue on the OpenDAL side to keep track of it.

Ok, Let me do it later

meteorgan commented 2 months ago

Thanks a lot for your work, @meteorgan!

There's one thing to note: I haven't added versioning behavior test for AWS. it seems like only you can do that ?

Xuanwo commented 2 months ago

There's one thing to note: I haven't added versioning behavior test for AWS. it seems like only you can do that ?

I can enable the version for bucket, would you like to add a new workflow for it?

meteorgan commented 2 months ago

There's one thing to note: I haven't added versioning behavior test for AWS. it seems like only you can do that ?

I can enable the version for bucket, would you like to add a new workflow for it?

Ok. I'll add it later