storj / roadmap

Storj Public Roadmap

S3 Object Lock: Compliance Mode #47

Open ferristocrat opened 1 year ago

ferristocrat commented 1 year ago

Description:

Implement S3-compatible object lock and retention features in Storj. This includes the ability to get and put object lock configurations, get and put object retention settings, and get and put object legal hold status.

What is the problem/pain point?

Currently, Storj does not support object lock and retention features that are compatible with S3. This means that users migrating from S3 to Storj may miss out on these features, which are important for data governance and compliance.

What is the impact?

Implementing these features will enhance Storj's compatibility with S3, making it easier for users to migrate from S3 to Storj. It will also improve Storj's data governance capabilities, as users will be able to apply retention policies and legal holds to their objects.

Why now?

As more and more organizations are looking for cost-effective and secure alternatives to S3, it's important for Storj to offer feature parity with S3. Implementing these features now will help Storj attract more users and meet their data governance needs.

Links:

Milestone: https://github.com/storj/edge/milestone/28

New S3 Actions Supported:

Action: GetObjectLockConfiguration
API Description: Gets the object lock configuration for a bucket.
Description of Change(s): Will return the ObjectLockConfiguration with ObjectLockEnabled either as Enabled or empty. Rule will not be included as a response element, as specifying a bucket-level Object Lock rule is initially out of scope.

Action: PutObjectRetention
API Description: Places an object retention configuration on an object.
Description of Change(s): The only value supported for Mode is COMPLIANCE, as Governance Mode is initially out of scope.

Action: GetObjectRetention
API Description: Retrieves an object's retention settings.
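
As a rough illustration of the GetObjectLockConfiguration behavior described above, the following Python sketch builds a response body whose ObjectLockEnabled element is either Enabled or empty and which never contains a Rule element. The helper name and its boolean argument are hypothetical, chosen only for this example; the namespace is the standard S3 XML namespace. The actual gateway implementation may differ.

```python
import xml.etree.ElementTree as ET

# Standard S3 XML namespace.
S3_NS = "http://s3.amazonaws.com/doc/2006-03-01/"


def object_lock_configuration_xml(lock_enabled: bool) -> str:
    """Build a GetObjectLockConfiguration response body (hypothetical helper).

    ObjectLockEnabled is "Enabled" or empty. No Rule element is emitted,
    because a bucket-level default retention rule is out of scope initially.
    """
    root = ET.Element("ObjectLockConfiguration", {"xmlns": S3_NS})
    enabled = ET.SubElement(root, "ObjectLockEnabled")
    enabled.text = "Enabled" if lock_enabled else ""
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(object_lock_configuration_xml(True))
    print(object_lock_configuration_xml(False))
```

A client that only checks whether ObjectLockEnabled equals Enabled would work against either form of the response.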

Existing S3 Actions Updated

Method: CreateBucket
API Description: Creates a new bucket.
Description of Change(s): CreateBucket will now accept the following request parameter:
- x-amz-bucket-object-lock-enabled

Method: HeadObject
API Description: Retrieves metadata from an object without returning the object itself.
Description of Change(s): HeadObject will now return:
- Mode (only Compliance is supported initially) that is currently in place for the requested object
- Date/time that the object's lock will expire

Method: GetObject
API Description: Retrieves an object from a bucket.
Description of Change(s): GetObject will now return:
- Mode (only Compliance is supported initially) that is currently in place for the requested object
- Date/time that the object's lock will expire

Method: PutObject
API Description: Adds an object to a bucket.
Description of Change(s): PutObject will now:
- Prevent locked object versions from being overwritten

PutObject will now accept the following request parameters:
- x-amz-object-lock-mode (only Compliance is supported initially)
- x-amz-object-lock-retain-until-date

Method: CopyObject
API Description: Creates a copy of an object that is already stored on Storj.
Description of Change(s): CopyObject will now accept the following request parameters:
- x-amz-object-lock-mode (only Compliance is supported initially)
- x-amz-object-lock-retain-until-date

Method: CreateMultipartUpload
API Description: This action initiates a multipart upload and returns an upload ID.
Description of Change(s): CreateMultipartUpload will now accept the following request parameters:
- x-amz-object-lock-mode (only Compliance is supported initially)
- x-amz-object-lock-retain-until-date

Storj has a unique object-level TTL. Any request that has both a TTL and a retention period will be rejected to prevent TTLs from conflicting with object lock retention periods.

Method: DeleteBucket
API Description: Deletes the specified bucket.
Description of Change(s): Forced deletion of a bucket with locked objects will be prevented.

Method: DeleteObject
API Description: Removes an object from a bucket.
Description of Change(s): Deletion of an object with a retention set will be prevented.
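
The upload and delete rules above could be checked along the lines of the following Python sketch. It is only an illustration under stated assumptions: the function names, the `ttl_expires_at` parameter, and the error type are hypothetical; header keys are assumed to be lowercased already; and the requirements that both lock headers arrive together and that the retain-until date lies in the future mirror S3 behavior rather than anything stated in this issue.

```python
from datetime import datetime, timezone
from typing import Mapping, Optional


class LockRequestError(ValueError):
    """Raised when an upload request carries invalid lock parameters."""


def parse_retention(
    headers: Mapping[str, str],
    ttl_expires_at: Optional[datetime] = None,  # hypothetical: object-level TTL, if any
    now: Optional[datetime] = None,
) -> Optional[datetime]:
    """Return the requested retain-until time, or None if no lock is requested."""
    mode = headers.get("x-amz-object-lock-mode")
    until = headers.get("x-amz-object-lock-retain-until-date")
    if mode is None and until is None:
        return None
    if mode is None or until is None:
        # Assumption: like S3, both headers must be supplied together.
        raise LockRequestError("mode and retain-until-date must be sent together")
    if mode != "COMPLIANCE":
        raise LockRequestError("only COMPLIANCE mode is supported initially")
    if ttl_expires_at is not None:
        # A TTL and a retention period on the same request are rejected.
        raise LockRequestError("object TTL cannot be combined with a retention period")
    retain_until = datetime.fromisoformat(until.replace("Z", "+00:00"))
    if retain_until.tzinfo is None:
        retain_until = retain_until.replace(tzinfo=timezone.utc)  # assume UTC
    now = now or datetime.now(timezone.utc)
    if retain_until <= now:
        raise LockRequestError("retain-until-date must be in the future")
    return retain_until


def can_delete(retain_until: Optional[datetime], now: Optional[datetime] = None) -> bool:
    """An object version may be deleted only once its retention has expired."""
    now = now or datetime.now(timezone.utc)
    return retain_until is None or retain_until <= now
```

The same `parse_retention` check would apply to PutObject, CopyObject, and CreateMultipartUpload, since all three accept the same two lock headers. The delete check is deliberately simplified and ignores versioning details such as delete markers.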

In addition to the new and updated actions above, we have a follow-on roadmap item to implement the remaining scope of S3-compatible Object Lock, mainly the addition of Governance Mode and Legal Hold. These additional actions are outlined in the roadmap item here: https://github.com/storj/roadmap/issues/98

kaloyan-raev commented 6 months ago

An excellent read to understand Object Lock is https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-lock.html

A key step in the design and implementation is figuring out how to map the S3 Object Lock permissions to the Storj permissions model:

kaloyan-raev commented 6 months ago

The Veeam documentation describes how they use Object Lock: https://helpcenter.veeam.com/docs/backup/vsphere/os_immutability_limitations.html?ver=120#amazon-s3-immutability-limitations-

Notable quotes:

After you have created an S3 bucket with Object Lock enabled, check that the default retention is disabled.

The default retention may result in an unpredictable system behavior and data loss. However, note that Veeam Backup & Replication will use Compliance object lock mode for each uploaded object.

Versioning and Object Lock must NOT be enabled or disabled on buckets that have been added to Veeam Backup & Replication as it may lead to unpredictable system behavior and data loss.

All this means that:

This significantly reduces the scope of work if we initially target only the Veeam use case.

We need to support these S3 methods:

We should also add a new ObjectLock permission in addition to the existing Read, Write, List, and Delete and allow the above methods only if such permission is granted in the S3 credentials.

ferristocrat commented 6 months ago

> We should also add a new ObjectLock permission in addition to the existing Read, Write, List, and Delete and allow the above methods only if such permission is granted in the S3 credentials.

I could imagine scenarios where it'd be useful to read the object retention or config without being able to change them... would we use a combination of say ObjectLock and Read/List/Write?

Also, don't we still need PutObjectLockConfiguration?

kaloyan-raev commented 6 months ago

> I could imagine scenarios where it'd be useful to read the object retention or config without being able to change them... would we use a combination of say ObjectLock and Read/List/Write?

It's worth discussing our approach to this. AWS S3 has a fine-grained permission model. They have >100 permissions, a separate permission for almost every S3 method.

Storj has a more coarse-grained permission model. We have a handful of permissions, each covering a group of actions. For example, the Write permission covers S3 methods like AbortMultipartUpload, CreateBucket, PutObject, etc., instead of having a separate permission for each of them. In the future, we could split the Storj permission groups (Read, Write, ...) into more fine-grained permissions to be closer to the S3 permission model.

Therefore, initially, adding a new ObjectLock permission that covers all related S3 methods would be enough. If we need more fine-grained control, we could split it into separate permissions for each S3 method. Combining ObjectLock with Read/List/Write may lead to backward compatibility issues when we decide to switch to such a fine-grained permission model.
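
To make the coarse-grained proposal concrete, here is a minimal Python sketch of a single ObjectLock permission gating the lock-related methods. The `Permission` enum, the `REQUIRED` table, and `is_allowed` are hypothetical names for illustration only. The Write entries come from the examples above; which methods fall under ObjectLock is an assumption based on the new actions listed in this issue.

```python
from enum import Flag, auto


class Permission(Flag):
    READ = auto()
    WRITE = auto()
    LIST = auto()
    DELETE = auto()
    OBJECT_LOCK = auto()  # the proposed new permission


# Hypothetical mapping from S3 method to the single permission it requires.
REQUIRED = {
    "AbortMultipartUpload": Permission.WRITE,
    "CreateBucket": Permission.WRITE,
    "PutObject": Permission.WRITE,
    "GetObjectLockConfiguration": Permission.OBJECT_LOCK,
    "PutObjectRetention": Permission.OBJECT_LOCK,
    "GetObjectRetention": Permission.OBJECT_LOCK,
}


def is_allowed(granted: Permission, method: str) -> bool:
    """Return True if the granted permissions cover the given S3 method."""
    required = REQUIRED.get(method)
    if required is None:
        return False  # methods not in the table are denied in this sketch
    return required in granted


if __name__ == "__main__":
    creds = Permission.READ | Permission.WRITE
    print(is_allowed(creds, "PutObject"))           # True
    print(is_allowed(creds, "PutObjectRetention"))  # False
```

Because each method maps to exactly one permission, splitting ObjectLock into per-method permissions later would only change the table, which is the backward-compatibility argument made above.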

> Also, don't we still need PutObjectLockConfiguration?

PutObjectLockConfiguration enables object lock on an existing bucket or changes the bucket's default retention. Veeam does not need either of these. Initially, we could support enabling object lock only during bucket creation, which the CreateBucket method covers.

ferristocrat commented 4 months ago

Sprint 35 - Working on finalizing technical design this sprint