linagora / james-project

Mirror of Apache James Project
Apache License 2.0

[EPIC] S3 multi-tenancy #5263

Open chibenwa opened 2 months ago

chibenwa commented 2 months ago

Why?

Multi-tenancy is currently hard-coded in the PG implementation as distinct buckets.

Two concerns here:

That way the blobStore could implement different isolation strategies for tenants (configurable):

Note that the AES SSE-C isolation strategy cannot be applied with deduplication, as several tenants might store the same blob and overwrite each other's keys.

How?

Refactor existing API

Refactor API of the blobstore:

Create a new POJO record Tenant(String name).

Create a new POJO record Bucket(BucketName name, Optional<Tenant> tenant).

Add methods to BlobStore and BlobStoreDAO taking a Bucket and a BlobId, and provide default methods for BucketName that supply a Bucket with no tenant.

Then each blobStore can implement the isolation it wishes - or not!
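A minimal sketch of what the refactored API could look like, assuming Java records. `Tenant` and `Bucket` come from the description above; the stand-in `BucketName` and `BlobId` records, the synchronous `byte[]` return type and the `withoutTenant` helper are simplifications made up for illustration, not the actual James signatures.

```java
import java.util.Optional;

// Hypothetical sketch of the proposed types. Only Tenant and Bucket are named
// in the issue; everything else is a simplified stand-in.
public class BucketApiSketch {
    public record BucketName(String value) {}

    public record BlobId(String value) {}

    public record Tenant(String name) {}

    public record Bucket(BucketName name, Optional<Tenant> tenant) {
        // Assumed helper: wraps a plain bucket name into a Bucket with no tenant.
        public static Bucket withoutTenant(BucketName name) {
            return new Bucket(name, Optional.empty());
        }
    }

    public interface BlobStoreDAO {
        // New tenant-aware method: each implementation decides how (or whether)
        // the tenant influences storage.
        byte[] readBytes(Bucket bucket, BlobId blobId);

        // Existing BucketName-based call keeps working by delegating to a
        // Bucket that carries no tenant.
        default byte[] readBytes(BucketName bucketName, BlobId blobId) {
            return readBytes(Bucket.withoutTenant(bucketName), blobId);
        }
    }
}
```

With this shape, existing callers keep compiling against the `BucketName` overloads, and only tenant-aware code paths need to build a `Bucket`.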

Memory blobStore DAO multitenancy

Derive a bucketname per tenant within internal storage.

S3

Configuration:

multi-tenancy.mode=none|bucket|ssec|prefix
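One way the `multi-tenancy.mode` value could be parsed, as a sketch only. The enum name, the `parse` method and the choice to reject unknown values rather than fall back to `none` are assumptions, not existing James code.

```java
import java.util.Locale;

// Hypothetical parser for the multi-tenancy.mode property.
public class MultiTenancyModeSketch {
    public enum Mode {
        NONE, BUCKET, SSEC, PREFIX;

        // Case-insensitive parsing; unknown values fail fast at startup
        // instead of silently disabling isolation (assumed policy).
        public static Mode parse(String raw) {
            try {
                return valueOf(raw.trim().toUpperCase(Locale.ROOT));
            } catch (IllegalArgumentException e) {
                throw new IllegalArgumentException(
                    "Unsupported multi-tenancy.mode '" + raw + "', expected none|bucket|ssec|prefix");
            }
        }
    }
}
```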

Definition of done:

bucket

Derive a bucket name per tenant within internal storage (i.e. what PG does, but done within S3BlobStoreDAO).

GC is likely broken and shall be tested with this mode...
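The bucket derivation for this mode could be sketched as below. The `-` separator and the `resolve` helper are hypothetical; the validation reflects the general S3 bucket naming rules (3 to 63 characters, lowercase letters, digits, dots and hyphens, starting and ending with a letter or digit), which a derived name must still satisfy.

```java
import java.util.Optional;

// Hypothetical derivation of one S3 bucket per tenant.
public class TenantBucketNameSketch {
    // Assumed separator between the logical bucket name and the tenant.
    private static final String SEPARATOR = "-";
    // General S3 bucket naming constraint, applied to the derived name.
    private static final String VALID_BUCKET = "[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]";

    public static String resolve(String bucketName, Optional<String> tenant) {
        String resolved = tenant
            .map(t -> bucketName + SEPARATOR + t)
            .orElse(bucketName);
        if (!resolved.matches(VALID_BUCKET)) {
            throw new IllegalArgumentException("Derived bucket name is not a valid S3 bucket name: " + resolved);
        }
        return resolved;
    }
}
```

Failing on invalid names, rather than normalizing the tenant, avoids two distinct tenants collapsing onto the same bucket.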

ssec

Feed the SSE-C salt with the tenant.

Should fail with deduplicating blobStore.
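A sketch of how the tenant could be fed into the salt, assuming the SSE-C key is derived from a master password with PBKDF2. The `baseSalt + ":" + tenant` layout, the iteration count and the method name are illustrative assumptions; only the 256-bit key size is dictated by SSE-C.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Optional;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Hypothetical per-tenant SSE-C key derivation.
public class TenantSsecKeySketch {
    private static final int ITERATIONS = 65536; // arbitrary value for the sketch
    private static final int KEY_BITS = 256;     // SSE-C expects a 256-bit AES key

    public static byte[] deriveKey(char[] masterPassword, String baseSalt, Optional<String> tenant) {
        // Mixing the tenant into the salt yields a distinct key per tenant
        // from the same master password.
        String salt = tenant.map(t -> baseSalt + ":" + t).orElse(baseSalt);
        PBEKeySpec spec = new PBEKeySpec(
            masterPassword, salt.getBytes(StandardCharsets.UTF_8), ITERATIONS, KEY_BITS);
        try {
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                .generateSecret(spec)
                .getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Could not derive SSE-C key", e);
        } finally {
            spec.clearPassword();
        }
    }
}
```

Since two tenants get different keys for the same blob content, an object shared through deduplication could only be readable with one of them, which is the conflict noted for this mode.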

prefix

Derive the object key within S3 adding the prefix as needed

This interacts with the GC! We shall make sure that the GC, when listing, only takes the last part of the S3 key, i.e. given prefix/ABC the GC only uses ABC as the blob ID.
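The key mapping in both directions could be sketched as follows. The helper names are hypothetical, and the sketch assumes that blob IDs themselves never contain `/`; otherwise taking the last segment would truncate them.

```java
import java.util.Optional;

// Hypothetical mapping between blob IDs and tenant-prefixed S3 object keys.
public class TenantPrefixSketch {
    // Write/read path: prepend the tenant as a prefix when one is present.
    public static String toS3Key(Optional<String> tenant, String blobId) {
        return tenant.map(t -> t + "/" + blobId).orElse(blobId);
    }

    // GC listing path: keep only the last segment of the key, so that
    // "prefix/ABC" and "ABC" both resolve to blob ID "ABC".
    public static String toBlobId(String s3Key) {
        int lastSlash = s3Key.lastIndexOf('/');
        return lastSlash < 0 ? s3Key : s3Key.substring(lastSlash + 1);
    }
}
```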

file

Derive a folder per tenant.

Test GC with this too.
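For the file mode, a folder per tenant could be resolved as in this sketch. The `root/tenant/bucket` layout and the method name are assumptions; the check that the tenant is a single path segment is added here so that a tenant such as `..` cannot escape the root directory.

```java
import java.nio.file.Path;
import java.util.Optional;

// Hypothetical resolution of a per-tenant folder for the file blob store.
public class TenantFolderSketch {
    public static Path bucketDirectory(Path root, Optional<String> tenant, String bucketName) {
        Path normalizedRoot = root.toAbsolutePath().normalize();
        Path base = tenant
            .map(t -> tenantDirectory(normalizedRoot, t))
            .orElse(normalizedRoot);
        return base.resolve(bucketName);
    }

    private static Path tenantDirectory(Path normalizedRoot, String tenant) {
        Path dir = normalizedRoot.resolve(tenant).normalize();
        // The tenant folder must be a direct child of the root: rejects "..",
        // ".", nested paths like "a/b" and absolute paths.
        if (!normalizedRoot.equals(dir.getParent())) {
            throw new IllegalArgumentException("Invalid tenant name for a folder: " + tenant);
        }
        return dir;
    }
}
```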

PGSQL

Derive a bucket name per tenant within internal storage (i.e. what PG does, but done within PostgresBlobStoreDAO).

Test GC with this too.

Cassandra

Tenant isolation strategies do not make sense here...

quantranhong1999 commented 2 months ago

Reminder: Quan will write the ticket for OpenSearch multi-tenancy. The idea: an optional configuration to inject the domain into documents, plus a filter injected on each search. Cf. https://github.com/linagora/james-project/issues/5263

chibenwa commented 2 months ago

After a discussion with Patrick,

ssec Should fail with deduplicating blobStore.

This is only true when deduplication is performed across tenants.

However once deduplication is limited in scope (to one tenant), enforcement of multitenancy isolation through the use of SSE-C can be achieved.

So multi-tenancy enforcement through the use of PREFIX and SSE-C makes sense.

This is also likely very desirable as encryption with tenant specific keys brings more trust.

Arsnael commented 2 months ago

Team: prefix and file are technically the same no?

prefix/abc technically prefix/ would be like a folder?

chibenwa commented 2 months ago

prefix/abc technically prefix/ would be like a folder?

That's what it looks like, but folders do not exist in S3: S3 only supports arbitrary prefixes. A prefix can be used to kind of emulate folders.

Arsnael commented 2 months ago

Task list (feel free to comment)