dapr / proposals

Proposals for new features in Dapr
Apache License 2.0

Proposal: object storage building block #18

Open ItalyPaleAle opened 1 year ago

ItalyPaleAle commented 1 year ago

Consolidating and updating the proposals for a new object storage building block: dapr/dapr#4804, dapr/dapr#4934

This proposes the creation of a new "object storage" building block that allows storing objects of unstructured data and arbitrary size, and that is optimized for reading and writing with streams.

yaron2 commented 10 months ago

@dapr/maintainers-dapr - review ping.

yaron2 commented 10 months ago

+1 binding

This is a good proposal and barring any new major issues I think it'd be good to move forward with this. Outstanding comments can be addressed as implementation details IMO.

daixiang0 commented 10 months ago

+1 binding

Object storage is widely used, and I really wish Dapr supported it. As @yaron2 said, outstanding comments can be addressed as implementation details IMO.

WhitWaldo commented 4 months ago

Rather than putting this under the kitchen sink of the existing state management building block, I propose that this instead should be part of a new and next-generation suite of purpose-built state management building blocks.

I propose that we leave the existing building block where it is, as it's a dependency for several other blocks, but split this out to be a completely separate thing conceptually, leaving the door open for a collection of other specialty state stores to be built alongside it later.

Why a clean break? The existing key/value store is confusing to new developers and quite arguably incomplete as a full-featured key/value store (e.g. no key query support). Discord is filled with people inquiring about the (discontinued, but still present and very limited) query API, and every now and then someone asks about their large files mysteriously not saving because they're exceeding the request limits. It's difficult to add any new functionality to today's state store because it already supports so many storage providers that are themselves so distinctly purpose-built. For developers that need something today, it's a fine option, but I propose that we draw a line here on the first-generation state store as a whole and instead adopt a more specific second-generation model going forward. Object stores don't need querying or transactions, and they'll never be a primary store for actors. They're distinctly different from what we've already got, so I think this is the opportune time to start fresh.

Finally, I propose that this start with the core streaming Get/Set and Delete functionality, with a set of optional interfaces added later for those providers that support them: search/filter functionality against object metadata, event notifications (potentially paired with pubsub), or lifecycle management. We can save a more blob-specific state storage (with its own purpose-built component providers like append-only blob storage or page blobs) for another purpose-built implementation with its own distinct requirements.
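
To make the "core plus optional interfaces" shape concrete, here is a minimal Go sketch. Everything in it is an assumption for illustration only: the names `ObjectStore`, `MetadataSearcher`, `memoryStore`, `roundTrip`, and `supportsSearch` are hypothetical and do not come from the proposal or from any existing Dapr API. It shows a small streaming Get/Set/Delete core, plus one optional capability that callers detect with a type assertion.

```go
package main

import (
	"bytes"
	"context"
	"errors"
	"fmt"
	"io"
	"strings"
	"sync"
)

// ErrNotFound is returned when a key has no stored object (hypothetical).
var ErrNotFound = errors.New("object not found")

// ObjectStore is a hypothetical core interface: streaming Get/Set plus Delete.
type ObjectStore interface {
	Get(ctx context.Context, key string) (io.ReadCloser, error)
	Set(ctx context.Context, key string, data io.Reader) error
	Delete(ctx context.Context, key string) error
}

// MetadataSearcher is a hypothetical optional interface that only some
// providers would implement.
type MetadataSearcher interface {
	Search(ctx context.Context, filter map[string]string) ([]string, error)
}

// memoryStore is a toy in-memory provider. It buffers whole objects, which a
// real provider would avoid; it exists only to exercise the interface.
type memoryStore struct {
	mu      sync.RWMutex
	objects map[string][]byte
}

func newMemoryStore() *memoryStore {
	return &memoryStore{objects: map[string][]byte{}}
}

func (m *memoryStore) Get(_ context.Context, key string) (io.ReadCloser, error) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	data, ok := m.objects[key]
	if !ok {
		return nil, ErrNotFound
	}
	return io.NopCloser(bytes.NewReader(data)), nil
}

func (m *memoryStore) Set(_ context.Context, key string, data io.Reader) error {
	buf, err := io.ReadAll(data)
	if err != nil {
		return err
	}
	m.mu.Lock()
	defer m.mu.Unlock()
	m.objects[key] = buf
	return nil
}

func (m *memoryStore) Delete(_ context.Context, key string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	delete(m.objects, key)
	return nil
}

// roundTrip writes payload under key as a stream and reads it back.
func roundTrip(store ObjectStore, key, payload string) (string, error) {
	ctx := context.Background()
	if err := store.Set(ctx, key, strings.NewReader(payload)); err != nil {
		return "", err
	}
	rc, err := store.Get(ctx, key)
	if err != nil {
		return "", err
	}
	defer rc.Close()
	out, err := io.ReadAll(rc)
	return string(out), err
}

// supportsSearch reports whether a provider opts into the optional interface.
func supportsSearch(store ObjectStore) bool {
	_, ok := store.(MetadataSearcher)
	return ok
}

func main() {
	store := newMemoryStore()
	got, err := roundTrip(store, "report.txt", "hello")
	fmt.Println(got, err, supportsSearch(store))
}
```

The point of the split is that a provider only has to satisfy the small core to be usable, and anything like search, notifications, or lifecycle management becomes an opt-in capability that callers check for instead of a method every component must stub out.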