coinbase / chainstorage

The File System For a Multi-Blockchain World
https://time-wilderness-a70.notion.site/ChainStorage-5c173d1cafd842ceb9e38c26bfbd6265?pvs=4
Apache License 2.0

Master Worklog: Provide storage abstraction of ChainStorage #44

Closed leozc closed 4 months ago

leozc commented 9 months ago

This is a large issue that may need to be broken down further:

Currently, ChainStorage is tightly coupled to AWS services:

  1. S3 as Blob storage
  2. DynamoDB as Key-Value storage
  3. SQS as dead letter queue

Why? To make ChainStorage more portable, we need to break these hard-wired dependencies. One possible solution is to define abstract interfaces for these storage services and implement an adaptor for each cloud provider.

We also explored a driver-level compatibility layer (e.g., ChainStorage continues to use the S3 library interfaces, and we adapt the S3 driver to each cloud provider's solution), but it is a no-go due to complexity.

Tickets so far:

PS: We should have a local implementation for testing and possibly local production use cases: files for blob storage, SQL (SQLite?) for KV storage, and a simple table for the DLQ.

bestmike007 commented 9 months ago

I created this draft PR: https://github.com/coinbase/chainstorage/pull/43

If it's heading in the right direction, I'll continue working on it.

jiezhang commented 9 months ago

For the local implementation, we are using LocalStack to emulate AWS.

leozc commented 8 months ago

@jiezhang, LocalStack works for now, but here are my concerns:

  1. LocalStack is an abstraction of the AWS cloud; a lower-level abstraction is usually better and simpler.
  2. There will be use cases for local storage in production (no cloud usage). Wouldn't a simpler abstraction be better?

jiezhang commented 8 months ago

@leozc For local runs with GCP storage backends, it looks like we may be able to use their emulators? https://cloud.google.com/sdk/gcloud/reference/emulators https://cloud.google.com/sdk/gcloud/reference/beta/emulators

bestmike007 commented 8 months ago

The emulators do not include Bigtable, unless we want to use Firestore.

And I think the local implementation @leozc mentioned was not only for local development, but also a solution for bare-metal deployments or private clouds without similar products.

jiezhang commented 8 months ago

It's available in the beta emulators: https://cloud.google.com/sdk/gcloud/reference/beta/emulators

jiezhang commented 8 months ago

Don't get me wrong. I'm NOT against this idea of building a storage abstraction. But we should do it step by step. Initially we should focus on building the GCP abstraction and leveraging its emulators for local runs and integration tests.

bestmike007 commented 8 months ago

> It's available in the beta emulators: https://cloud.google.com/sdk/gcloud/reference/beta/emulators

This is great. I was looking for an alternative to Bigtable for integration tests, and it turns out the emulators already support it.

leozc commented 4 months ago

@bestmike007 can we consider this done?

bestmike007 commented 4 months ago

Yes, I think so.