risingwavelabs / risingwave

SQL stream processing, analytics, and management. We decouple storage and compute to offer instant failover, dynamic scaling, speedy bootstrapping, and efficient joins.
https://www.risingwave.com/slack
Apache License 2.0

Tracking: Implementing S3 object store via OpenDAL #14321

Open wcy-fdu opened 6 months ago

wcy-fdu commented 6 months ago

For a long time, we have used the Rust AWS SDK for S3 to implement our object store. However, frequent breaking changes in the SDK have caused some stability problems. Meanwhile, for other object storage backends such as GCS and Azblob, we already use OpenDAL. In a recent PR, we introduced the S3 file source through OpenDAL's S3 service and found that its performance and stability are also good.

It is worth mentioning that we currently use madsim to wrap the AWS S3 SDK for deterministic testing. After switching to OpenDAL's S3 service, implementing madsim support for OpenDAL will not be easy, at least in the short term. Instead, we can mock the implementation of the object store trait rather than the AWS S3 client or the OpenDAL object, which will be much easier.
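To make the mocking idea concrete, here is a rough sketch of what mocking at the trait level could look like. The `ObjectStore` trait below is a simplified, synchronous stand-in invented for illustration, not the actual trait in the codebase, and `InMemObjectStore` and `copy_object` are hypothetical names as well.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical, simplified stand-in for an object store trait.
/// The real trait is async and has more methods; this is only a sketch.
pub trait ObjectStore {
    fn upload(&self, path: &str, data: Vec<u8>) -> Result<(), String>;
    fn read(&self, path: &str) -> Result<Vec<u8>, String>;
    fn delete(&self, path: &str) -> Result<(), String>;
}

/// In-memory mock that can replace an S3-backed store in deterministic tests.
/// It never touches the network, so no SDK or OpenDAL client has to be simulated.
#[derive(Default)]
pub struct InMemObjectStore {
    objects: Mutex<HashMap<String, Vec<u8>>>,
}

impl ObjectStore for InMemObjectStore {
    fn upload(&self, path: &str, data: Vec<u8>) -> Result<(), String> {
        self.objects.lock().unwrap().insert(path.to_string(), data);
        Ok(())
    }

    fn read(&self, path: &str) -> Result<Vec<u8>, String> {
        self.objects
            .lock()
            .unwrap()
            .get(path)
            .cloned()
            .ok_or_else(|| format!("object not found: {}", path))
    }

    fn delete(&self, path: &str) -> Result<(), String> {
        // Deleting a missing object is treated as a no-op in this sketch.
        self.objects.lock().unwrap().remove(path);
        Ok(())
    }
}

/// Example of code under test: it depends only on the trait,
/// so the backend (AWS SDK, OpenDAL, or the mock) is interchangeable.
pub fn copy_object(store: &dyn ObjectStore, from: &str, to: &str) -> Result<(), String> {
    let data = store.read(from)?;
    store.upload(to, data)
}
```

Because callers only see the trait, the deterministic test can hand them the in-memory mock while production code is wired to the OpenDAL-backed implementation.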

Based on the above, I think we can try to replace the existing S3 object store so that all object stores go through OpenDAL. Since S3 is our most commonly used object store and many customers rely on it, this switch must be done with great caution, so I have listed a rough roadmap:

wcy-fdu commented 6 months ago

If you have any reasons to retain the current S3 implementation, or any concerns about the switch, please discuss them here. cc @hzxa21 @fuyufjh

github-actions[bot] commented 3 weeks ago

This issue has been open for 60 days with no activity. Could you please update the status? Feel free to continue discussion or close as not planned.