quickwit-oss / quickwit

Cloud-native search engine for observability. An open-source alternative to Datadog, Elasticsearch, Loki, and Tempo.
https://quickwit.io

[quick-storage] Support more storage via opendal #2655

Open DCjanus opened 1 year ago

DCjanus commented 1 year ago

With the Storage trait, Quickwit already supports more than one storage backend, but each backend involves a lot of redundant implementation work. Perhaps we could use OpenDAL and, thanks to its community's efforts, support more storage backends in Quickwit for free.

OpenDAL is a Rust library that wraps different storage services behind a unified interface, and it is already used in projects such as sccache.

Of course, I don't think we should use the OpenDAL interface directly in Quickwit. We should keep the Storage trait as the abstraction, so that we can extend it more easily in the future.
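
To illustrate the idea of keeping the Storage trait and hiding a generic backend behind it, here is a minimal sketch. Everything in it is an assumption made for illustration: the `Storage` trait below is a simplified, synchronous stand-in (the real Quickwit trait is async and has more methods), `GenericBackend` is a hypothetical placeholder for something like an OpenDAL operator and does not reproduce OpenDAL's actual API, and `MemoryBackend` exists only so the example compiles and runs.

```rust
use std::collections::HashMap;
use std::io;
use std::ops::Range;

/// Simplified, synchronous stand-in for Quickwit's `Storage` trait.
/// Hypothetical shape: the real trait is async and has more methods.
pub trait Storage {
    fn put(&mut self, path: &str, payload: Vec<u8>) -> io::Result<()>;
    fn get_slice(&self, path: &str, range: Range<usize>) -> io::Result<Vec<u8>>;
    fn delete(&mut self, path: &str) -> io::Result<()>;
}

/// Hypothetical placeholder for a generic backend such as an OpenDAL operator.
/// These method names are illustrative, not OpenDAL's actual API.
pub trait GenericBackend {
    fn write(&mut self, key: &str, bytes: Vec<u8>) -> io::Result<()>;
    fn read(&self, key: &str) -> io::Result<Vec<u8>>;
    fn remove(&mut self, key: &str) -> io::Result<()>;
}

/// In-memory backend, present only to make the sketch runnable.
#[derive(Default)]
pub struct MemoryBackend {
    objects: HashMap<String, Vec<u8>>,
}

impl GenericBackend for MemoryBackend {
    fn write(&mut self, key: &str, bytes: Vec<u8>) -> io::Result<()> {
        self.objects.insert(key.to_string(), bytes);
        Ok(())
    }

    fn read(&self, key: &str) -> io::Result<Vec<u8>> {
        self.objects
            .get(key)
            .cloned()
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, key.to_string()))
    }

    fn remove(&mut self, key: &str) -> io::Result<()> {
        self.objects.remove(key);
        Ok(())
    }
}

/// Adapter: callers only ever see `Storage`, never the backend's own interface.
pub struct BackendStorage<B: GenericBackend> {
    backend: B,
    root: String,
}

impl<B: GenericBackend> BackendStorage<B> {
    pub fn new(backend: B, root: &str) -> Self {
        Self {
            backend,
            root: root.trim_end_matches('/').to_string(),
        }
    }

    /// Joins the storage root and a relative path into a backend key.
    fn key(&self, path: &str) -> String {
        format!("{}/{}", self.root, path.trim_start_matches('/'))
    }
}

impl<B: GenericBackend> Storage for BackendStorage<B> {
    fn put(&mut self, path: &str, payload: Vec<u8>) -> io::Result<()> {
        let key = self.key(path);
        self.backend.write(&key, payload)
    }

    fn get_slice(&self, path: &str, range: Range<usize>) -> io::Result<Vec<u8>> {
        // Reads the whole object and slices it in memory; a real backend
        // would more likely issue a ranged read instead.
        let bytes = self.backend.read(&self.key(path))?;
        bytes
            .get(range)
            .map(|slice| slice.to_vec())
            .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "range out of bounds"))
    }

    fn delete(&mut self, path: &str) -> io::Result<()> {
        let key = self.key(path);
        self.backend.remove(&key)
    }
}
```

With this shape, adding a new backend would only mean providing another `GenericBackend` implementation, while the code that depends on `Storage` stays unchanged.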

I am willing to provide a simple PR to explain this idea, provided that you are interested in it.

DCjanus commented 1 year ago

For example, in China, large and medium-sized companies tend to use their own storage facilities instead of cloud vendors' storage services. There are not many stable open source object storage implementations at present.

Companies without enough in-house engineering capacity tend to use HDFS to store data. Supporting HDFS in Rust is relatively difficult. Fortunately, OpenDAL already supports HDFS.

With OpenDAL, Quickwit could work with HDFS.

fmassot commented 1 year ago

@DCjanus I looked at OpenDAL, and it seems like a very good idea to use it to add new storage backends.

> I am willing to provide a simple PR to explain this idea, provided that you are interested in it.

Thanks! We will be happy to review it. I suggest starting by adding HDFS support to Quickwit. What do you think?

Xuanwo commented 1 year ago

Hello, I am the maintainer of OpenDAL, and I am very happy that Quickwit wants to try OpenDAL!

After a quick review of the Storage trait, I am sure that OpenDAL already supports all of its APIs.

I recommend starting with WebHDFS: it is much easier to integrate and test locally.