apache / opendal

Apache OpenDAL: access data freely.
https://opendal.apache.org
Apache License 2.0

idea: Introduce Codegen into OpenDAL #4279

Open Xuanwo opened 8 months ago

Xuanwo commented 8 months ago

The main obstacle preventing our users from trying out the bindings is the lack of documentation, including API docs, examples, and more.

However, OpenDAL now supports over 50 services, with more being added continuously. Our bindings have also grown, now totaling 14. It's neither possible nor wise to attempt to fill this gap manually. We have discussed this many times, for example in https://github.com/apache/opendal/issues/3537.

So I think it's time for us to introduce codegen to help us:

I have built a demo at https://github.com/apache/opendal/pull/4278.

While building the demo, I found it useful even for Rust alone:

So what do you think about this?

Xuanwo commented 8 months ago

cc binding owners:

Other bindings are not released yet, so let's keep the scope small for now.

tisonkun commented 8 months ago

I don't see an n * m complexity here, but each binding does need to wrap the APIs. Once the APIs are wrapped, all services can be used following the same configuration pattern.

Xuanwo commented 8 months ago

> Once APIs are wrapped, all services can be used following the same configuration pattern.

Are you talking about exposing S3Config?

tisonkun commented 8 months ago

If you mean a structural config type, I agree. Using IDL protobuf or avro may help?

tisonkun commented 8 months ago

> Are you talking about exposing S3Config?

Currently, we can configure a service with an unstructured map.
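
To make the contrast between an unstructured map and a structural config type concrete, here is a minimal sketch in plain Python. It does not use the OpenDAL API; the `S3Config` class, its fields, and the `from_map` helper are illustrative assumptions. The point is only that a misspelled key passes unnoticed in a map, while a structured type can reject it.

```python
from dataclasses import dataclass, fields
from typing import Dict, Optional


@dataclass
class S3Config:
    """Hypothetical structured config; field names are illustrative only."""

    bucket: str
    region: Optional[str] = None
    endpoint: Optional[str] = None

    @classmethod
    def from_map(cls, raw: Dict[str, str]) -> "S3Config":
        """Build a config from an unstructured map, rejecting unknown keys."""
        known = {f.name for f in fields(cls)}
        unknown = sorted(set(raw) - known)
        if unknown:
            raise ValueError(f"unknown config keys: {unknown}")
        return cls(**raw)


# Unstructured map: the misspelled key is silently carried along.
raw = {"bucket": "demo", "regoin": "us-east-1"}

# Structured config: the same typo is caught when the config is built.
try:
    S3Config.from_map(raw)
except ValueError as err:
    print(err)
```

A structured type like this is also a natural place to attach per-field documentation that an IDE can surface.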

Xuanwo commented 8 months ago

> Using IDL protobuf or avro may help? Or try use jsonnet.

How would we expose the API to users this way? Would we use the struct generated by protobuf directly?

tisonkun commented 8 months ago

Yes. I'd prefer to avoid defining a hand-rolled DSL, which is likely to be error-prone, and instead use a battle-tested schema solution like Avro: https://avro.apache.org/docs/1.11.1/idl-language.

Protobuf can be hard to use in the Java world if we don't need the serde functions.
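
As a rough illustration of the schema-first idea, the sketch below describes a config as an Avro record in Avro's JSON schema form (not the IDL syntax linked above) and reads it with Python's standard `json` module only. The record name, its fields, and the `describe_fields` helper are hypothetical and are not taken from OpenDAL.

```python
import json

# Avro record schema in its JSON form. The record and its fields are
# hypothetical; they are not taken from OpenDAL.
SCHEMA_JSON = """
{
  "type": "record",
  "name": "S3Config",
  "doc": "Configuration for a hypothetical S3 service.",
  "fields": [
    {"name": "bucket", "type": "string", "doc": "Bucket name."},
    {"name": "region", "type": ["null", "string"], "default": null,
     "doc": "Region of the bucket, if any."}
  ]
}
"""


def describe_fields(schema_json: str):
    """Return (name, required, doc) for each field of a record schema."""
    schema = json.loads(schema_json)
    result = []
    for field in schema["fields"]:
        # A field without a default must be supplied by the user.
        required = "default" not in field
        result.append((field["name"], required, field.get("doc", "")))
    return result


for name, required, doc in describe_fields(SCHEMA_JSON):
    print(f"{name}: required={required} - {doc}")
```

Because the schema is plain data, each binding could read the same file to learn field names, optionality, and documentation.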

Xuanwo commented 8 months ago

Our discussion seems to be going too far into the details of config structure generation. That's not the main issue I want to address. Let me explain my motivation and goal first.


OpenDAL's vision is to freely access data. Our Rust core enables this for the Rust language, and our language bindings extend this capability to users of many other languages.

Our language bindings are designed to mirror native implementations, allowing users to use them independently without consulting the Rust documentation or having prior knowledge of the Rust core.

However, this is currently impossible. Take the Java binding as an example.

OpenDAL Java features automatically generated JavaDoc available at javadoc.io, but the documentation lacks a lot of information I wish it included:

I hope users can install the opendal-java package and learn to use it in their IDE by using ctrl+click to access our API documentation, similar to what we do for the Rust core.

AJIOB commented 8 months ago

@Xuanwo, I think it would be more convenient to do the code generation via rustc itself: just use Rust macros, similar to Serialize from serde, for example. That would remove the unnecessary Python dependency from the build.

Or is that not possible in this case?

Xuanwo commented 2 months ago

> just use the rust macros

Hi, @AJIOB. How can we achieve the goal of making it available to both Python and Java? I don't think it's currently possible. I mean, it's not just about the code itself, but also the documentation and API comments.
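
As a rough illustration of that goal (one definition feeding both the code and the API comments of several bindings), here is a minimal sketch of a generator in plain Python. It is not the demo from the pull request; the `SCHEMA` layout, the field names, and the `render_python` and `render_java` helpers are assumptions made up for this example.

```python
from typing import Dict, List

# Single source of truth. The service and field names are hypothetical.
SCHEMA: Dict = {
    "name": "S3Config",
    "doc": "Configuration for a hypothetical S3 service.",
    "fields": [
        {"name": "bucket", "doc": "Bucket name."},
        {"name": "region", "doc": "Region of the bucket."},
    ],
}


def render_python(schema: Dict) -> str:
    """Render a Python dataclass whose docstring lists every field."""
    lines: List[str] = [
        "from dataclasses import dataclass",
        "",
        "",
        "@dataclass",
        f"class {schema['name']}:",
        f'    """{schema["doc"]}',
        "",
        "    Attributes:",
    ]
    for field in schema["fields"]:
        lines.append(f"        {field['name']}: {field['doc']}")
    lines.append('    """')
    for field in schema["fields"]:
        lines.append(f"    {field['name']}: str")
    return "\n".join(lines) + "\n"


def render_java(schema: Dict) -> str:
    """Render a Java class with a Javadoc comment on the class and each field."""
    lines = [f"/** {schema['doc']} */", f"public class {schema['name']} {{"]
    for field in schema["fields"]:
        lines.append(f"    /** {field['doc']} */")
        lines.append(f"    public String {field['name']};")
    lines.append("}")
    return "\n".join(lines) + "\n"


print(render_python(SCHEMA))
print(render_java(SCHEMA))
```

Both outputs carry the same field documentation, so a user of either binding would see it in their IDE without opening the Rust docs.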