opensearch-project / opensearch-api-specification

API specification for OpenSearch
Apache License 2.0

[PROPOSAL] Core <> Spec/Protobuf Automation #653

Open amberzsy opened 3 weeks ago

amberzsy commented 3 weeks ago

What/Why

What are you proposing?

There are several gaps blocking an end-to-end protobuf and API spec release:

  1. The OpenAPI spec is not in sync with the core API, and there is no automation or enforcement to keep it in sync.
  2. The API spec lacks validation.
  3. The source of truth is unclear.

Proposal

Source of truth

Development flow

  1. The protos will reside in the same repository as the current API specs: opensearch-api-specification.

  2. The script that converts specs to protobufs will also live in the opensearch-api-specification repository, but it will call into other repositories (for example, the fork of opensearch-project/openapi-generator).
  3. The opensearch-api-specification repository will be imported as a git submodule into the opensearch repo, where the protobufs will be used.
  4. When adding a new gRPC endpoint that is already implemented over HTTP:
     1. On a PR to the opensearch repo, the developer is required to provide the name of the branch that holds their spec changes (in the opensearch-api-specification repo).
     2. Then there would be new CI jobs running to:
     3. To merge the PR, instead of clicking merge on GitHub, a maintainer would comment the command “bot merge”.
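
The PR flow above could be wired into CI roughly as follows. This is only a sketch: the `spec-branch: <name>` line in the PR description, the regular expression, and both function names are hypothetical conventions invented for illustration, not part of the proposal.

```python
import re
from typing import Optional

# Hypothetical convention: the PR description carries a line such as
#   spec-branch: my-feature-branch
SPEC_BRANCH_RE = re.compile(
    r"^[ \t]*spec-branch:[ \t]*(\S+)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)

# The merge command named in the proposal.
MERGE_COMMAND = "bot merge"


def extract_spec_branch(pr_body: str) -> Optional[str]:
    """Return the spec branch named in the PR body, or None if it is absent."""
    match = SPEC_BRANCH_RE.search(pr_body)
    return match.group(1) if match else None


def is_merge_command(comment: str) -> bool:
    """True when a comment consists solely of the merge command."""
    return comment.strip().lower() == MERGE_COMMAND
```

A CI job could fail early when `extract_spec_branch` returns `None`, before attempting to check out the spec submodule at that branch.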

dblock commented 2 weeks ago

Thanks for opening this.

I think we should resolve what the source of truth for OpenSearch would be with this proposal. My understanding was (still is :)) that the spec in this repo is the OpenSearch API truth. While today it's reverse engineering the existing OpenSearch API because of the legacy, we see a future where the OpenSearch server-side API is generated from spec as well. If this is true, then protobufs are just another output from the source of truth, aka a build of API spec in this repo that outputs .proto files.

WDYT?

I opened https://github.com/opensearch-project/opensearch-api-specification/pull/655 to express this.

amberzsy commented 2 weeks ago

The OpenAPI spec is generally meant for REST/JSON, which is why I proposed a separate repo for the protos, but I don't have a strong opinion on this. I think we can start by keeping them in the same repo. We still need to close one gap, though: for protobuf, server-side development has to define the protobuf first, unlike JSON, where we can reverse engineer the spec. That's why I introduced the development flow above.

Regarding whether the spec is the source of truth: IMO, partially yes. For the API schema, most likely, but for RPC definitions etc. it would be different. Another example: a gRPC response is very different from an HTTP response, so most responses cannot be adopted or directly converted from the spec. Or ... we could make the spec support both protobuf and JSON, with certain limitations and restrictions.

Also, actually, let me break this into separate proposals:

  1. A proposal on how to keep core <> spec in sync (the approach, and a plan to fix the existing gaps and add a stopgap).
  2. A proposal on the protobuf conversion (including the tooling and the rules we build).
  3. A proposal on the protobuf release/development cycle (may be merged into 1).

dblock commented 2 weeks ago

I suppose I am still stuck in the idea that the spec is the only API, but it sounds like you want the Protobuf spec to be potentially very different? Let's think from a developer POV (related to 1), if I am adding a new API, say "_refresh", how do you see one do it in a world where it's both a REST API and a protobuf API?

guptashubham commented 1 week ago

I agree with DB that the spec is the only source of truth. However, the spec needs to evolve, as it will require versioning support (with the binary), among other things. I'd also expect the spec to be the source of truth for API documentation, which we can build over time.

amberzsy commented 1 week ago

updated the proposal.

> I suppose I am still stuck in the idea that the spec is the only API, but it sounds like you want the Protobuf spec to be potentially very different?

Per our offline discussion, the spec would be the SoT. All clients and protobufs would be generated from the spec. For gRPC responses, we would do a loose mapping based on https://grpc.io/docs/guides/status-codes/ (so far it looks doable).
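
To make the "loose mapping" idea concrete, here is a minimal sketch of translating HTTP status codes into gRPC status codes. The numeric values in `GrpcCode` are the canonical gRPC ones, but the particular HTTP-to-gRPC pairs and the `http_to_grpc` helper are assumptions chosen for illustration; the thread does not define the actual mapping.

```python
from enum import IntEnum


class GrpcCode(IntEnum):
    """Subset of the canonical gRPC status codes with their numeric values."""
    OK = 0
    UNKNOWN = 2
    INVALID_ARGUMENT = 3
    DEADLINE_EXCEEDED = 4
    NOT_FOUND = 5
    ALREADY_EXISTS = 6
    PERMISSION_DENIED = 7
    RESOURCE_EXHAUSTED = 8
    UNIMPLEMENTED = 12
    INTERNAL = 13
    UNAVAILABLE = 14
    UNAUTHENTICATED = 16


# Assumed mapping, for illustration only; not taken from the proposal.
HTTP_TO_GRPC = {
    400: GrpcCode.INVALID_ARGUMENT,
    401: GrpcCode.UNAUTHENTICATED,
    403: GrpcCode.PERMISSION_DENIED,
    404: GrpcCode.NOT_FOUND,
    409: GrpcCode.ALREADY_EXISTS,
    429: GrpcCode.RESOURCE_EXHAUSTED,
    500: GrpcCode.INTERNAL,
    501: GrpcCode.UNIMPLEMENTED,
    503: GrpcCode.UNAVAILABLE,
    504: GrpcCode.DEADLINE_EXCEEDED,
}


def http_to_grpc(status: int) -> GrpcCode:
    """Map an HTTP status to a gRPC code, falling back to UNKNOWN."""
    if 200 <= status < 300:
        return GrpcCode.OK
    return HTTP_TO_GRPC.get(status, GrpcCode.UNKNOWN)
```

The mapping is lossy by design: several HTTP statuses have no exact gRPC counterpart, which is why anything unlisted falls back to `UNKNOWN`.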

> Let's think from a developer POV (related to 1), if I am adding a new API, say "_refresh", how do you see one do it in a world where it's both a REST API and a protobuf API?

As mentioned above: basically, the developer would first update the spec with the new API definition, and a script would auto-generate the protobuf definition. In the PR process, if they prefer to have the HTTP server implementation first, the API can be configured as HTTP only, so the proto would still be generated but annotated with "grpc unsupported". The same process applies to a gRPC-only implementation.
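
The spec-first flow described here could look roughly like the sketch below, which renders a flat OpenAPI object schema as a proto3 message and adds a "grpc unsupported" comment when the API is configured as HTTP only. Everything in it is an assumption made for illustration: the type mapping, the function names, the comment-based annotation, and the example `RefreshRequest` schema. The real converter is expected to build on the openapi-generator fork mentioned in the proposal.

```python
# Assumed OpenAPI-to-proto scalar mapping; a real converter needs many more cases.
SCALAR_TYPES = {
    "string": "string",
    "integer": "int64",
    "number": "double",
    "boolean": "bool",
}


def proto_type(schema: dict) -> str:
    """Map a simplified OpenAPI schema to a proto field type (no nested arrays)."""
    if schema.get("type") == "array":
        return "repeated " + SCALAR_TYPES[schema["items"]["type"]]
    return SCALAR_TYPES[schema["type"]]


def to_proto_message(name: str, schema: dict, grpc_supported: bool = True) -> str:
    """Render a flat OpenAPI object schema as a proto3 message definition."""
    lines = []
    if not grpc_supported:
        # Hypothetical marker for APIs that only have an HTTP implementation so far.
        lines.append("// grpc unsupported")
    lines.append(f"message {name} {{")
    properties = schema.get("properties", {})
    # Field numbers follow declaration order here, which is fragile (see note below).
    for number, (field, field_schema) in enumerate(properties.items(), start=1):
        lines.append(f"  {proto_type(field_schema)} {field} = {number};")
    lines.append("}")
    return "\n".join(lines)


# Illustrative schema only; not copied from the actual spec.
REFRESH_REQUEST = {
    "type": "object",
    "properties": {
        "index": {"type": "array", "items": {"type": "string"}},
        "ignore_unavailable": {"type": "boolean"},
    },
}
```

Calling `to_proto_message("RefreshRequest", REFRESH_REQUEST, grpc_supported=False)` yields a message whose first line is the `// grpc unsupported` comment. Deriving field numbers from property order would break wire compatibility whenever the spec reorders properties, so a real tool would likely need stable numbers recorded somewhere.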