Xtansia commented 1 year ago

What/Why

What are you proposing?

Performing an experiment using the WIP Smithy API specification for OpenSearch to generate missing functionality in the client for a small API without re-creating the entire client.

What users have asked for this feature?

What problems are you trying to solve?

Reduce friction in adoption of new OpenSearch APIs & features in the language clients.
Reduce maintenance overhead

What is the developer experience going to be?

A script/tool that can be run to automatically fetch the API spec and generate the chosen code.

Are there any security considerations?

n/a

Are there any breaking changes to the API

This experiment is only successful if it does not break compatibility.

What is the user experience going to be?

Same as developer experience.

Are there breaking changes to the User Experience?

See above.

Why should it be built? Any reason not to?

The learnings from this experiment, if successful:

Will be broadly applicable to all language clients.
Will reduce workloads for maintainers.
Will speed up adoption of new APIs in clients.
Will fill in gaps in API coverage If the experiment is unsuccessful, it will not be merged, negating almost any potential downsides. It may not be feasible to integrate generated code from an off-the-shelf generator with the pre-existing client code.

What will it take to execute?

Will use the /_remotestore/_restore API as a low-risk low-touch use-case
Potentially configuring or customising an off-the-shelf generator, at worst hand-rolling a minimal generator for our use-case.

Xtansia commented 1 year ago

Options Explored

Pre-existing OpenAPI generator

A quick inspection of the existing Java generator for openapi-generator shows it is generally only workable when generating completely from scratch, as it expects to be generating a whole client, project files and all. There is some flexibility within the CLI tool to reduce surface area of what you desire to generate, however not to level necessary. All the generated code is also specific to whatever HTTP & JSON libraries have had templates written for.

Verdict: 👎 Any generated code would be completely independent of the existing client which is not desirable.

Custom OpenAPI templates/generator

The OpenAPI generator tool does allow for customising templates of an existing generator, or writing a whole new one. This would allow for being able to generate code that correctly integrates with the custom JSON and transport code used by the existing client. However an architectural limitation of the custom generators means it's only possible to generate files per service (ie. client class) or per model, not per API operation. Meaning it's unable to generate the "Request" classes like currently exist in the client that encapsulate body, query params & endpoint in one class. (This may be less of an issue in other languages where a file maps to a module containing multiple types rather than a single type/class)

Verdict: 👎 Generated code could be inter-operable with existing client, however would have an entirely different calling convention, making it unsuitable as we want to work towards regenerating the entire client with minimal breaking changes.

Custom Smithy Build plugin/generator

Smithy Build allows building plugins (written in Java/Kotlin) that are passed the Smithy model and allow processing it in whatever manner you desire, in a way that is orders of magnitude more flexible than the OpenAPI generator approach. Due to the ability to process the model from the start with custom code it is relatively easy to generate code in the same structure as the existing client. As we'd be dealing with the Smithy model directly it'd be possible to implement and handle custom Smithy traits if desired. Unfortunately the means by which to create these plugins is not documented, and so required studying the smithy-kotlin & smithy-rs plugins to understand how it works. The requirement of using Java/Kotlin will be fine here, but may be problematic for other language clients.

Verdict: 👍 (mostly) Will allow us to generate (near-)identical code to the rest of the client, without having to deal with parsing and manipulating the model from scratch. Comes at cost in onboarding new contributors due to lack of documentation.

Unsolved Issues

Linking the chain together of taking the Smithy spec from the external repo (at some ref/tag) running Smithy build with the custom plugin, then committing the generated code. Preferably as some form of CI that opens a PR with the updated generated code.
A clean manner to define the narrow slice we want to generate, until we come to the point of regenerating the whole client.
Variations of certain APIs that are treated as the same operation in current client (e.g /_search & /{index}/_search), are modelled as separate operations in Smithy. This would cause issues when wanting to replace these APIs with generated code. Might be able to work around with some transforms or a custom trait?

Xtansia commented 1 year ago

I created a draft PR with my current code for the Smithy build plugin generator #318

dblock commented 1 year ago

Good start!

Dumb question, does it have to be Kotlin whereas everything else is Java?
Is there really no "out-of-the-box-thirdparty-generator" that can get us far enough?
What does this code generate? without me running it :)
To answer (3) I think you want to hook up the generator to generate some code that can replace existing java client code (when we see a bunch of deleted .java files we can say that this much was replaced by the generator).

Xtansia commented 1 year ago

Good start!

Dumb question, does it have to be Kotlin whereas everything else is Java?

No it doesn't have to be Kotlin, it could be written in Java. Kotlin just has the advantage of being less verbose, and the other plugins I was researching off of were in Kotlin so made it easier.

Is there really no "out-of-the-box-thirdparty-generator" that can get us far enough?

None that I've found, Smithy has no Java generator, and all the OpenAPI generators expect to be generator from scratch and naturally don't know to create the specialised output desired for the client. We could get functional code, but it'd never be able to replace the current code.

What does this code generate? without me running it :)

To answer (3) I think you want to hook up the generator to generate some code that can replace existing java client code (when we see a bunch of deleted .java files we can say that this much was replaced by the generator).

Yeah the actual getting it to run inside this repo hasn't yet been done yet, as it's a bit tricky with the spec living in a different repo. And I've been experimenting on the remote store api which doesn't yet exist in the client, so not replacing anything existing. This is the spec & config I've been experimenting with so far https://github.com/opensearch-project/opensearch-api-specification/compare/main...Xtansia:opensearch-api-specification:experiment/codegen and this is the output for that config https://gist.github.com/Xtansia/f3eaaf8aeb1121299484e9e8a3513700. Once I get some proper semblance of it running contained within the repo, then I'll update the PR so that it includes the generated output.

dblock commented 1 year ago

Thanks @Xtansia! Looks like great progress. Add license headers to those generated files, and a header that says that the file was generated ;)

reta commented 1 year ago

Thanks @Xtansia, few clarifications from my side:

Is there really no "out-of-the-box-thirdparty-generator" that can get us far enough?

It seems like one of the advertised Smithy features is the ability to generate the OpenAPI spec which could be fed to openapi-generator to generate a client for a wide variety of the programming languages and frameworks

A quick inspection of the existing Java generator for openapi-generator shows it is generally only workable when generating completely from scratch, as it expects to be generating a whole client, project files and all.

This is not true: the generator is highly configurable in what and how should be generated. It could generate only models, models etc (used it in the past and 100% sure of that).

Smithy vs OpenAPI

@dblock @wbeckler This is personal perspective: there are many specs like Smithy, backed by different vendors. The OpenAPI is vendor neutral, widely supported / accepted / understood specification, which seems to be a reasonably good way to share contracts for web APIs. The OpenAPI v3.1 brought a full alignment with JSON Schema draft 2020-12, this is big deal in a way how large pieces (models) could be described and referred to. I think it would be very beneficial to use OpenAPI spec as a basis for client code generation, just my 5 cents

Xtansia commented 1 year ago

Thanks @Xtansia, few clarifications from my side:

Is there really no "out-of-the-box-thirdparty-generator" that can get us far enough?

It seems like one of the advertised Smithy features is the ability to generate the OpenAPI spec which could be fed to openapi-generator to generate a client for a wide variety of the programming languages and frameworks

You are correct and that's the foundation of how I tested with openapi-generator, all of these experimentations use the Smithy spec as its starting point.

A quick inspection of the existing Java generator for openapi-generator shows it is generally only workable when generating completely from scratch, as it expects to be generating a whole client, project files and all.

This is not true: the generator is highly configurable in what and how should be generated. It could generate only models, models etc (used it in the past and 100% sure of that).

That's correct, and is what I was hinting at by "There is some flexibility within the CLI tool to reduce surface area of what you desire to generate, however not to level necessary.", I was unable to find quite the level of filtering I desired, as I wanted more advanced filtering than "only models" but potentially I missed something and did it a disservice. However we then reach the point where regardless of the filtering, any code generated by the off-the shelf openapi-generator is never going to match up with the existing codebase of the client, and then if we do custom templates with openapi-generator the architectural limitations mean we'd be still be unable to match up with the existing client.

Smithy vs OpenAPI

@dblock @wbeckler This is personal perspective: there are many specs like Smithy, backed by different vendors. The OpenAPI is vendor neutral, widely supported / accepted / understood specification, which seems to be a reasonably good way to share contracts for web APIs. The OpenAPI v3.1 brought a full alignment with JSON Schema draft 2020-12, this is big deal in a way how large pieces (models) could be described and referred to. I think it would be very beneficial to use OpenAPI spec as a basis for client code generation, just my 5 cents

I believe there was an issue somewhere to discuss the benefits of each at some point, I'm unsure where though. It would be possible to generate from OpenAPI spec, but as far as I can tell to get code that matches the existing Java client would mean doing it from near-scratch.

reta commented 1 year ago

I think you mean this one, https://github.com/opensearch-project/opensearch-api-specification/issues/30 ?

Just recently we have had the discussion with @wbeckler regarding Smithy w/o OpenAPI, and one of the ideas we where thinking about is to have POCs to compare both approaches. I believe yours is the great one to showcase Smithy case.

Xtansia commented 1 year ago

For reference this is as far as I got with the custom openapi-generator templates/generator before hitting the fact I could not generate code in the same structure as the existing client: https://github.com/opensearch-project/opensearch-java/compare/main...Xtansia:opensearch-java:experiment/codegen-openapi

VachaShah commented 1 year ago

@Xtansia This is awesome, thank you! For @reta's question, is it also possible to post what gets generated (maybe a snippet of a class) from the custom Smithy build plugin vs the custom open-api generator? This would give more insights into what kind of client is generated for each option.

Xtansia commented 1 year ago

@VachaShah @reta Here's the example output for that openapi-generator, https://gist.github.com/Xtansia/b88ad73bde610a5e677b0b293c6155ea, while it's obviously not complete functional code, the main thing to notice is how the models only represent the body of the request/response, and the query params are shifted to the operation method and "flattened" (ie. if there's n query params then the method will have n+ params)

dblock commented 1 year ago

@VachaShah @reta Here's the example output for that openapi-generator, https://gist.github.com/Xtansia/b88ad73bde610a5e677b0b293c6155ea, while it's obviously not complete functional code, the main thing to notice is how the models only represent the body of the request/response, and the query params are shifted to the operation method and "flattened" (ie. if there's n query params then the method will have n+ params)

Mustache templates seems way easier than kotlin code, isn't it?

Xtansia commented 1 year ago

@VachaShah @reta Here's the example output for that openapi-generator, https://gist.github.com/Xtansia/b88ad73bde610a5e677b0b293c6155ea, while it's obviously not complete functional code, the main thing to notice is how the models only represent the body of the request/response, and the query params are shifted to the operation method and "flattened" (ie. if there's n query params then the method will have n+ params)

Mustache templates seems way easier than kotlin code, isn't it?

That's entirely subjective, and depends on the complexity of the template. I personally find code generally easier to reason about. But it's not really templates vs code, you could go to the effort of bringing mustache into the smithy plugin if you wanted to. The problem is entirely the generated code compatibility with the openapi-generator.

reta commented 1 year ago

For reference this is as far as I got with the custom openapi-generator templates/generator before hitting the fact I could not generate code in the same structure as the existing client: main...Xtansia:opensearch-java:experiment/codegen-openapi

@Xtansia thanks for sharing, I picked up your work here https://github.com/opensearch-project/opensearch-java/pull/336, there are no generator changes at the moment (no need, templates are just enough) but I hope to finalize the PoC by getting as close to exiting generated model as possible.

Xtansia commented 1 year ago

Thanks for putting together that more complete PoC for OpenAPI @reta! It has been really helpful.

I wanted to distil a summary of where we're currently at with regard to the pros & cons etc. I've tried to be even-handed, but it's possible I've missed some points, so do let me know if you've got anything to add or correct and I'll make sure to update it.

Explored Approaches / Proof of Concepts

Smithy Build Plugin

PoC PR: https://github.com/opensearch-project/opensearch-java/pull/318

Pros

Smithy model is slightly more friendly towards the operation-centric nature of the Java client.
More flexibility in manipulating model & output using code.
Ability to generate completely identical code to existing client.

Cons

Smithy is not widely adopted or popular.
Requires writing & navigating code to understand & modify output.
Only usable as long as specs are written in Smithy.

Off-the-shelf OpenAPI Generator With Customised Templates

PoC PR: https://github.com/opensearch-project/opensearch-java/pull/336

Pros

OpenAPI is the de facto standard for API schema definitions.
Clearer mapping from generator to output, as no code, just Mustache templates.
Usable regardless of if specs are written in Smithy or OpenAPI directly as Smithy is convertible to OpenAPI.

Cons

The model as exposed to the templates heavily separates the operation from the models. There's some ability to work around this in the OpenAPI spec and has been done for query parameters in the PoC, however it's not yet clear if it's possible to do this reconciling body schema and endpoint as well.
Mustache templates are (mostly) logic-less, reducing ability to manipulate model & output. Can be worked around if you extend/modify generator on code side.
Appears currently only possible to get close to existing client code but not identical, without modifying actual generator. This may be OK for new client code, but of concern for replacing existing code.

Other Potential Approaches

The following approaches may be worth exploring to bridge the gap between the two current PoCs:

Modify Smithy Build Plugin To Utilise Templates

Provides only slight differentiation from current Smithy Build Plugin approach, in that it would allow for making some of the code generation more human friendly, but at the cost of having to integrate templating and mapping a cleaner model to pass into Mustache.

From Scratch OpenAPI Generator

Create a minimal CLI tool that uses existing OpenAPI model libraries to load and parse the spec, manipulating it and then passing into Mustache templates. Gives benefits of operating off of OpenAPI without the architecture restrictions of the openapi-generator tool.

Desirable Attributes

There's likely some common desirable attributes we have for whatever the solution ends up being, and I think it'd be good to solidify those.

Is having completely identical (or at least compatible) code with existing a requirement? As at some point in future we intend to have the whole client generated. How concerned are we with breaking changes? Breaking changes would make a gradual approach infeasible as every updated API would be a major version bump.
Regardless of schema definition language, any hacks/workarounds for getting our desired generation output should live solely within the generator outside adding necessary non-functional metadata to the schema. i.e. the schema should always be a true and valid representation of the RESTful API, as we desire to generate clients in many languages, and ideally sections of the server itself. This also enables any users to pick up the specs and use an off-the-shelf generator generate their own working client if the desire without special knowledge of any workarounds we've performed.

reta commented 1 year ago

@Xtansia thanks a lot for putting everything together, I think the summary is quite fair. The only things I would suggest to add to OpenAPI pros: The OpenAPI spec could be used to generate large number of language/client combinations out fo the box, I am not sure how that would translate to Smithy. Thank you!

dblock commented 1 year ago

Is there some consensus on what to choose? I think the way we decide is that the person doing the work makes the call.

wbeckler commented 1 year ago

Not only is a diversity of approaches possible, it may be necessary, because smithy can't directly generate most languages. For example, there's a first attempt at Ruby based on OpenAPI: https://github.com/opensearch-project/opensearch-ruby/issues/139

The question here may be just whether the smithy + gradle plugin approach is the right approach for incremental java client API generation. As a byproduct of this approach, the clients for non-smithy languages would rely on an OpenAPI spec generated from the Smithy models.

reta commented 1 year ago

Is there some consensus on what to choose? I think the way we decide is that the person doing the work makes the call.

I don't think we have a clear winner :D (as in most cases in tech), one of the most convincing benefits (for me at least) of using OpenAPI - you just need to know one thing (which is an open specification), and this is it (whereas with Smithy, you have to know both).

The question here may be just whether the smithy + gradle plugin approach is the right approach for incremental java client API generation. As a byproduct of this approach, the clients for non-smithy languages would rely on an OpenAPI spec generated from the Smithy models.

That is very possible, the thing I am personally concerned is that we would definitely run into Smithy -> OpenAPI generation issues and essentially would have to deal with that.

Xtansia commented 1 year ago

Is there some consensus on what to choose? I think the way we decide is that the person doing the work makes the call.

I don't think we have a clear winner :D (as in most cases in tech), one of the most convincing benefits (for me at least) of using OpenAPI - you just need to know one thing (which is an open specification), and this is it (whereas with Smithy, you have to know both).

The question here may be just whether the smithy + gradle plugin approach is the right approach for incremental java client API generation. As a byproduct of this approach, the clients for non-smithy languages would rely on an OpenAPI spec generated from the Smithy models.

That is very possible, the thing I am personally concerned is that we would definitely run into Smithy -> OpenAPI generation issues and essentially would have to deal with that.

What main issues do you foresee with using Smithy as the definition language and then converting to OpenAPI, other than the adoption/learning hurdle for Smithy?

Do you know of any commonly used definition language for OpenAPI that don't involve writing thousands of lines of YAML/JSON directly? 😅

reta commented 1 year ago

Thanks @Xtansia

What main issues do you foresee with using Smithy as the definition language and then converting to OpenAPI, other than the adoption/learning hurdle for Smithy?

This is yet another generator (== code), right? And I don't know at least 2 things: a) how good it is b) does it support OpenApi 3.1 (this would be very handy for Json Schemas)

Do you know of any commonly used definition language for OpenAPI that don't involve writing thousands of lines of YAML/JSON directly? sweat_smile

:D the good part - it is rarely needed to modify anything at all (I do have real world examples of that from the past), but we want things "our way" (for a reason), so here we are :smile:

nhtruong commented 1 year ago

Here's my take on Smithy vs OpenAPI.

Smithy is superior in terms of features and maintainability. However, it has not been widely adopted like the well-established OpenAPI. It lacks parsers for many languages, and most contributors are not familiar with it.

I used to think that Smithy was a lot more readable (well it is, from the API Design standpoint). But weirdly enough, when writing the PoC generator for Ruby, I much preferred looking at the JSON object representing an endpoint in OpenAPI over its Smithy equivalence. I had the overview of the entire endpoint in one place in OpenAPI VS referencing several files in Smithy.

Where do we go from here?

1. Smithy all the way (for both spec and code generation)

This requires contributors to learn Smithy and its limited toolset to be able to contribute to the project. If I were a programmer looking to contribute to an Open Source project and learn something new along the way, I'd more likely pick an OpenAPI project over a Smithy one:

OpenAPI is a lot easier to learn than Smithy: good old Json/Yaml vs a new language.
OpenAPI is a more marketable skill: There are way more companies using Swagger OpenAPI than Smithy.

More importantly, we might risk alienating developers who have experience in maintaining OpenAPI specs, and in writing code generators for OpenAPI.

The biggest advantage of this approach is utilizing all of Smithy unique features without worrying about whether they are translatable to OpenAPI.

2. Only use Smithy as a tool to write OpenAPI Spec more efficiently

In other words, we keep maintaining the opensearch-api-specification repo as a Smithy project with Gradle plugin to generate OpenAPI Spec. We will avoid Smithy features that are not translatable to OpenAPI.

Maintainers for the code generators can just ignore Smithy and focus on the generated OpenAPI Spec.
Contributors who want add new features to OpenSearch along with new endpoints will still have to learn not only Smithy but also the "culture" of the Smithy spec repo (file structure, gradle, testing...).

As @Xtansia has pointed out, writing JSON and YAML for OpenAPI from scratch is a pain. Smithy's Mixin and Traits make it a lot easier and faster to maintain than editing OpenAPI Json/Yaml files. Though there are 3 counter points that weaken the advantages of this approach:

With how ubiquitous OpenAPI is, most IDEs, whether natively or via plug-ins, provide code-completion, navigators, validators, and such that make editing OpenAPI files less painful.
API Designs should be well thought-out from the get go. Updates to an existing API endpoint interface should be far and few in between to avoid breaking changes. So, we won't see a lot of changes to the repo once we have all existing endpoints defined in the spec.
According to the Developer Guide of the spec repo, when you add a new API to the Smithy spec, you will also have to generate an JSON representing the endpoint, and write a JavaScript test for that endpoint. That is, with the current setup, it's not as simple as just updating the Smithy spec.

3. OpenAPI all the way (scratch off what we have so far with Smithy and start over)

Even though we spent quite a bit of time setting up the Smithy spec repo (You can see a lot of care for this project in the Developer Guide <3), there are barely any endpoints in the spec, and most are incomplete. So, starting over with OpenAPI is not too wild of an idea. In fact, this might be the best approach, in my opinion:

It removes the "learn Smithy" obstacle for all contributors.
It reduces the complexity of the repo: A simpler file structure, no gradlew build, no Github action to translate Smithy to OpenAPI ...
Most importantly, we can leverage ElasticSearch's rest-api-spec and translate it to OpenAPI spec programmatically (It's a lot harder to write a program to translate that to Smithy spec). This will save us a lot of time back-filling existing endpoints, and cut down on human errors from writing the spec by hand. (Note that said rest-api-spec lacks info on Response and Request Body, as @Xtansia pointed out in our discussion. While many clients like Rust and Ruby do not need this info, strongly-typed language clients like Java and the future TypeScript client require it. We will still need to add this data manually afterward.)

reta commented 1 year ago

Thanks @nhtruong

Most importantly, we can leverage ElasticSearch's rest-api-spec and translate it to OpenAPI spec programmatically (It's a lot harder to write a program to translate that to Smithy spec). This will save us a lot of time back-filling existing endpoints, and cut down on human errors from writing the spec by hand. (Note that said rest-api-spec lacks info on Response and Request Body, as @Xtansia pointed out in our discussion. While many clients like Rust and Ruby do not need this info, strongly-typed language clients like Java and the future TypeScript client require it. We will still need to add this data manually afterward.)

It is funny you mentioned that, I just posted the same kind of question / comment here https://github.com/opensearch-project/opensearch-api-specification/issues/74#issuecomment-1424308290 this morning. Yes, @Xtansia is very right, the rest-api-spec has no bodies but here is what I came up with:

request / response payload has to be reverse engineered in any case
since rest-api-spec is based on JSON schema, we could complement it with, ... well ... with JSON schemas of request / response payloads (this is just a $ref)
OpenAPI v3.1 could use these JSON schemas as-is - no changes required

nhtruong commented 1 year ago

@reta it's quite a funny coincidence. I discovered that ES spec while studying this commit to review this PR. Told @wbeckler about it yesterday and he told me this morning that you just brought up the same thing 😆

Anyway, it's cool that OpenAPI 3.1 can do that. I'll have to look into it. But I have a hunch that we should still convert it to native OpenAPI spec to make it easier to maintain, more consistent with the spec for future endpoints, and more compatible with off-the-shell parsers. (Again just a hunch. Still need to do research on it)

Xtansia commented 1 year ago

Just wanted to share I'd started looking at writing a PoC "from-scratch" OpenAPI based generator, as I'd mentioned in a previous comment. It's still not quite at the point I want to create a draft PR, but if you want to have a look, it's pushed up here: https://github.com/Xtansia/opensearch-java/tree/experiment/codegen-openapi-scratch/java-codegen

nhtruong commented 1 year ago

@reta I'm working on converting the old OpenSchema API Spec to OpenAPI Spec. I misunderstood what you said by using the JSON schema. JSON schema is only meant for the request/response bodies, not the entire API spec right?

reta commented 1 year ago

thanks @nhtruong

JSON schema is only meant for the request/response bodies, not the entire API spec right?

Correct for request/response bodies, but the spec has own schema here [1]

[1] https://github.com/opensearch-project/OpenSearch/blob/main/rest-api-spec/src/main/resources/schema.json

Xtansia commented 1 year ago

I have raised a draft PR with my working PoC for the from scratch OpenAPI based generator: https://github.com/opensearch-project/opensearch-java/pull/366

sachetalva commented 1 year ago

Hi,

Out of the options listed in this response(https://github.com/opensearch-project/opensearch-java/issues/284#issuecomment-1424822276), I think the 2nd option might be a good way forward, i.e we can define the API models in Smithy and for use-cases which require OpenAPI we can consume the OpenAPI build output from Smithy.

We (myself and pgtgrly) just wanted to add some more context on why we had proposed to use Smithy instead of OpenAPI for the opensearch api specification repo (https://github.com/opensearch-project/OpenSearch/issues/1590).

Some of the key reasons were:

One of the major issues with OpenAPI schema is the readability and maintainability aspect of the model. With Smithy, it becomes very easy to read and create new API models.
Smithy provides integration with OpenAPI out of the box. All smithy models can be converted to OpenAPI as a build output (this is already configured in the OpenSearch api specification project). This provides us the best of both worlds, for use cases where OpenAPI is more suited we can use the generated OpenAPI model files. An example of this is the Continuous integration framework (https://github.com/opensearch-project/opensearch-api-specification/blob/main/DEVELOPER_GUIDE.md#defining-test-case-for-api-model) which runs in the opensearch api specification project and leverages Integration test framework that uses OpenAPI.
Smithy is extensible via “Traits” which we can apply in the model itself (ex: @enum, @length, @examples) and it provides a built-in validation system which ensures the model is as per the defined constraints.
Smithy is designed to support better code generation. Also, looking at the POC shared above (https://github.com/opensearch-project/opensearch-java/pull/318) for the Smithy based client generator even having a Smithy build plugin to get the Client generation for OpenSearch models seems to be achievable. An extract from the Smithy documentation:
- “Smithy is built for code generation and tools. Smithy models were designed for the purpose of generating code for multiple programming languages. For example, Smithy models are completely normalized, which gives all generated types an explicit name and makes models easy to traverse, validate, and diff.”
Smithy is under active development and is continuously adding features, the most recent one being “Mixins” which greatly simplifies using same structures across multiple models.

In order to speed up API model generation in Smithy, it should also be possible to programmatically generate the Smithy initial API model files using the OpenSearch rest-api-spec JSONs (https://github.com/opensearch-project/OpenSearch/tree/main/rest-api-spec/src/main/resources/rest-api-spec/api). Once the inital model files are created, these can be enhanced to include the request and response bodies.

We would also like to add the discussion this thread by Disney Streaming about the reason they went for Smithy over OpenAPI for it’s highly customizable codegen capabilities (https://www.reddit.com/r/scala/comments/zftfwf/comment/izemlqq/).

nhtruong commented 1 year ago

I've generated the OpenAPI Spec from the legacy spec (with come caveats). The specs and some of my findings when working on this can be found here.

@sachetalva Thanks for the input. Some of the points you mentioned above are not unique to Smithy:

OpenAPI Also supports data "traits" like enum, minLength... via Schema.
OpenAPI, as a standard, also supports validation
OpenAPI also supports normalization via named schemas and references.

Regarding "[Disney Streaming] went for Smithy over OpenAPI for it’s highly customizable codegen capabilities", I think you mislinked the thread? I don't see it in the thread you quoted. Regardless, we will have to write our own client generators for OpenSearch due to the level complexity of its API and the current clients:

OpenSearch endpoints are often combined into one action/method
Endpoints are divided into namespaces, instead a flat structure

Sure we can customize an existing codegen to handle those, in theory, but at that point, I think it's a lot simpler to write our own generator to retrofit new endpoints into the current clients' structures.

nhtruong commented 1 year ago

One thing I just discovered when translating the legacy spec to OpenAPI spec is that, Smithy has another advantage over OpenAPI: The ability to group related endpoints into one location (Similar to that of the legacy spec). This can't be done with OpenAPI because of the restrictions on Path and Operation references.

dblock commented 11 months ago

@Xtansia what do we want to do with this issue?

Xtansia commented 5 months ago

Closing this issue as the spec is now solely authored in OpenAPI and #366 has been merged

opensearch-project / opensearch-java

[PROPOSAL] Experiment with using Smithy API spec to generate missing APIs #284

What/Why

What are you proposing?

What users have asked for this feature?

What problems are you trying to solve?

What is the developer experience going to be?

Are there any security considerations?

Are there any breaking changes to the API

What is the user experience going to be?

Are there breaking changes to the User Experience?

Why should it be built? Any reason not to?

What will it take to execute?

Options Explored

Pre-existing OpenAPI generator

Custom OpenAPI templates/generator

Custom Smithy Build plugin/generator

Unsolved Issues

Explored Approaches / Proof of Concepts

Smithy Build Plugin

Pros

Cons

Off-the-shelf OpenAPI Generator With Customised Templates

Pros

Cons

Other Potential Approaches

Modify Smithy Build Plugin To Utilise Templates

From Scratch OpenAPI Generator

Desirable Attributes

Where do we go from here?

1. Smithy all the way (for both spec and code generation)

2. Only use Smithy as a tool to write OpenAPI Spec more efficiently

3. OpenAPI all the way (scratch off what we have so far with Smithy and start over)