Closed nhtruong closed 7 months ago
While the cat.health operation will be come OP_Cat_Health
I like the intent, but OP
feels strange. When are these names used, or are they purely internal? If they are internal, what's the benefit of adding OP
and not calling these the same as the REST path, including variable parts, e.g. Get_Cat_Health
and Post_Index_Search
(everything starts with a verb, everything matches REST path)?
We're simply putting all words used to describe component types in ALL CAPS.
CAPS are usually CONSTANTS in other languages. What do these things represent in OpenSearch code? They should match those names IMO.
Splitting structures.smithy into multiple files
Seems reasonable.
No shared default values
Makes sense. Again I would say these need to match what's in OpenSearch core more than have their own variation.
Organize shared data models by namespace/function rather than by types
Same as above. Anything shared has to also be shared in OpenSearch core, then it makes sense to share these models here.
This is no longer needed as we've migrated over to native OpenAPI
One of the biggest hurdles contributors have to deal with when contributing to this repo is its complex structure and rules. The ideas above aim to address some of such issues.
1. Naming of operations in the same group.
OpenSearch has many operations (i.e. url and http-verb parings) that serve identical purposes. They are grouped into operation groups. To distinguish one operation from another in the same group, we use their HTTP verb and differences in path params:
For example, the 4 operations of the
search
group are currently named:While the
cat.health
group, which only has 1 operation:But it can get very complicated with groups that have many operations or many path parameters.
Another downside of this is it's not clear if not clear what kind of component the model represents.
CatHealth
could be the name of theCatHealth
operation, or it could be the out of the operation.Proposed Naming Format:
OP_<namespace>_<group>_<optional_number>
The
search
operations would be comeWhile the
cat.health
operation will be comeOP_Cat_Health
This new naming format is a lot simpler and more clear on what type of structure is being defined. Note that the@http
trait right above each operation definition already states what http-verb and path params accompanying the operation.2. Naming of Operation Components
This is just a small change to make the namespace and operation name pop out more:
OP_Cat_Health_QueryParam
-->OP_Cat_Health_QUERYPARAM
OP_Search_1_Input
-->OP_Search_1_INPUT
We're simply putting all words used to describe component types in ALL CAPS.3. Splitting structures.smithy into multiple files
Right now, each operation group has a file called
structures.smithy
that holds all components of every operation in the group: shared query string params, qperation inputs and outputs, response bodies, and request bodies. Some of these files are quite large without the request and response bodies.Proposed files:
4. No shared default values
There are many path and query params that are repeated in different operations like index_name and node_id. While they have identical schemas, they often differ slightly in the descriptions and especially the default values. When defining the shared smithy models for these params, we should avoid adding the
@default
traits to avoid accidentally inheriting these values from the shared models.5. Organize shared data models by namespace/function rather than by types
Data models shared across repo are being placed in files like
While it works fine right now, once we add very complex data structures from response and request bodies, we will encounter reference hell where bit and pieces of a large data structures will appear in a dozen files.
Proposed solution
I don't have an exact proposal on how we organize these models right now, but the general ideal is putting related models as close together as possible as we divide them into namespace and functionality. It will become clearer when I backfill the repo with request and response bodies.
Note that all of these changes can be done programmatically while I backfill the missing request and response bodies.