thim81 / openapi-format

Format an OpenAPI document by ordering, formatting and filtering fields.
MIT License
77 stars 14 forks source link

Allow sorting schema properties by name #78

Open bodograumann opened 1 year ago

bodograumann commented 1 year ago

Currently it is not possible to sort properties, schemas or responses by name, because any order defined for them is applied to the nested objects instead. In order to distinguish between direct ordering of an object and nested ordering of its keys, I have introduced a new notation properties[*] here akin to JSONPath as used by Spectral. Ideally the sort order would use complete path expressions, but that will take much more effort.

I know this is a breaking change for anyone using a custom sort order, but I think it is worthwhile. The current implementation is limited and I found it difficult to understand how a custom sortFile even works. First I used trial-and-error to see what can and can't be sorted and then I looked at the source code. E.g. the default sort order defines content: [], which means sort anything under a content key alphabetically and it has properties: […], which means sort the fields of all objects nested inside the properties key.

ed-randall-blk commented 10 months ago

Please could you clarify, how do you configure it so that:

  1. The parameters on each endpont path are sorted by arg name
  2. For each parameter the property names are sorted as expected I thought I understood your patch and tried
    "parameters[*]": ["name", "description", "in", "type", "format", "required", "default", "maximum", "minimum", "schema"],
    "parameters": ["name"]

    and (2) is happening but (1) is not

thim81 commented 10 months ago

@bodograumann

Thanks for the PR, I still have to find time to review the code and the impact of the breaking changes. I understand the need to sort properties, schemas or responses by name, is the additional test https://github.com/thim81/openapi-format/pull/78/files#diff-2ed84511e150b344f99bcb395066fc8fd3b8a9e7580406e99a78ace4ee63fa85 a good example for me to fully understand the expected behaviour?

bodograumann commented 10 months ago

Good catch @ed-randall-blk. Turns out the current unit tests cannot actually test the order of keys in dictionaries, because the loadTest util parses the json/yaml file into an object before comparing actual vs expected. Let me try to fix this.

bodograumann commented 10 months ago

Ok, I added a sort implementation for arrays now, @ed-randall-blk (e.g. of parameters) Beforehand I only sorted objects by keys, e.g. properties. Giving the confusion this caused, I feel adding it is the right decision.

You are right, @thim81, that the new test should demonstrate the added/changed behaviour. I also added the limit parameter there now, so that sorting an arry of parameters is shown as well.

ed-randall-blk commented 10 months ago

This is really good. I think it's doing exactly what we need, your efforts are much appreciated, thankyou. To clarify our use-case: We have an openapi schema generated from python apispec; We deploy this config into Apigee via a PR process as a final confidence check as to what's changed (and to help with writing a release note). Unfortunately the order of everything in the .json has been indeterminate up to now, and the size has grown making the PR review almost impossible. Using this tool to massage the file into a determinate ordering will help no end.

ed-randall-blk commented 9 months ago

Two issues remain for our use-case:

  1. Is there a way to sort the "paths"? The paths aren't an array, and each name is unique...
  2. The "root" only sorts if there's an "openapi" key present; Some of our files are missing this but have "swagger": "2.0" instead.
bodograumann commented 9 months ago

You bring up some good ideas, @ed-randall-blk , but I think they would blow up the scope of this PR too much.

Sorting paths could work similar to the component order. There the definitions are ordered by the key. Unfortunately the paths are not contained within the components. So we need some new idea that won't make the configuration much more complex.

Not sure what is behind the second point, but I extpect it to be not too difficult to fix. Still better do it in a separate PR.

ed-randall-blk commented 9 months ago

Since writing that I have understood the behaviour of this better; 1) Can apparently get paths to sort by name simply by adding: "paths: []" I'd not have expected this from the documentation, I'd have expected the order to preserve, but I'm happy. 2) I first patched the code locally but later figured out how to make python apispec add an "openapi: 2.0" tag into the generated swagger.json doc to achieve this without any issues. We now have a working "normalization" config as:

{
  "root": ["openapi", "info", "servers", "basePath", "definitions", "parameters", tags", "paths"],
  "paths": [],
  "get": ["operationId", "description", "parameters", "requestBody", "responses", "summary", "tags"],
  "post": ["operationId", "description", "parameters", "requestBody", "responses", "summary", "tags"],
  "put": ["operationId", "description", "parameters", "requestBody", "responses", "summary", "tags"],
  "patch": ["operationId", "description", "parameters", "requestBody", "responses", "summary", "tags"],
  "delete": ["operationId","description", "parameters", "requestBody", "responses", "summary", "tags"],
  "parameters[*]": ["name", "description", "in", "type", "format", "required", "default", "maximum", "minimum", "schema"],
  "parameters": ["name"],
  "properties": []
}

we can run our generated files through this and review/compare changes consistently much more easily than before.

thim81 commented 9 months ago

@ed-randall-blk @bodograumann

Following up on the progress and the findings.

  1. Is the PR still needed or is using an [] the solution? If yes, I'll document this in the readme for future users. Since I'm cautious of introducing the new [*] syntax because it feels more complex to configure and would potentially break existing configurations.
  2. The "openapi" property is needed to detect the "ROOT" level to apply the sorting of the root elements, plus I assumed that the "openapi" would always be present in OpenAPI documents since openapi-format was supporting: OpenAPI 3.0 & 3.1. Is it compatible with "swagger": "2.0"?
bodograumann commented 9 months ago

Yes this PR is definitely still needed to be able to sort these things. E.g. properties is an array of objects. We want to sort this array in some way; according to a key in the objects. Then we also want to sort the keys in each property object in a fixed order. These two operations need separate configuration and without this PR it is not possible to sort the array at all.

I tried to keep the breakage to a minimum. Another approach would be to define a completely separate new sorting method, so users could opt-in to the new, more powerful approach, while existing use cases would not be affected. This would also allow us to do more drastic redesigns of the sorting configuration. That would require a bit more effort in terms of duplicating tests though.

What do you think, @thim81 ?

ed-randall-blk commented 9 months ago

The "swagger.json" (swagger 2.0) document that I'm working with seems fully compatible with the tool (for sorting purposes, at least) other than that it has "swagger: 2.0" instead of "openapi: 3.0" at the top; I tried to update the python to add both tags - but this fails the swagger validation :

- .openapi in body is a forbidden property

So it would be useful if openapi-format would recognise either openapi or swagger tag at the root.

On the sorting front, yes we've been using a fork with @bodograumann 's patch applied to achieve stable ordering / normalization of the file.

thim81 commented 6 months ago

hi @bodograumann & @ed-randall-blk

Re-opening the discussion. I had already foreseen a mechanism to sort by using the --sortComponentsFile option. By defining which elements that should be sorted, the CLI will sort the elements alphabetically.

Have a look at the tests:

Given that this option, will this solve your use-case?

bodograumann commented 6 months ago

I'll look into it, @thim81 , but it is really confusing when sortComponentsFile sorts thing that are not within the components key.

bodograumann commented 6 months ago

Let me be a bit more in-depth. Here are the things that should be sortable imho, together with some examples:

  1. Arrays of objects alphabetically by a value of a specific key in the children
    • parameters of a rest operation
    • top-level server and others
  2. Arrays of strings alphabetically by their values
    • enum options
    • required fields
  3. Objects with a fixed set of keys by listing the key order
    • top level info and others
    • any path (keys are the rest verbs like get: and parameters:)
    • any endpoint definition below the rest verbs
    • all kind of definitions that you would find under components, but also inline under the paths
  4. Maps with a variable key alphabetically by the key
    • properties of on object schema
    • responses with the response code as key
    • responseBodies with the content-type of key
    • all children of components

Going over the current openapi-format.js in main:

All in all I must say that this part of the code is quite convoluted and hard to work on. That is why I tried to clean it up a bit in this PR.

The main point of this PR was to solve the issue in the last two blocks. Any order defined for responses, schemas and properties is applied to the keys of their children. The parent objects themselves are never sorted alphabetically by their keys. That is why I introduced the [*] notation to make it clearer by which key (parent or self) we configure the order in (3). The issue persists on current main.

Later, by @ed-randall-blk's request, I also added support for (1), which might have increased the scope of this PR a bit too much, I must admit. My array sorting also allows sorting by multiple fields, which again might be overkill. (2) is not covered at all yet afaics.

What do yo think about introducing a completely new sorting configuration, besides the current one?

Sidenote: One other though was, that we might not have to be as generic in our sorting, but could only allow configuring orders for the exact types and attributes that occur in the openapi specification. This would also remove any redundancy e.g. between children of components.schemas and inline schema definitions, where we now have to define the key order two times. On the other hand this would bind us very tightly to the details of the openapi spec and any change there could have implications for us. The bigger concern would probably be assembling all the possible keys and types that could be configured instead of just going by arbitrary string keys. That is why I still prefer the current approach.