SchemaStore / schemastore

A collection of JSON schema files including full API
http://schemastore.org
Apache License 2.0
3.02k stars 1.65k forks source link

consistent JSON formatting of schemas? #4072

Open rogpeppe opened 1 week ago

rogpeppe commented 1 week ago

Description of the feature / enhancement.

Some of the issues (#3102 holding some examples) could potentially be addressed by an automated change that reads each schema, applies some fix, and writes it back out again. However, that kind of automated change is awkward to review if the round-trip results in a complete reformatting of the file, making the important part of the diff hard to see.

I may have missed something (I hope so) but ISTM that if all the JSON files were formatted in a consistent tooling-friendly way, then such changes would be straightforward.

As a straw man suggestion, how about running all the files through jq . ? Then a user can always do the same thing before committing.

If there is such a formatting standard in place, then perhaps it could be documented in https://github.com/SchemaStore/schemastore/blob/master/CONTRIBUTING.md.

Are you making a PR for this?

No, someone else must create the PR.

hyperupcall commented 5 days ago

Yes! We use prettier to format the schemas. You can run Prettier locally with npm run prettier and run fixes with npm run prettier:fix. We have a little bit of configuration to put things like id, $id, $ref, etc. near the top of object types.

However, we do have an issue where pre-commit behaves weirdly with files paths like src/json/prettier/.prettierrc.json, src/json/prettier/prettier.yaml - so don't commit those if those are formatted. Once I get around to replacing pre-commit.ci with GitHub CI, that should no longer be an issue.

rogpeppe commented 4 days ago

We have a little bit of configuration to put things like id, $id, $ref, etc. near the top of object types

Ah, that's super interesting, thanks! Out of interest, perhaps you could explain how it works? I'm looking at:

          $schema: null,
          $id: null,
          $comment: null,
          $ref: null,
          '/^\\$.*/': null,
          '/^[^\\d+]/': 'none',
          '/^\\d+/': 'none',
          if: null,
          then: null,
          else: null,
          '/^[^\\d+]/': 'none',
          '/^\\d+/': 'none',

and I'm wondering why some of the rules are duplicated, and why there are special rules for keys that start with a digit or a plus sign. FWIW '/^[^\\d+]/' rule looks like to me the + in it might be a copy/paste mistake from the '/^\\d+/' rule above it.

I did a quick grep and while there are object keys that start with a digit, they're often the start of a hex string, so having a special case for a leading digit might lead to odd ordering with respect to other similar hex keys that happen to start with a letter.

But I might well be misunderstanding the configuration based on a very quick scan of https://www.npmjs.com/package/prettier-plugin-sort-json (which also seems to contain the same /^[^\d+]/ idiom, so perhaps that's where it comes from, although I'm fairly sure it doesn't work the way it's intended there, as unless it's using a different kind of regex engine, a + inside a character class just means a literal + character).

hyperupcall commented 2 days ago

and I'm wondering why some of the rules are duplicated, and why there are special rules for keys that start with a digit or a plus sign. FWIW '/^[^\d+]/' rule looks like to me the + in it might be a copy/paste mistake from the '/^\d+/' rule above it.

Yeah looking back at the configuration, I'm noticing some of the same issues as well. The duplicated rules before and after the { "if": null, "then": null, "else": null } are supposed to explicitly mean that there is no sorting before and after the properties with if, then, and else. But I'm not sure that properly translates since objects in JavaScript have unique keys so only the "bottom properties" probably show up in the object.

I believe there are two rules that start with digit and plus because they are both mean to disable the default formatting of numbers and "words". Numbers default to sorting by order and words default to sorting by lexicographic, so I think they each have to be reset individually.

I did a quick grep and while there are object keys that start with a digit, they're often the start of a hex string, so having a special case for a leading digit might lead to odd ordering with respect to other similar hex keys that happen to start with a letter.

That makes sense, I didn't think of that case when making the rules.

But I might well be misunderstanding the configuration based on a very quick scan of npmjs.com/package/prettier-plugin-sort-json (which also seems to contain the same /^[^\d+]/ idiom, so perhaps that's where it comes from, although I'm fairly sure it doesn't work the way it's intended there, as unless it's using a different kind of regex engine, a + inside a character class just means a literal + character).

Ohh you're right, the + is misplaced. It should definitely be a bug, I just copied it from prettier-plugin-sort-json.

Thanks for your feedback! The configuration can definitely use some improvement by working against these additional cases