python-jsonschema / check-jsonschema

A CLI and set of pre-commit hooks for jsonschema validation with built-in support for GitHub Workflows, Renovate, Azure Pipelines, and more!
https://check-jsonschema.readthedocs.io/en/stable
Other
217 stars 40 forks source link

Validate usage of double quotes in expressions #151

Open awaelchli opened 2 years ago

awaelchli commented 2 years ago

Motivation

We use check-jsonschema in https://github.com/Lightning-AI/lightning/ to validate all our workflow files and it works very well. We have a GitHub action for this check here.

Recently we debugged an issue in our configuration that wasn't caught by check-jsonschema. The following yaml had double quotes in the expression:

    - name: Testing Warnings
      # the stacklevel can only be set on >=3.7
      if: ${{ (steps.skip.outputs.continue == '1') && ( matrix.python-version != "3.7" ) }}
      working-directory: tests/tests_pytorch
      # needs to run outside of `pytest`
      run: python utilities/test_warnings.py

which caused an error in the workflow:


The workflow is not valid. .github/workflows/ci-pytorch-test-full.yml (Line: 167, Col: 11): Unexpected symbol: '"3'. Located at position 68 within expression: (steps.skip.outputs.continue == '1') && ( matrix.python-version != "3.7" )

And it should have used single quotes like this:

matrix.python-version != '3.7'

check-jsonschema is useful for us because we can raise these warnings directly to the author in the PR.

Pitch

Would it be possible to have check-jsonschema validate against correct usage of single quotes vs. double quotes? I am not familiar with how check-jsonschema works, whether this is technically possible, or whether this is in the scope of this tool at all.

Borda commented 2 years ago

Hello, for completeness, we run the check with the build-in config:

check-jsonschema .github/workflows/*.yml --builtin-schema "github-workflows"

see: https://github.com/Lightning-AI/utilities/blob/c27592abab2b24578753794c9bf005bc7f627595/.github/workflows/check-schema.yml#L26

sirosen commented 2 years ago

Hi, thanks for the issue report!

The built-in schema is just a vendored copy of schemastore's github-workflows schema. So I think one appropriate fix here is to update that schema to forbid the use of bare double-quotes inside of expressions. check-jsonschema can then just update to the latest.

I haven't looked yet, but I would be a little surprised if this is an easy change in that schema. Is " never allowed inside of expressions, or is it possible that it can be valid in certain circumstances, just not this one? e.g. Is '"' a valid string there? My suspicion is that the current schema just regex checks that field, and that it's hard to write a regex to reject this which doesn't also have false-negatives.

You could add a second schema which you use to do this validation (e.g. hard reject " in expressions), and run check-jsonschema against that schema in addition to the built-in schema. You might be well served in writing such a schema by copying and pruning relevant content out of the current github-workflows schema. That, IMO, is the best workaround available today. No new code in check-jsonschema is needed, and you'll be able to reject double-quote chars. But it is a bit laborious.

I would also be open in theory to adding bespoke validation for GitHub expressions, but I currently don't have a lot of flex in my schedule to work on this. I'll continue to noodle on this and what appropriate fixes are possible.

awaelchli commented 2 years ago

I would also be open in theory to adding bespoke validation for GitHub expressions, but I currently don't have a lot of flex in my schedule to work on this. I'll continue to noodle on this and what appropriate fixes are possible.

Thank you for the reply. Completely understandable! I will forward your suggestions to my colleagues who can decide what to do. Afaik, this would be a nice-to-have for us, but not a priority item. We just wanted to raise this idea in case others can benefit from it 😃