python-jsonschema / check-jsonschema

A CLI and set of pre-commit hooks for jsonschema validation with built-in support for GitHub Workflows, Renovate, Azure Pipelines, and more!
https://check-jsonschema.readthedocs.io/en/stable
Other
207 stars 40 forks source link

GitLab CI yaml vaiarbles failing #304

Closed KindDragon closed 1 year ago

KindDragon commented 1 year ago

My .gitlab-ci.yml start failing after I add variables to my job:

 variables:
  GLOBAL_VAR: "A global variable"

job1:
  variables:
    JOB_VAR: "A job variable"
    JOB_VAR2: "A job variable"
  script:
    - echo "Variables are '$GLOBAL_VAR' and '$JOB_VAR'"

job2:
  script:
    - echo "Variables are '$GLOBAL_VAR' and '$JOB_VAR'"

https://docs.gitlab.com/ee/ci/variables/

sirosen commented 1 year ago

What error are you receiving? Your file, as pasted here, isn't valid YAML, so it fails to parse (leading space on the first line). When I fix that and run with the gitlab CI schema, the file passes.

sirosen commented 1 year ago

Absent any further information, I'm closing this as user-error. I don't think there's a bug here -- at least, there isn't one that I'm able to detect. If there's more info that clarifies that this is a problem with check-jsonschema or the bundled schema, please feel free to reopen!

KindDragon commented 1 year ago

Sorry, misdiagnosed it. Two variables blocks cause it, but this works in gitlab ci.

unit-test-job:   # This job runs in the test stage.
  variables:
    GIT_SUBMODULE_STRATEGY: none

  stage: test    # It only starts when the job in the build stage completes successfully.

  variables:
    JOB_VAR2: "A job variable"
sirosen commented 1 year ago

That's only very, very marginally valid YAML -- I would simply say that it's surprising that gitlab's parser supports that. The correct resolution is to write your workflow such that you do not have duplicate keys.

YAML is versioned (a huge can of worms in its own right), but all three of the spec versions (1.0, 1.1, 1.2) have the same language requiring key uniqueness:

The content of a mapping node is an unordered set of key/value node pairs, with the restriction that each of the keys is unique.

- YAML 1.2 definition of node types - YAML 1.1 definition of node types - YAML 1.0 definition of node types

It's not clear what your document means, which is why the YAML parser correctly treats it as an error when parsed.

KindDragon commented 1 year ago

Hmm, ok but it isn't clear from pre-commit output what happened. Maybe at least it possible to improve error message to show line with error

sirosen commented 1 year ago

I think what you're looking for is covered by the -v/--verbose flag.

Given an invalid file of this form:

$ cat frob.yaml 
a: b
a: c

Here's the quiet output:

$ check-jsonschema --builtin-schema vendor.gitlab-ci ./frob.yaml 
Several files failed to parse.
  Failed to parse frob.yaml

and here's the verbose:

$ check-jsonschema --builtin-schema vendor.gitlab-ci ./frob.yaml -v
Several files failed to parse.
  FailedFileLoadError: Failed to parse frob.yaml
    in "/home/sirosen/.local/pipx/venvs/check-jsonschema/lib/python3.11/site-packages/check_jsonschema/instance_loader.py", line 31
    >>> data: t.Any = self._parsers.parse_file(path, self._default_filetype)

    caused by

    DuplicateKeyError: while constructing a mapping
    in "<byte string>", line 1, column 1:
      a: b
      ^ (line: 1)
  found duplicate key "a" with value "c" (original value: "b")
    in "<byte string>", line 2, column 1:
      a: c
      ^ (line: 2)

  To suppress this check see:
      http://yaml.readthedocs.io/en/latest/api.html#duplicate-keys

      in "/home/sirosen/.local/pipx/venvs/check-jsonschema/lib/python3.11/site-packages/check_jsonschema/parsers/__init__.py", line 93
      >>> return loadfunc(data)

So the underlying error is pretty clear (although it contains an indented message, which would be nice to reindent correctly).

-v gets more detail than most people want most of the time; it's only sometimes what users will want full detail. And while I could add special casing for parsing errors vs other errors to make them more verbose, every special case carries a maintenance cost.

Balancing these sorts of considerations is more art and guesswork than it is science. My strong inclination until or unless proven otherwise is to continue to default the output to be summary and simple, and let users opt-in to detailed output with -v.

KindDragon commented 1 year ago

Oh, okay. Thanks for that!

It just wasn't obvious to me that "Failed to parse" could mean an error with the YAML structure or with matching the GitLab schema.