python-jsonschema / check-jsonschema

A CLI and set of pre-commit hooks for jsonschema validation with built-in support for GitHub Workflows, Renovate, Azure Pipelines, and more!
https://check-jsonschema.readthedocs.io/en/stable
Other
211 stars 40 forks source link

Gitlab ci `!reference` in cache is not validating correctly anymore #498

Open balihb opened 1 week ago

balihb commented 1 week ago

Since the new update , for this code:

  cache:
    - paths:
        - ${PIP_CACHE_DIR}
      key:
        prefix: "pip-${PYTHON_VERSION}-${CI_JOB_NAME}-${TOX_VERSION}"
        files:
          - requirements.txt
    - !reference [ .apt:cache, cache ]

I'm getting this error:

  .gitlab-ci.yml::$.test.cache: [{'paths': ['${PIP_CACHE_DIR}'], 'key': {'prefix': 'pip-${PYTHON_VERSION}-${CI_JOB_NAME}-${TOX_VERSION}', 'files': ['requirements.txt']}}, ['.apt:cache', 'cache']] is not valid under any of the given schemas
  Underlying errors caused this.

  Best Match:
    $.test.cache: [{'paths': ['${PIP_CACHE_DIR}'], 'key': {'prefix': 'pip-${PYTHON_VERSION}-${CI_JOB_NAME}-${TOX_VERSION}', 'files': ['requirements.txt']}}, ['.apt:cache', 'cache']] is not of type 'object'
  Best Deep Match:
    $.test.cache[1]: ['.apt:cache', 'cache'] is not of type 'object'
sirosen commented 1 week ago

The underlying issue here is almost certainly that !reference is not supported by check-jsonschema. So the sample you gave is treated exactly the same as

  cache:
    - paths:
        - ${PIP_CACHE_DIR}
      key:
        prefix: "pip-${PYTHON_VERSION}-${CI_JOB_NAME}-${TOX_VERSION}"
        files:
          - requirements.txt
    - [ .apt:cache, cache ]

In the general case, I don't think !reference can be supported. In the narrow case of loading from a local path, we can add support, so I'm currently thinking of this as a feature request to that effect. See also #274

If you agree that this is the issue, I'd like to close as a duplicate of #274. If you don't agree, I think I'll need more information.

balihb commented 1 week ago

just treating it to match any type would be enough.

On Tue, Oct 15, 2024, 17:35 Stephen Rosen @.***> wrote:

The underlying issue here is almost certainly that !reference is not supported by check-jsonschema. So the sample you gave is treated exactly the same as

cache:

  • paths:
    • ${PIP_CACHE_DIR} key: prefix: "pip-${PYTHON_VERSION}-${CI_JOB_NAME}-${TOX_VERSION}" files:
      • requirements.txt
  • [ .apt:cache, cache ]

In the general case, I don't think !reference can be supported. In the narrow case of loading from a local path, we can add support, so I'm currently thinking of this as a feature request to that effect. See also

274 https://github.com/python-jsonschema/check-jsonschema/issues/274

If you agree that this is the issue, I'd like to close as a duplicate of

274 https://github.com/python-jsonschema/check-jsonschema/issues/274.

If you don't agree, I think I'll need more information.

— Reply to this email directly, view it on GitHub https://github.com/python-jsonschema/check-jsonschema/issues/498#issuecomment-2414348296, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAV4ZGFQ6LMT5EPFKIQ5UUDZ3UY3RAVCNFSM6AAAAABP7MFXOCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDIMJUGM2DQMRZGY . You are receiving this because you authored the thread.Message ID: @.***>

sirosen commented 1 week ago

I'm not sure what exactly you're asking, but if you want !reference to act in some special way during validation, I don't see a good way to make it happen. (I can think of some bad ways. :joy: )

What we can do is handle YAML tags, like !reference, during parsing, and produce a document which better matches what the !reference syntax expresses.

Given

a: !some-tag b

It can be parsed as the equivalent of {"a": "b"} (if we ignore the tag), or we can define a function for !some-tag which will transform the input "b".

For !reference, the currently defined function is this bit of pseudocode:

reference(INPUT)
    check_type(INPUT, <array>)
    return INPUT

i.e. It doesn't do very much. We can make it do more, but whatever it outputs will be used as an input to the later validation phase, when the gitlab schema is evaluated against it.

There are all kinds of other manipulations which are valid to define, but they aren't all equally wise. Making !reference better match the GitLab implementation is almost certainly the best path forward.

balihb commented 6 days ago

I've raised a ticket for Gitlab about this too.

A !reference can be any type from here: https://datatracker.ietf.org/doc/html/draft-bhutton-json-schema-00#section-4.2.1

Resolving the reference is unfeasible since in a lot of cases it require authentication to access the source. For that there is an API to handle it and a specific tool.

I don't know if there is a way to generate something that matches any schema. That would probably be the best.

If not, if the schema is available during parsing, generating an element, that matches the required schema could work.

I don't have any good solutions either.

sirosen commented 5 days ago

It sounds like you're confirming that you would like check-jsonschema to allow an unknown !reference to pass without knowing its content.

The question of whether that's done with a special type, or by interpretation of the schema, is really a question of implementation. I can more or less rule out looking at the schema to generate a value -- such things are possible but I can't imagine that I would want to add it. For a sufficiently complex schema, it's more or less impossible to guarantee that the generated object would pass. i.e. That's a bad implementation path. Hooking the validators to skip on a special object is also a bit difficult, but more feasible. I think we would implement a custom validator class for GitLab's schema, and then the pre-commit hook would pass that in. It keeps the data transformation separate from validation.

Implementation of !reference loading for local files is fully within scope for this tool. It's part of what I would expect a resolution to #274 to implement. Perhaps, based on this request, a scoped-down implementation is worth pursuing. It shouldn't be too hard to add.


I'm not decided on which implementation paths to pursue right now. I'll have to give it some thought. Both local-file loading and a custom validator seem worthwhile. They may have interesting interplay.