fluxcd / pkg

Toolkit common packages
https://pkg.go.dev/github.com/fluxcd/pkg
Apache License 2.0
45 stars 84 forks source link

crdjsonschema action now creates a `all.json` schema to validate any Flux2 resource via a single schema #744

Closed kevinvalk closed 3 months ago

kevinvalk commented 4 months ago

This enables JSON schema extensions to validate Flux2 resources by pointing to the all.json file. Background: https://github.com/fluxcd-community/flux2-schemas/issues/23

I dropped some old Python2 compatibility code (iteritems) and I am using dataclasses (minimum Python version 3.7). I think this is not a problem because it is being run through the Dockerfile which is using python:3-alpine anyways.

Example usage in VSCode

When using the YAML extension you could use this new all.json schema like so (if this PR is merged and the flux2-schemas repo is updated):

.vscode/settings.json

  "yaml.schemas": {
    "https://raw.githubusercontent.com/fluxcd-community/flux2-schemas/main/all.json": "*.yaml"
  },

This would net you nice validation in (for example) VSCode for Flux2 resources: image

Schema validations are based on glob patterns and in the case of YAML for VSCode will not fallback when providing multiple root schemas. This means that in a typical GitOps repository (in which you mix Flux and Kubernetes resources) you either validate Flux resources OR Kubernetes resources. This sucks, but it can be fixed by stitching together Kubernetes schemas and flux schemas like so.

.vscode/kubernetes.json

{
  "oneOf": [
    { "$ref": "https://raw.githubusercontent.com/fluxcd-community/flux2-schemas/main/all.json" },
    {
      "$ref": "https://raw.githubusercontent.com/yannh/kubernetes-json-schema/master/master-standalone-strict/all.json"
    }
  ]
}

I experimented with including this kubernetes.json schema file as well, but that does not feel right and it probably makes more sense to leave this configuration step up to an end user. Especially because different JSON schema validation plugins will handle fallbacks etc differently.

.vscode/settings.json

  "yaml.schemas": {
    "kubernetes": "",
    ".vscode/kubernetes.json": "*.yaml"
  },

Setting "kubernetes" to empty string is required as the YAML plugin for VSCode always tries to match Kubernetes schema and if you do not do this you will get "Matches multiple schemas when only one must validate" errors.

stefanprodan commented 3 months ago

The main use case for the schemas is to enable validation in CI. Would this break kubeconform and the script used by almost all Flux users? https://github.com/fluxcd/flux2-kustomize-helm-example/blob/main/scripts/validate.sh

Can you please test the resulting JSONs with the script above and post the result here?

kevinvalk commented 3 months ago

TL;DR; No impact on validate.sh and this can be merged "as is".

The main use case for the schemas is to enable validation in CI. Would this break kubeconform and the script used by almost all Flux users? https://github.com/fluxcd/flux2-kustomize-helm-example/blob/main/scripts/validate.sh

Can you please test the resulting JSONs with the script above and post the result here?

Just verified (and also tested locally) but we are totally good here!

The presence of all.json and _definitions.json do not impact the validate.sh script at all (and all other files are untouched by this PR). Moreover, according https://github.com/yannh/kubeconform?tab=readme-ov-file#overriding-schemas-location if a directory is provided as -schema-location it expects to have a directory structure like https://github.com/yannh/kubernetes-json-schema. Fun fact, https://github.com/yannh/kubernetes-json-schema also uses the all.json and _definitions.json "system". So we are double good :)

Finally, I was just working on actually using this during development (linting at development time) and it really does NOT make any sense to include something like below. Especially, because you might have other CRDs that you want to validate in the future. So best thing is to document this somewhere (no clue where...) and let end users use this however they want.

{
  "oneOf": [
    { "$ref": "https://raw.githubusercontent.com/fluxcd-community/flux2-schemas/main/all.json" },
    {
      "$ref": "https://raw.githubusercontent.com/yannh/kubernetes-json-schema/master/master-standalone-strict/all.json"
    }
  ]
}

For completeness, I tested the impact of this PR on validate.sh by commenting out line 40 (curl ... | tar ...) and replace the directory of /tmp/flux-crd-schemas/master-standalone-strict with the all files straight from a fresh run of this script. And ran ./scripts/validate.sh; echo "Are we good: $?" from fresh clone of kustomize-helm-example.

Double checked if things are really working as expected by modifying HelmRelease schema kind to type number to see if it would fail. When "breaking" helmrelease-helm-v2beta2.json it indeed fails. Making the same change to _definitions.json has no impact. So this really shows that all.json is ignored.

stefanprodan commented 3 months ago

@kevinvalk please rebase with upstream main and force push, this PR should have a single commit. Thanks

kevinvalk commented 3 months ago

@kevinvalk please rebase with upstream main and force push, this PR should have a single commit. Thanks

Done (after a bit of a mess...)

stefanprodan commented 3 months ago

@kevinvalk I think the oneOf snippet should go on the flux2-schemas readme, we should also tell people in that readme what's the role of that repo.