yannh / kubeconform

A FAST Kubernetes manifests validator, with support for Custom Resources!
Apache License 2.0

kind `CustomResourceDefinition` is not validated against schema #100

Closed. eyarz closed this issue 2 years ago

eyarz commented 2 years ago

I took this YAML example for creating a CRD from the official K8s doc:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: crontabs.stable.example.com
spec:
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                cronSpec:
                  type: string
                image:
                  type: string
                replicas:
                  type: integer
  scope: Namespaced
  names:
    plural: crontabs
    singular: crontab
    kind: CronTab
    shortNames:
    - ct

When validating this YAML with kubeconform I'm getting the following error:

...failed validation: could not find schema for CustomResourceDefinition

Although the schema exists on the kubernetes-json-schema repo: https://github.com/yannh/kubernetes-json-schema/blob/master/v1.18.0/customresourcedefinition-apiextensions-v1.json

Copying the schema from the repo and passing it with the -schema-location flag works fine, so I guess the issue is when trying to parse the kind type...

wilmardo commented 2 years ago

I seem to be hitting this exact same issue. Running the latest 0.4.13

vDMG commented 2 years ago

This works for me using the latest v0.4.13:

kubeconform -verbose -strict -summary -schema-location default -schema-location "https://raw.githubusercontent.com/yannh/kubernetes-json-schema/master/master/customresourcedefinition.json"

But you're right: for Kubernetes 1.20.0 there is no customresourcedefinition.json in https://github.com/yannh/kubernetes-json-schema. Maybe this issue should be moved to that repo, since it is not directly related to kubeconform.

eyarz commented 2 years ago

@vDMG I don't think this is the issue because customresourcedefinition.json exists on version 1.20.0: https://github.com/yannh/kubernetes-json-schema/blob/master/v1.20.0/customresourcedefinition.json

wilmardo commented 2 years ago

Hmm I am seeing this even without specifying a version, for example:

# kubeconform -v
v0.4.13
# wget https://github.com/fluxcd/source-controller/releases/download/v0.22.5/source-controller.crds.yaml
# kubeconform source-controller.crds.yaml
source-controller.crds.yaml - CustomResourceDefinition helmrepositories.source.toolkit.fluxcd.io failed validation: could not find schema for CustomResourceDefinition
source-controller.crds.yaml - CustomResourceDefinition buckets.source.toolkit.fluxcd.io failed validation: could not find schema for CustomResourceDefinition
source-controller.crds.yaml - CustomResourceDefinition gitrepositories.source.toolkit.fluxcd.io failed validation: could not find schema for CustomResourceDefinition
source-controller.crds.yaml - CustomResourceDefinition helmcharts.source.toolkit.fluxcd.io failed validation: could not find schema for CustomResourceDefinition

When I try it with the schema specified from master like @vDMG it works:

# kubeconform -schema-location default -schema-location "https://raw.githubusercontent.com/yannh/kubernetes-json-schema/master/master/customresourcedefinition.json" source-controller.crds.yaml
source-controller.crds.yaml - CustomResourceDefinition gitrepositories.source.toolkit.fluxcd.io is invalid: For field metadata.creationTimestamp: Invalid type. Expected: string, given: null
source-controller.crds.yaml - CustomResourceDefinition buckets.source.toolkit.fluxcd.io is invalid: For field metadata.creationTimestamp: Invalid type. Expected: string, given: null
source-controller.crds.yaml - CustomResourceDefinition helmrepositories.source.toolkit.fluxcd.io is invalid: For field metadata.creationTimestamp: Invalid type. Expected: string, given: null
source-controller.crds.yaml - CustomResourceDefinition helmcharts.source.toolkit.fluxcd.io is invalid: For field metadata.creationTimestamp: Invalid type. Expected: string, given: null

eyarz commented 2 years ago

@wilmardo the version is not related. I know it is working if I set the -schema-location flag (I even mentioned it in my issue). The bug here is that it should work without the flag because the CRD is part of K8s native schemas.

royhadad commented 2 years ago

Hi @yannh, I'm currently working on this issue, could you please assign me?

I found the reason it's missing - it's because versions with the "-standalone" suffix are missing the definition for customResourceDefinition in https://github.com/yannh/kubernetes-json-schema

Also, I notice that kubeconform always reaches out to the "-standalone" version. Could you please point out the difference between 1.18.0, 1.18.0-local, 1.18.0-standalone? When does kubeconform ever reach out for the 1.18.0/1.18.0-local versions?

eyarz commented 2 years ago

@royhadad here is the explanation about the diff between standalone and local.

I found this bug from digging around and based on your comment, I guess my bug is the same problem...

yannh commented 2 years ago

@royhadad frankly it's a great question ;) The kubernetes-json-schema repo - I mostly automated what was upstream and kept the same repository format. Maybe we could keep only the standalone folders? The repo is growing pretty fast. Maybe originally that repo was not meant just for kubeval...

royhadad commented 2 years ago

@yannh Assuming that only the "-standalone" and "-standalone-strict" files are used, removing the other types is definitely a possibility. But I had an idea - in case the file is not found in the standalone, search for it in the non-standalone as a fallback.

In any case, why is it then that the standalone files are missing customResourceDefinition? Is it a bug in the automated process?

How would you go about fixing this issue?

yannh commented 2 years ago

@royhadad not sure what the benefit would be vs adding the CustomResourceDefinition to the standalone folders. Yes, I am assuming it is a bug in the automated process. I guess I would try to see why https://github.com/yannh/kubernetes-json-schema/blob/a718ad35ec16742bb17e124de1ea40f8b2510ff1/build.sh#L30 doesn't generate the appropriate file :)

royhadad commented 2 years ago

@yannh I've done some digging and found this piece of code: https://github.com/yannh/openapi2jsonschema/blob/09bbcef0ed5f0f70ee033834637e10e7035b6787/openapi2jsonschema/command.py#L162

Basically, openapi2jsonschema (which is used by the build.sh script) doesn't support CustomResourceDefinition (along with some other kinds) when generating the "-standalone" variants.

So the fix should be, if possible, to add support for those kinds in openapi2jsonschema. It seems this code was written 4 years ago. Do you happen to know what the problem was with supporting those kinds? I see it has something to do with the fact that they include JSON schemas themselves.

royhadad commented 2 years ago

Update: I have found an issue in the original openapi2jsonschema repo: https://github.com/instrumenta/openapi2jsonschema/pull/14. It seems that because CustomResourceDefinition contains recursive refs, it is intentionally skipped for the "-standalone" and "-standalone-strict" variants.
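To illustrate why recursive refs are a problem for a standalone schema (one where every `$ref` has been inlined), here is a minimal Python sketch. It is not the code from openapi2jsonschema: `inline_refs`, `RecursiveRefError`, and the tiny `definitions` table are hypothetical stand-ins, and the self-referencing `JSONSchemaProps` entry only mimics the shape of the real definition.

```python
class RecursiveRefError(Exception):
    """Raised when a definition refers back to itself while being inlined."""


def inline_refs(node, definitions, in_progress=()):
    """Return a copy of `node` with every {"$ref": ...} replaced by its target.

    `in_progress` holds the names of the definitions currently being expanded,
    so a ref that points back into that chain is detected instead of looping forever.
    """
    if isinstance(node, dict):
        if "$ref" in node:
            name = node["$ref"].rsplit("/", 1)[-1]
            if name in in_progress:
                raise RecursiveRefError(name)
            return inline_refs(definitions[name], definitions, in_progress + (name,))
        return {key: inline_refs(value, definitions, in_progress)
                for key, value in node.items()}
    if isinstance(node, list):
        return [inline_refs(item, definitions, in_progress) for item in node]
    return node


# Hypothetical, heavily simplified definitions table.
definitions = {
    "ObjectMeta": {"type": "object", "properties": {"name": {"type": "string"}}},
    # A schema that describes schemas: its properties can hold more of itself.
    "JSONSchemaProps": {
        "type": "object",
        "properties": {
            "type": {"type": "string"},
            "properties": {
                "type": "object",
                "additionalProperties": {"$ref": "#/definitions/JSONSchemaProps"},
            },
        },
    },
}

# A non-recursive definition inlines without trouble.
flat = inline_refs({"$ref": "#/definitions/ObjectMeta"}, definitions)

# The self-referencing one cannot be fully inlined.
try:
    inline_refs({"$ref": "#/definitions/JSONSchemaProps"}, definitions)
    recursive_detected = False
except RecursiveRefError:
    recursive_detected = True
```

A generator that inlines refs has to either skip such kinds, as described above, or keep the recursive part as a ref.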

A workaround for now is to use the -schema-location flag twice: once with the default (standalone) location, and again with a location that has neither the -standalone suffix nor the strict suffix, as a fallback.

kubeconform -schema-location "default" -schema-location "https://raw.githubusercontent.com/yannh/kubernetes-json-schema/master/{{ .NormalizedKubernetesVersion }}/{{ .ResourceKind }}{{ .KindSuffix }}.json" ./customResourceDefinition.yaml
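The lookup order behind this workaround can be sketched in Python. This only illustrates how an ordered list of location templates would expand for the CRD manifest; it is not kubeconform's code. `render`, `kind_suffix`, and `candidate_urls` are hypothetical helpers, the shape of the default template is an assumption (it is not quoted in this thread), and the rule for deriving `KindSuffix` is inferred from the file name `customresourcedefinition-apiextensions-v1.json` linked in the first comment.

```python
import re

BASE = "https://raw.githubusercontent.com/yannh/kubernetes-json-schema/master/"

# ASSUMPTION: shape of the default (standalone) location; not quoted in this thread.
DEFAULT_LOCATION = (BASE + "{{ .NormalizedKubernetesVersion }}-standalone{{ .StrictSuffix }}/"
                    "{{ .ResourceKind }}{{ .KindSuffix }}.json")
# Fallback template copied from the workaround command.
FALLBACK_LOCATION = (BASE + "{{ .NormalizedKubernetesVersion }}/"
                     "{{ .ResourceKind }}{{ .KindSuffix }}.json")


def render(template, values):
    """Replace each '{{ .Name }}' placeholder with values['Name']."""
    return re.sub(r"\{\{\s*\.(\w+)\s*\}\}", lambda m: values[m.group(1)], template)


def kind_suffix(api_version):
    """ASSUMPTION: 'apiextensions.k8s.io/v1' -> '-apiextensions-v1', 'v1' -> '-v1'."""
    group, _, version = api_version.rpartition("/")
    if not group:
        return "-" + version
    return "-" + group.split(".")[0] + "-" + version


def candidate_urls(kind, api_version, k8s_version="master", strict=False):
    """Return the URLs to try, in order: default location first, fallback second."""
    values = {
        "NormalizedKubernetesVersion": k8s_version,
        "StrictSuffix": "-strict" if strict else "",
        "ResourceKind": kind.lower(),  # ASSUMPTION: the kind is lowercased
        "KindSuffix": kind_suffix(api_version),
    }
    return [render(t, values) for t in (DEFAULT_LOCATION, FALLBACK_LOCATION)]


urls = candidate_urls("CustomResourceDefinition", "apiextensions.k8s.io/v1", "v1.18.0")
```

Under these assumptions the first URL points into the `v1.18.0-standalone` folder, where the file is missing, and the second points into `v1.18.0`, where it exists, so the second location acts as the fallback.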

We'll implement this fix on our end for datree. Implementing a similar fallback natively in this repo might cause some unexpected behavior, so that's up to you @yannh.

@eyarz Hope this answers your question! :)

hbouaziz commented 2 years ago

Hi all, @yannh @royhadad I'm facing the same problem when extracting the kind of a CRD. This is the script I used to generate the CRD schemas:

#!/usr/bin/env bash

# Cleaning the Repos

rm -rf input
rm -rf schemas

# Create Input Folder for CRDs

mkdir input

# Download CRDs

# Read only the first column (the CRD name); "_" discards the remaining columns
while read -r crd _
do
    echo "Creating CRD: $crd"
    ResourceKind=${crd%%.*}
    echo "exporting item '${ResourceKind}'"
    kubectl get crds "${crd}" -o yaml > "./input/${ResourceKind}.yaml" 2>&1
done < <(kubectl get crds 2>&1 | tail -n +2)

# Creating the Appropriate Schemas for CRDs

wget https://raw.githubusercontent.com/yannh/kubeconform/master/scripts/openapi2jsonschema.py

mkdir schemas
cd schemas
python3 ../openapi2jsonschema.py ../input/*.yaml
rm ../openapi2jsonschema.py

kubeconform  -summary -output json -ignore-missing-schemas -schema-location default -schema-location './schemas/{{ .ResourceKind }}_{{ .ResourceAPIVersion }}.json' ../input/prometheuses.yaml

OUTPUT: (screenshot not preserved)

When I specify the exact schema location path for the CRD (e.g. -schema-location 'schemas/prometheus_v1.json' ./input/prometheuses.yaml), it works fine.

However, when I use the templated path, the resource is skipped.

(screenshot not preserved)

Another problem: when I have multiple versions of one CRD, I hit the same issue when ResourceAPIVersion is parsed.

I tried to find out how the parsing is done in the code but could not work it out. Could somebody please help here?

royhadad commented 2 years ago

@hbouaziz

I can't see how your problem relates to the original issue. This issue is about the kind customResourceDefinition, while it seems you are having trouble with the custom resource definitions themselves.

hbouaziz commented 2 years ago

Thanks @royhadad for your reply! Indeed, it has no relation to the original issue, sorry for that. If you can share a similar solution or some way to solve it, please do. Thanks.

hbouaziz commented 2 years ago

No @eyarz, it's not an extra prefix. In fact, this is what I should have as the value of the .ResourceKind variable. It was just for testing.

However, if I change the kind value to prometheus it works correctly.

eyarz commented 2 years ago

@hbouaziz this is why I deleted my answer, and you already replied before I had a chance to fix it 😅 the issue was that you had a discrepancy in the file name: you create the files as {{ .ResourceKind }}_{{ .ResourceAPIVersion }}.json but in your command screenshot you look for the file pattern prometheuses_{{ .ResourceAPIVersion }}.json

hbouaziz commented 2 years ago

Hi again @eyarz. Yes, I am creating files with the {{ .ResourceKind }}_{{ .ResourceAPIVersion }}.json pattern. Here are some of my output files:

clusterissuer_v1beta1.json
clusterissuer_v1.json
kafkauser_v1alpha1.json
kafkauser_v1beta1.json
...

Now the problem: when I try to validate using the -schema-location flag with the same pattern, schemas/{{ .ResourceKind }}_{{ .ResourceAPIVersion }}.json, {{ .ResourceKind }} doesn't resolve to the kind I expect (kafkauser, for example). Instead, I get the value CustomResourceDefinition. This is why, with prometheuses, I used the pattern prometheuses_{{ .ResourceAPIVersion }}.json, so that I can validate my YAML with the correct JSON schema (and I know it's not the right way).
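One possible reading of this behaviour, sketched in Python: the file being validated is the exported CRD itself, so its top-level kind is CustomResourceDefinition, while the kind it defines (spec.names.kind) only appears as the top-level kind of instances of that CRD. The sketch is hypothetical and is not kubeconform's code: `render` and `template_values` are made-up helpers, the manifests are invented examples, and lowercasing the kind and using only the version part of apiVersion are assumptions.

```python
import re


def render(template, values):
    """Replace each '{{ .Name }}' placeholder with values['Name']."""
    return re.sub(r"\{\{\s*\.(\w+)\s*\}\}", lambda m: values[m.group(1)], template)


def template_values(manifest):
    """Derive template variables from the manifest's own top-level fields.

    ASSUMPTION: the kind is lowercased and ResourceAPIVersion is the part
    of apiVersion after the last '/'.
    """
    return {
        "ResourceKind": manifest["kind"].lower(),
        "ResourceAPIVersion": manifest["apiVersion"].split("/")[-1],
    }


PATTERN = "schemas/{{ .ResourceKind }}_{{ .ResourceAPIVersion }}.json"

# Hypothetical exported CRD (what `kubectl get crds <name> -o yaml` returns).
crd_manifest = {
    "apiVersion": "apiextensions.k8s.io/v1",
    "kind": "CustomResourceDefinition",
    "spec": {"group": "kafka.example.com", "names": {"kind": "KafkaUser"}},
}

# Hypothetical instance of the resource that the CRD defines.
instance_manifest = {
    "apiVersion": "kafka.example.com/v1beta1",
    "kind": "KafkaUser",
}

crd_schema_path = render(PATTERN, template_values(crd_manifest))
instance_schema_path = render(PATTERN, template_values(instance_manifest))
```

Under these assumptions the generated per-kind schemas (for example kafkauser_v1beta1.json) would match manifests of kind KafkaUser, not the CRD files in the input folder.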

eyarz commented 2 years ago

Weird, I'm not able to reproduce :/ Are you running the latest version of kubeconform?

@hbouaziz, I think it would be better if you opened a new issue with all the relevant details so we stop "spamming" this issue...

hbouaziz commented 2 years ago

Hi @eyarz I created a new issue on https://github.com/yannh/kubeconform/issues/110