Content-aware schema detection

valorl commented 2 years ago

Is your enhancement related to a problem? Please describe.

The current way of mapping schemas to files is too static for some use cases. With Kubernetes manifests, for example, it is neither common nor practical to follow a file-naming convention that includes the apiVersion/kind (the fields that determine the schema), which makes YAML LS hard to use in practice.

Describe the solution you would like

Instead of just a file glob, I'd like to be able to involve the contents of files in the process of determining a schema. A mapping could be defined using a basic expression, e.g.:

"yaml.schemas": {
    ".apiVersion == argoproj.io/v1alpha1": "https://raw.githubusercontent.com/argoproj/argo-workflows/master/api/jsonschema/schema.json",
    ".apiVersion == apps/v1 && .kind == Deployment": "https://kubernetesjsonschema.dev/v1.14.0/deployment-apps-v1.json"
}

(The syntax doesn't really matter as long as it's possible to determine a schema based on YAML content)
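To make the idea concrete, here is a minimal sketch of how such expressions could be evaluated. It is an illustration only, not how yaml-language-server works: the names `topLevelScalars`, `matches`, and `resolveSchema` are hypothetical, only `==` and `&&` are supported, and top-level fields are read with a regex where a real implementation would use a proper YAML parser.

```typescript
// Hypothetical sketch: none of these functions exist in yaml-language-server.

// Collect top-level `key: value` scalars from the first YAML document.
// Assumption: a regex is good enough here; nested keys, inline comments,
// and multi-line values are ignored.
function topLevelScalars(yamlText: string): Record<string, string> {
  const fields: Record<string, string> = {};
  for (const line of yamlText.split(/\r?\n/)) {
    if (line.trim() === "---") {
      // Stop at the separator that ends the first document.
      if (Object.keys(fields).length > 0) break;
      continue;
    }
    const m = /^([A-Za-z_][\w-]*):[ \t]+(.+?)\s*$/.exec(line);
    if (!m) continue;
    const [, key, raw] = m;
    if (key === undefined || raw === undefined) continue;
    // Strip one pair of surrounding quotes, if present.
    fields[key] = raw.replace(/^["']|["']$/g, "");
  }
  return fields;
}

// Evaluate an expression such as `.apiVersion == apps/v1 && .kind == Deployment`.
// Every `&&` clause must match; unsupported syntax never matches.
function matches(expr: string, fields: Record<string, string>): boolean {
  return expr.split("&&").every((clause) => {
    const m = /^\s*\.([\w-]+)\s*==\s*(\S+)\s*$/.exec(clause);
    if (!m) return false;
    const [, key, expected] = m;
    return key !== undefined && expected !== undefined && fields[key] === expected;
  });
}

// Return the schema URL of the first mapping whose expression matches.
function resolveSchema(
  yamlText: string,
  mappings: Record<string, string>
): string | undefined {
  const fields = topLevelScalars(yamlText);
  for (const expr of Object.keys(mappings)) {
    if (matches(expr, fields)) return mappings[expr];
  }
  return undefined;
}

// Usage with the two mappings proposed above.
const mappings: Record<string, string> = {
  ".apiVersion == argoproj.io/v1alpha1":
    "https://raw.githubusercontent.com/argoproj/argo-workflows/master/api/jsonschema/schema.json",
  ".apiVersion == apps/v1 && .kind == Deployment":
    "https://kubernetesjsonschema.dev/v1.14.0/deployment-apps-v1.json",
};

const manifest = [
  "apiVersion: apps/v1",
  "kind: Deployment",
  "metadata:",
  "  name: web",
].join("\n");

const schema = resolveSchema(manifest, mappings);
console.log(schema);
```

Returning the first match keeps the sketch simple; a real design would need to decide how overlapping expressions and multi-document files are handled.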

I wonder what the maintainers think. Does this sound like a feature that would be accepted, or is it too much complexity? Are there any other ways of achieving this?

Describe alternatives you have considered

Adding comments with schema links to each file. This is quite impractical to maintain in repos with a lot of similar files. Adding the comments could potentially be automated based on a mapping like above for a "quick fix" solution.
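As a rough sketch of that "quick fix", the function below adds (or updates) the inline schema comment at the top of a file. The `# yaml-language-server: $schema=<url>` modeline is the format YAML LS already understands; the function name `withSchemaModeline` is hypothetical, and choosing which URL to pass for a given file is left to whatever mapping logic is used.

```typescript
// Hypothetical helper, shown only to illustrate the automation idea.
const MODELINE_PREFIX = "# yaml-language-server: $schema=";

// Return the YAML text with a schema modeline pointing at `schemaUrl`.
// An existing modeline is replaced in place; otherwise one is prepended.
function withSchemaModeline(yamlText: string, schemaUrl: string): string {
  const modeline = MODELINE_PREFIX + schemaUrl;
  const lines = yamlText.split("\n");
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i];
    if (line !== undefined && line.indexOf(MODELINE_PREFIX) === 0) {
      lines[i] = modeline;
      return lines.join("\n");
    }
  }
  return modeline + "\n" + yamlText;
}

// Usage: annotate a Deployment manifest with the schema from the mapping above.
const deploymentSchemaUrl =
  "https://kubernetesjsonschema.dev/v1.14.0/deployment-apps-v1.json";
const annotated = withSchemaModeline(
  "apiVersion: apps/v1\nkind: Deployment\n",
  deploymentSchemaUrl
);
console.log(annotated);
```

Because an existing modeline is replaced instead of duplicated, the function can be re-run over a repo safely, but every file still ends up carrying the comment.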

Additional context

apupier commented 2 years ago

Please note that if the CRD is deployed on your Kubernetes cluster, you are connected to that cluster, and you are using the VS Code Kubernetes and VS Code YAML extensions, the schema is already resolved.

Having a way to define them via preferences, as you mentioned, or via an API would be nice too: it would allow completion in offline mode. An API would also allow other VS Code extensions to provide CRD schemas in non-connected mode.

For Kubernetes files, providing the schema through modeline comments is indeed not very convenient, since the version is already defined in a dedicated attribute. Modelines are a way to work around the lack of schema support in the YAML specification.

valorl commented 2 years ago

Wasn't aware of this capability of the extension as I don't use VS Code. It's better than nothing of course, I'll try that out, but I definitely agree that offline support would be nice.

If it's configurable offline and purely via a language server, then it makes this capability more accessible to the majority of present and future editors. And I imagine Kubernetes access from laptops/workstations where you edit files isn't always an option.

Also, I'd just like to point out that the Kubernetes ecosystem is just an example. This could be useful for plenty of other use cases, like internal configs, application-specific configs, etc. These are more likely to follow a specific naming convention and can therefore be handled by the file-based mapping, but even in those cases it's not uncommon to have schema variations that depend only on the contents. Just think of versioning, for example.

ehakan commented 1 year ago

Any updates on this issue?

My 2 cents on why this is really needed.

While it is a Kubernetes-specific example, IntelliJ's Kubernetes extension is able to inspect the contents of a YAML file to determine whether it's a Kubernetes manifest. You can then give it CRDs or arbitrary OpenAPI v3 specs to get autocompletion for whatever spec you need, without ever needing to access the cluster. (Not that you can't do that too.)

I'm sure VSCode's ability to autocomplete by connecting to the cluster itself is useful, but not having the option to do it offline or without VSCode is really inconvenient.

Sure, inline comments for schemas exist, but they have to be put on every single document. That's prone to human error: nothing stops someone from putting the v1alpha1 schema on a v1 resource. It also pollutes the entire repo and looks ugly, though most of the time that's tolerable.

With GitOps becoming more prevalent, we don't even give some roles cluster credentials; they just edit manifests in a git repo. In some scenarios we don't even have a cluster yet: merging new manifests to the main branch of the GitOps repo is what creates the cluster, without anyone ever needing to touch kubectl.

TL;DR

Folks using IntelliJ can just register CRDs or connect to a cluster.
VSCode users need to connect to a cluster, which may or may not have the CRDs installed, or even exist.
The rest (neovim etc.) has to manually put inline schema comments on every single document.

msvechla commented 6 months ago

I have created the following PR, which addresses this issue by automatically detecting the Kubernetes schema in use and downloading it from the CRD Catalog: https://github.com/redhat-developer/yaml-language-server/pull/962

I'm using this in my neovim setup locally and it works great!
