Open marians opened 1 year ago
Hi, I've been considering adapting https://github.com/norwoodj/helm-docs to produce schemas.
Basically, it parses the yaml and generates a table with docs. Descriptions are pulled from comments and types are inferred or also pulled from comments.
Regarding validation, I think we have that in place already: abs runs `ct lint` on the chart, which checks values against the schema.
@mcharriere Regarding generating schema. I think the approach of generating schema based on values.yaml is great for the beginning, but will get more and more complicated as the schema evolves. Having syntax in comments sounds like replicating JSON schema capabilities into YAML comments. But for what benefit? It may be simpler to
- create an initial scaffold with `helm schema-gen`
- add documentation to `values.schema.json` as property annotations (`title`, `description`).
- use `values.yaml` purely for defaults (not for documentation of all config options)
- evolve `values.schema.yaml` by adding properties, adding annotations (e.g. `examples`).
We should see what tooling we can get (or develop) to support this workflow.
@mcharriere Regarding validation: yes, ABS validates values against schema.
When working more actively on our schemas, we need something more sophisticated. We need to check schemas against our own rules (e.g. must provide `title` for each property, must provide `items` schema for array properties).
> But for what benefit?
I'd say that not having to maintain 2 files (3 if you add the README) is one. And the lazy me would add not dealing with JSON.
For me, having just 1 source of truth is desirable. With this I mean that we could do it the other way around, generate values.yaml and the readme from the schema.
Of course, the goal should be to not have to maintain the same info in several places. Documentation (README) can be generated from JSON schema, too. And `values.yaml` does not need descriptive comments if the README explains everything nicely.
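To make the "schema as single source of truth" idea concrete, here is a minimal sketch of deriving both a README table and a defaults structure from a JSON schema. The schema fragment and all property names in it are hypothetical, and the functions `doc_rows`, `render_markdown` and `defaults` are illustrative helpers, not part of any existing tool.

```python
import json

# Hypothetical schema fragment; property names are illustrative only.
SCHEMA = json.loads("""
{
  "type": "object",
  "properties": {
    "image": {
      "type": "object",
      "title": "Image",
      "description": "Container image to deploy.",
      "properties": {
        "tag": {"type": "string", "title": "Tag",
                "description": "Image tag.", "default": "1.0.0"}
      }
    },
    "replicas": {"type": "integer", "title": "Replicas",
                 "description": "Number of pods.", "default": 2}
  }
}
""")


def doc_rows(schema, prefix=""):
    """Yield (path, type, default, description) per property, depth-first."""
    for name, sub in sorted(schema.get("properties", {}).items()):
        path = f"{prefix}.{name}" if prefix else name
        yield path, sub.get("type", ""), sub.get("default", ""), sub.get("description", "")
        yield from doc_rows(sub, path)


def render_markdown(schema):
    """Render the rows as a Markdown table suitable for a README."""
    lines = ["| Property | Type | Default | Description |", "|---|---|---|---|"]
    for path, typ, default, desc in doc_rows(schema):
        lines.append(f"| `{path}` | {typ} | {default} | {desc} |")
    return "\n".join(lines)


def defaults(schema):
    """Collect `default` annotations into a nested dict (the values.yaml content)."""
    out = {}
    for name, sub in schema.get("properties", {}).items():
        if "default" in sub:
            out[name] = sub["default"]
        elif sub.get("type") == "object":
            nested = defaults(sub)
            if nested:
                out[name] = nested
    return out
```

Calling `render_markdown(SCHEMA)` gives the documentation table, and `defaults(SCHEMA)` gives a structure that could be dumped as YAML, so neither file would need to be maintained by hand.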
Proposal towards re-usable schema:
giantswarm/schema as the repository to maintain schema in. See the README for details.
giantswarm/schema-server as the component which serves schema under nice URLs like https://schema.giantswarm.io/image/v0.0.1
https://schema.giantswarm.io/image/v0.0.1 as an example of a re-usable schema. Here is an example showing how such re-usable schema can be used in an app’s values.schema.json: https://github.com/giantswarm/dex-app/pull/236
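For readers unfamiliar with `$ref`, the following sketch shows the general mechanism of pulling a re-usable schema into an app schema. The content actually served at the URL is not reproduced here: the body in `REGISTRY` is a hypothetical placeholder, the property name `image` is an assumption, and `resolve_refs` is an illustrative helper that works on a local lookup table instead of fetching over the network.

```python
# Stand-in for the schema server. The schema body is a hypothetical
# placeholder, not the real content behind this URL.
REGISTRY = {
    "https://schema.giantswarm.io/image/v0.0.1": {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "tag": {"type": "string"},
        },
    }
}

# Hypothetical app schema that re-uses the shared definition.
APP_SCHEMA = {
    "type": "object",
    "properties": {
        "image": {"$ref": "https://schema.giantswarm.io/image/v0.0.1"},
    },
}


def resolve_refs(node, registry):
    """Return a copy of `node` with each {"$ref": url} replaced by its target.

    Simplification: keywords next to "$ref" in the same object are dropped.
    """
    if isinstance(node, dict):
        if "$ref" in node:
            return resolve_refs(registry[node["$ref"]], registry)
        return {key: resolve_refs(value, registry) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_refs(value, registry) for value in node]
    return node
```

`resolve_refs(APP_SCHEMA, REGISTRY)` yields a schema in which `image` carries the shared definition inline, which is roughly what a validator sees after dereferencing.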
Regarding schema linting, we have https://github.com/giantswarm/schemalint now to verify whether a JSON schema is valid.
This is already in use in https://github.com/giantswarm/schema (PR actions). Schema developers can also quickly install it via `go install github.com/giantswarm/schemalint` and use it via `schemalint PATH`.
The tool does not yet test against our own schema quality standards. It only ensures validity.
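As a sketch of what checks against our own quality standards could look like, the function below implements the two rules mentioned earlier in this thread: every property has a `title`, and every array property has an `items` schema. This is not schemalint code; `check_quality` and its message format are assumptions for illustration.

```python
def check_quality(schema, path=""):
    """Return a list of rule violations found in `schema`.

    Rules (taken from the discussion above):
    - every property must have a title
    - every array property must provide an items schema
    """
    problems = []
    for name, sub in schema.get("properties", {}).items():
        sub_path = f"{path}/{name}"
        if "title" not in sub:
            problems.append(f"{sub_path}: missing title")

        # "type" may be a single string or a list of types.
        types = sub.get("type")
        is_array = types == "array" or (isinstance(types, list) and "array" in types)
        if is_array:
            if "items" not in sub:
                problems.append(f"{sub_path}: array without items schema")
            elif isinstance(sub["items"], dict):
                # Check the properties of array elements as well.
                problems.extend(check_quality(sub["items"], sub_path + "/*"))

        # Recurse into nested object properties.
        problems.extend(check_quality(sub, sub_path))
    return problems
```

An empty result means the schema passes both rules; in CI, a non-empty list would fail the build.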
On normalization: `schemalint` has the `normalize` command now to normalize the whitespace and property sorting of a JSON schema file.
With https://github.com/giantswarm/schemalint/pull/7 we are adding the check for normalization to the `schemalint verify` command.
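As a rough illustration of what normalization means here (not necessarily how `schemalint normalize` is implemented), the sketch below re-serializes a JSON document with sorted keys and fixed indentation. The two-space indent and the trailing newline are assumptions.

```python
import json


def normalize(text):
    """Re-serialize JSON with sorted keys, two-space indent and a trailing newline."""
    data = json.loads(text)
    return json.dumps(data, indent=2, sort_keys=True, ensure_ascii=False) + "\n"


def is_normalized(text):
    """True if `text` is already in normalized form (what a verify step could check)."""
    return text == normalize(text)
```

Because the output is deterministic, two schema files with the same content always serialize identically, so diffs only show real changes.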
Here is a slide deck which I've started creating today to help explain: https://docs.google.com/presentation/d/16McEeTiDPyVPglnvqguC3K98LxDB-R-S8mk2uhmAzkg/edit#slide=id.g1197062f044_3_0
This is a proposal for the high-level structure.
I liked the idea behind `.experimental`.
I found a PR (https://github.com/giantswarm/cluster-aws/pull/192) by @AndiDog which replaces a values property and deprecates the old one. This is the first time I have seen this (in our context).
For clarification of the use of multiple types in a property I have created https://github.com/giantswarm/roadmap/issues/1892 for Rocket.
In talking about the proposed `/experimental` category, we came to the conclusion that the category should also fit config that is internal (Giant Swarm only, not for customer use, no UI support), but not experimental in nature. So I'm renaming this to `/internal`.
Planning note for this sprint
In the common schema proposal, `/provider` got renamed to `/providerSpecific`, as we found a naming conflict.
We just learned that there are side effects when using subcharts (aka library charts, dependencies).
If a cluster app schema has `"additionalProperties": false` on the root level, and it is supposed to use a library chart like `cluster-shared`, then there has to be a property `/cluster-shared` in the schema of the cluster app.
This upstream issue explains some context: https://github.com/helm/helm/issues/10392
For our UI we most likely have to ignore/exclude `/cluster-shared` again.
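The subchart side effect can be reproduced in a few lines. This sketch only models the root-level `additionalProperties` check (it ignores `patternProperties` and everything below the root); the `metadata` property and the `{"type": "object"}` declaration for `cluster-shared` are assumptions for illustration.

```python
def unexpected_root_keys(schema, values):
    """Return root-level keys in `values` that a strict schema would reject."""
    if schema.get("additionalProperties", True) is not False:
        return []  # schema is not strict at the root, anything goes
    declared = set(schema.get("properties", {}))
    return sorted(key for key in values if key not in declared)


# Hypothetical cluster app schema that is strict at the root level.
strict_schema = {
    "type": "object",
    "additionalProperties": False,
    "properties": {"metadata": {"type": "object"}},
}

# Values as seen by the parent chart: the subchart's values live under
# a key named after the subchart.
values = {"metadata": {}, "cluster-shared": {}}

# Declaring the subchart key in the parent schema makes validation pass.
fixed_schema = {
    **strict_schema,
    "properties": {
        **strict_schema["properties"],
        "cluster-shared": {"type": "object"},  # assumed shape
    },
}
```

With `strict_schema` the key `cluster-shared` is reported as unexpected; with `fixed_schema` the list is empty.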
We just learned that ~app-operator~ cluster-apps-operator injects additional values into the schema. See https://github.com/giantswarm/cluster-azure/pull/78 for details. The properties needed are
- `/managementCluster`
- `/baseDomain`
- `/provider`
All three are type `string`.
For now, this means: if a schema sets `additionalProperties: false` on the root level without declaring these properties, cluster creation fails.
In the mid term I would prefer to move these properties into one of the other main sections, perhaps `/internal`. I'm sure there will be many effects to take care of. We'll have to talk to Honeybadger about this.
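Until these properties move elsewhere, a simple lint on the schema itself could catch the problem before a cluster creation fails. The sketch below assumes only what is stated above (three root-level properties of type `string`); the function name `missing_injected_properties` is hypothetical.

```python
# Properties injected by cluster-apps-operator, as listed above.
INJECTED = ("managementCluster", "baseDomain", "provider")


def missing_injected_properties(schema):
    """List injected properties a strict root schema fails to declare as string."""
    if schema.get("additionalProperties", True) is not False:
        return []  # injected values are tolerated anyway
    props = schema.get("properties", {})
    return [name for name in INJECTED if props.get(name, {}).get("type") != "string"]


# Schema fragment that satisfies the lint.
ok_schema = {
    "type": "object",
    "additionalProperties": False,
    "properties": {name: {"type": "string"} for name in INJECTED},
}
```

A non-empty result names the properties that would cause validation of the injected values to fail.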
@mogottsch With https://github.com/giantswarm/cluster-aws/pull/239 I am adding temporary CI checks to cluster-aws, to be replaced by devctl once it's ready.
While I work on cluster app schemas in detail, I'm taking some more notes for future alignment steps here. Subject to change.
key=value
..controlPlane.node
, including:
Roadblock appeared. Obviously the KaaS teams have decided that going forward they want to use objects for node pools, with user-defined keys, using the JSON Schema feature `patternProperties`.
Our UI library (react-json-schema-form) does not support this currently.
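For context, this is roughly how `patternProperties` works: the keys of the object are not fixed in the schema, but matched against regular expressions, and each match selects the subschema for that key's value. The key pattern and the `instanceType` property below are hypothetical, not taken from any actual cluster app schema.

```python
import re

# Hypothetical node pools schema with user-defined keys.
NODE_POOLS_SCHEMA = {
    "type": "object",
    "patternProperties": {
        # Assumed naming rule: lowercase letters, digits and dashes.
        "^[a-z0-9-]+$": {
            "type": "object",
            "properties": {"instanceType": {"type": "string"}},
        }
    },
    "additionalProperties": False,
}


def subschema_for_key(schema, key):
    """Return the first patternProperties subschema whose regex matches `key`.

    JSON Schema patterns are not implicitly anchored, hence re.search.
    """
    for pattern, sub in schema.get("patternProperties", {}).items():
        if re.search(pattern, key):
            return sub
    return None
```

A form generator has to let the user invent the key (for example `pool-a`) and then render the matched subschema for it, which is the part a UI library needs to support.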
Internal discussion: https://gigantic.slack.com/archives/C03LV8E0RL7/p1679309519421109
@marians do you want to come up with an idea how to proceed here?
I think the main goals of this issue have been accomplished. We do have an agreement on how to deal with the schema of cluster apps (it's mostly the app named `cluster` by now).
However, there are still some ways to improve our processes in here which I think haven't been addressed yet. Probably we should bring this to KaaS sync or SIG architecture and discuss whether we should pursue those, and who.
@marians will you take this to KaaS Sync or SIG Architecture?
In Cluster API we create workload clusters via apps (as in the Giant Swarm app platform). We have https://github.com/giantswarm/roadmap/issues/1181 to provide a web UI for cluster creation, and with this we will rely on the cluster app schema to generate a UI.
This brings new requirements for our work with app schema. This issue exists to track our efforts towards defining these requirements, developing our tooling and enabling teams to work towards these requirements.
Tasks outline
Details and Goals
General JSON Schema knowledge sharing
It seems to be impossible to find a good talk on YouTube. So we'll do it internally.
Reusable schema
Goal: create synergies, avoid duplication in schema development between provider implementations.
In addition to establishing a repository, we will have to decide which reusable schemas to create and how to work towards using them. This may be an iterative, ongoing effort.
Requirements for cluster app schema
Goal: have clearly documented requirements for cluster app schema, which are understood by the KaaS teams, and which can be validated against.
Development and CI tooling
Goal: increase confidence and speed, reduce friction in the schema development process.
Schema generation
Goal: Simplify the creation of schema, compared to the `helm schema-gen` process, which creates only a scaffold.
Normalization
Goal: produce minimal diffs in schema changes by avoiding purely cosmetic changes (think `go fmt`).
Validation
Goal: Be certain that values.yaml validates against `values.schema.yaml`.
Workflows
Goal: Establish ways to deal with common challenges across teams.
Schema alignment
Goal: Users working with one provider should be able to transfer their knowledge to other providers easily. Also create synergies and avoid duplicated effort.