Closed afbjorklund closed 1 week ago
According to regular JSON Schema, using null
is not allowed when it comes to ints, strings (ptrs), arrays or maps...
So we need to document which are allowed to be null
(some are not), using a special jsonschema:"nullable" syntax
In the default.yaml, the value null is being used as a placeholder for those values that can be overridden or defaulted.
But the actual values will be more like "empty" (0
""
[]
{}
) than real null pointers, that can't be dereferenced.
But the actual values will be more like "empty" (
0
""
[]
{}
) than actual null pointers, that can't be dereferenced.
I'm not sure what you mean by "actual", but the values read from lima.yaml
will indeed include real null pointers. It is the responsibility of limayaml.FillDefault()
to perform the override.yaml
and default.yaml
merging operation, and then fill in any remaining settings with the builtin defaults.
So the result from FillDefault
should not have any null pointers in them, but the JSON file most certainly can and will.
FWIW, I found message
is the only top-level field that is a string
instead of a *string
, and I think this is a bug. Why should you not be able to specify a default message in default.yaml
?
With the "actual values", I meant those that will be used in the end - i.e. the ones that will replace the null value.
But currently the documentation (comment) said: 🟢 Builtin default: null
(changed it in 102e44cc705cccccb27166bbb5c3c6f0801235c1)
FWIW, I found
message
is the only top-level field that is astring
instead of a*string
, and I think this is a bug. Why should you not be able to specify a default message indefault.yaml
?
I think there were three of them:
They can probably be replaced by *string
, or maybe the cpuType can be commented out (null) like the others?
cpuType:
# 🟢 Builtin default: "cortex-a72" (or "host" when running on aarch64 host)
aarch64: ""
# 🟢 Builtin default: "cortex-a7" (or "host" when running on armv7l host)
armv7l: ""
# 🟢 Builtin default: "qemu64" (or "host,-pdpe1gb" when running on x86_64 host)
x86_64: ""
# 🟢 Builtin default: "rv64" (or "host" when running on riscv64 host)
riscv64: ""
The current result is that the keys will have to encoded anyway, no matter if it is empty string or null values.
So the default yaml could just have cpuType: null
(by commenting the default keys out, like networks etc)
mountPoint
is a field inside the Mounts
array, not a top-level field. It should be nullable, but we get away with it being a string because the empty string is not a valid value either, so we treat the empty string the same as nil
.
Same thing for cpuType
: the empty string is not a valid value.
Making them nullable would be more consistent, but I don't think it really matters.
I do think we should fix message
though, even though it seems like an improbable use-case.
It only mattered for the technicalities of jsonschema, I don't think users (or Go) really cared...
Otherwise the generated schema now "works", but it does not copy any documentation.
i.e. the Go struct is now undocumented, since all the documentation is in the default.yaml
Otherwise there is some feature, to copy the comments from the struct (but no "javadoc"*)
Typically using a syntax like ///
or /**
. But there is no such convention in Go, at least not as far as I know
It would only be useful if we actually wanted people to using the jsonschema for anything, like validation.
But then it would need to be hosted somewhere (an URL), even if one could hack it with raw.github.com
So the default yaml could just have
cpuType: null
(by commenting the default keys out, like networks etc)
We can do this anyways, as setting the value to null
will still result in the field being set to the empty string.
For example:
mounts:
- location: "~"
# Configure the mountPoint inside the guest.
# 🟢 Builtin default: value of location
mountPoint: null
mountPoint
is a string
, so it will be set to the empty string.
maybe the cpuType can be commented out (null) like the others?
They are already set to null
; maybe I misunderstand what you are saying:
cpuType:
# 🟢 Builtin default: "cortex-a72" (or "host" when running on aarch64 host)
aarch64: null
# 🟢 Builtin default: "cortex-a7" (or "host" when running on armv7l host)
armv7l: null
# 🟢 Builtin default: "qemu64" (or "host,-pdpe1gb" when running on x86_64 host)
x86_64: null
The issue is that I got jsonschema validation errors on those fields, since they are defined as string
.
examples/default.yaml::$.mounts[0].mountPoint: None is not of type 'string'
examples/default.yaml::$.cpuType.x86_64: None is not of type 'string' examples/default.yaml::$.cpuType.aarch64: None is not of type 'string' examples/default.yaml::$.cpuType.armv7l: None is not of type 'string'
It would only be useful if we actually wanted people to using the jsonschema for anything, like validation.
I don't understand why we need this? We already have limactl validate lima.yaml
; what does the json schema get us?
The issue is that I got jsonschema validation errors on those fields, since they are defined as
string
.
I'm fine with changing them all to *string
and updating FillDefault
to deal with it.
Similar to the cloud-config.yaml validation, it is just for following a technical standard. Not really meant for our users.
I don't think that it will ever be able to handle the features that are currently in the default.yaml and limactl validate...
Generating it from the code seemed like a low-maintenance approach, so was looking to see whether it was viable.
I will leave it in draft, I think it will be better once the NullableFromType
feature (for pointers) of has been merged.
Similar to the cloud-config.yaml validation, it is just for following a technical standard. Not really meant for our users.
That reminds me: I think you implemented the user-data
validation to run at runtime, so we would have to bundle the schema etc. I think this should at most be a build-time/CI thing but not included in the binaries.
After all, if there is a schema error in the generated files, there is nothing the user could do about it beyond filing a bug. So it essentially becomes a very heavy assert
statement.
I think the official approach (from cloud-init) is to run the schema validation at first boot, which is a bit "late" for us.
But you are probably right, it would be better off in somewhere like limactl validate
- or maybe just a build feature.
After all, the main benefit is finding holes in the official specification "standard".
I think the official approach (from cloud-init) is to run the schema validation at first boot, which is a bit "late" for us.
Yes, but these are just warnings in cloud-init-output.log
, so they don't break anything.
As I wrote above, showing the validation errors/warnings to the users is pointless because there is nothing they can do to fix them. It would just be noise to them.
After all, the main benefit is finding holes in the official specification "standard".
I agree that validating the files as part of CI and checking for errors would be useful.
But remember that we do have to accept deprecation warnings because we need to remain compatible with older cloud-init versions. So we can't just check that there are no warnings at all.
I will leave them as drafts for now, and cherry-pick the suggested (small) changes separately.
Changed it to run only on-demand, I think eventually it will just use an URL for the validation...
Expressing map[string]*string
is a "problem" in jsonschema, so left the map as null
instead:
# Specify desired QEMU CPU type for each arch.
# You can see what options are available for host emulation with: `qemu-system-$(arch) -cpu help`.
# Setting of instructions is supported like this: "qemu64,+ssse3".
# 🟢 Builtin default: hard-coded arch map with type (see the output of `limactl info | jq .defaultTemplate.cpuType`)
cpuType:
# aarch64: "cortex-a72" # (or "host" when running on aarch64 host)
# armv7l: "cortex-a7" # (or "host" when running on armv7l host)
# riscv64: "rv64" # (or "host" when running on riscv64 host)
# x86_64: "qemu64" # (or "host,-pdpe1gb" when running on x86_64 host)
This makes it more similar to other (rather) complicated defaults, like the containerd archives.
containerd:
# Enable system-wide (aka rootful) containerd and its dependencies (BuildKit, Stargz Snapshotter)
# Note that `nerdctl.lima` only works in rootless mode; you have to use `lima sudo nerdctl ...`
# to use rootful containerd with nerdctl.
# 🟢 Builtin default: false
system: null
# Enable user-scoped (aka rootless) containerd and its dependencies
# 🟢 Builtin default: true
user: null
# # Override containerd archive
# # 🟢 Builtin default: hard-coded URL with hard-coded digest (see the output of `limactl info | jq .defaultTemplate.containerd.archives`)
# archives:
# - location: "~/Downloads/nerdctl-full-X.Y.Z-linux-amd64.tar.gz"
# arch: "x86_64"
# digest: "sha256:..."
Trying to remember what the easiest way to call FillDefault in limactl
was, but I don't really recall.
EDIT: I remembered it was a patched version of validate
, will clean it up and make it into a PR
$ limactl validate --fill examples/default.yaml
INFO[0000] "examples/default.yaml": OK
vmType: qemu
os: Linux
arch: x86_64
images:
- location: https://cloud-images.ubuntu.com/releases/24.04/release-20240423/ubuntu-24.04-server-cloudimg-amd64.img
arch: x86_64
digest: sha256:32a9d30d18803da72f5936cf2b7b9efcb4d0bb63c67933f17e3bdfd1751de3f3
- location: https://cloud-images.ubuntu.com/releases/24.04/release-20240423/ubuntu-24.04-server-cloudimg-arm64.img
arch: aarch64
digest: sha256:c841bac00925d3e6892d979798103a867931f255f28fefd9d5e07e3e22d0ef22
- location: https://cloud-images.ubuntu.com/releases/24.04/release/ubuntu-24.04-server-cloudimg-amd64.img
arch: x86_64
- location: https://cloud-images.ubuntu.com/releases/24.04/release/ubuntu-24.04-server-cloudimg-arm64.img
arch: aarch64
cpuType:
aarch64: cortex-a72
armv7l: cortex-a7
riscv64: rv64
x86_64: host
cpus: 4
memory: 4GiB
disk: 100GiB
mounts:
- location: "~"
mountPoint: "~"
writable: false
sshfs:
cache: true
followSymlinks: false
sftpDriver: ""
9p:
securityModel: none
protocolVersion: 9p2000.L
msize: 128KiB
cache: fscache
- location: /tmp/lima
mountPoint: /tmp/lima
writable: true
sshfs:
cache: true
followSymlinks: false
sftpDriver: ""
9p:
securityModel: none
protocolVersion: 9p2000.L
msize: 128KiB
cache: mmap
mountType: reverse-sshfs
mountInotify: false
ssh:
localPort: 0
loadDotSSHPubKeys: true
forwardAgent: false
forwardX11: false
forwardX11Trusted: false
firmware:
legacyBIOS: false
audio:
device: ""
video:
display: none
vnc:
display: 127.0.0.1:0,to=9
upgradePackages: false
containerd:
system: false
user: true
archives:
- location: https://github.com/containerd/nerdctl/releases/download/v1.7.6/nerdctl-full-1.7.6-linux-amd64.tar.gz
arch: x86_64
digest: sha256:2c841e097fcfb5a1760bd354b3778cb695b44cd01f9f271c17507dc4a0b25606
- location: https://github.com/containerd/nerdctl/releases/download/v1.7.6/nerdctl-full-1.7.6-linux-arm64.tar.gz
arch: aarch64
digest: sha256:77c747f09853ee3d229d77e8de0dd3c85622537d82be57433dc1fca4493bab95
guestInstallPrefix: /usr/local
hostResolver:
enabled: true
ipv6: false
propagateProxyEnv: true
caCerts:
removeDefaults: false
rosetta:
enabled: false
binfmt: false
plain: false
timezone: Europe/Stockholm
With "enum" on the string types, they can be checked and offfered editor completion (here code
):
Handy trick to remove all comments from a file: (previous workaround was json roundtrip)
https://mikefarah.gitbook.io/yq/operators/comment-operators#remove-strip-all-comments
Description
There was some discussion about using json-schema.org for documentation, which doesn't go very well...
But it could still be used for syntax-checking the lima.yaml, similar to what is available for cloud-config.yaml
https://github.com/lima-vm/lima/discussions/1489
https://github.com/invopop/jsonschema