Open felixfbecker opened 6 years ago
I think the best we could do until Kubernetes exposes the default value through OpenAPI is to maintain a hand-maintained list of default values in a map in the generator script, and apply that default value in the property definition.
Interesting! I wonder how kubectl
/ client-go
handle it. I take it the OpenAPI / Swagger document doesn't list these defaults? I'll take a look at both of these ideas this morning if I get time...
kubectl apply
circumvents the problem by only diffing maps instead of the struct types. I thought about this but I think it would severely impact the UX so I would rather hardcode defaults.
The default value is not exposed through OpenAPI but usually mentioned in the field description. I suggested adding this in the comment I linked above.
I don't know how client-go
handles this
I like your idea - let's do that.
So even if we have DefaultValue
attributes on everything and initialisers, it would still mean that if a YAML does not declare a field and the value is different than the default on the server, it will reset the value to the default value. I wonder if that is acceptable behaviour or not.
If we always ignore default values, it would be impossible to reset a value back to the default value.
Maybe we do need to make everything with a default nullable? So that null
means "not specified"/"ignore" while the actual default value means "reset". The OpenAPI spec is inconsistent about this too - some scalars are nullable in addition to having a default value, while others are not.
The question is just how inconvenient this would make using the API client, if even fields are nullable that are guaranteed to be set to a default value by the server.
Maybe we need to generate two sets of classes for API responses and for inputs? Where inputs just has more optional fields, so the API responses can extend the input classes and override the fields with default values to make them non-optional.
What do you usually do in C# in such cases? In TypeScript I can distinguish between null
and undefined
and scalars don't have zero values (undefined
is the zero value for everything).
Hmm, not sure about yamldotnet but JSON.NET has DefaultValueHandling so if the value is some well-known scalar value (or null) the member won't be serialised. Perhaps yamldotnet has something similar?
Serialisation is not the problem, it's generating the patch (diffing yaml vs server state)
Ah. Yeah I could apply the DefaultValue attribute and you could look for that?
You could treat that default value as "missing" or "unspecified" perhaps?
If the attribute is.missing th
Man, I hate doing GitHub on my phone :)
If attribute is missing there is a way to determine default value for type but I'd have to look it up.
Here you go:
Yeah, but if you always ignore the default value, then it would become impossible to set the field to the default value again after it was modified on the server...
Here's another solution I just came up with:
Generate setters for every field (and backing properties) that will mark the property as dirty when set in a HashSet that can be queried publicly. E.g.
class KubeModel {
protected HashSet<string> dirtyProperties = new HashSet<string>();
public IImmutableSet<string> DirtyProperties {
get { return dirtyProperties; }
}
}
class DeploymentSpec extends KubeModel {
private int progressDeadlineSeconds;
public int ProgressDeadlineSeconds {
get { return progressDeadlineSeconds; }
set {
progressDeadlineSeconds = value;
dirtyProperties.Add("ProgressDeadlineSeconds");
}
}
}
It would be more code but it's generated, so doesn't really matter. This would not require changing types to nullable and also remove the need for maintaining a map of default values and even non-updateable properties (#29), because those properties would never get set by the YAML deserialisation. The diff implementation would ignore any properties that are not dirty.
The only other solution I can think of is to deserialise to Dictionaries, then build PSObject
s and return those, but "lie" about the OutputType
in the Cmdlet to make autocompletion still work. While building PSObject
s, the "virtual" type needs to be added as a TypeName to make output formatting work.
Then make the patch cmdlet convert back to dictionaries first, and make the diff implementation "track" the "virtual" model type of the dictionaries by starting at KubeResource
and traversing through the Type
as its traversing through the dictionaries, so it can retrieve the patch strategies etc from attributes.
The disadvantage is that strictly Get-Kube*
cmdlets now return a completely different type than ConvertFrom-KubeYaml
.
What do you think?
Interesting - what about just using the YAML document as-is; it only specifies fields that are specified after all :)
This is starting to get a little complex but I'm not against modifying the models in principle (or creating some sort of helper that can do the same). Let me have a think and see what I can do...
What do you mean with as-is? Dictionary? We need to give it some type to serialize into
What if we generate a second set of models in parallel to the first (in a namespace like KubeClient.Models.Tracked
) which have the tracking behaviour and implicit / explicit cast operators (or AsTracked()
/ AsNonTracked()
) to convert back and forth? The actual code for tracking changes can be improved using the [CallerMemberName]
attribute (or whatever it's called).
Not for the purposes of diff perhaps? Can't remember for yamldotnet but JSON.NET has JObject / Jtoken (a DOM for JSON, effectively). I imagine yamldotnet must have something similar? So I mean don't deserialise at all if you don't need to...
Yep, they have a DOM for YAML:
https://github.com/aaubry/YamlDotNet/wiki/Samples.LoadingAYamlStream
Yeah, but we can't operate on the YAML string ;) I think you mean what I wrote in my second idea:
The only other solution I can think of is to deserialise to Dictionaries, then build PSObjects and return those, but "lie" about the OutputType in the Cmdlet to make autocompletion still work. While building PSObjects, the "virtual" type needs to be added as a TypeName to make output formatting work. Then make the patch cmdlet convert back to dictionaries first, and make the diff implementation "track" the "virtual" model type of the dictionaries by starting at KubeResource and traversing through the Type as its traversing through the dictionaries, so it can retrieve the patch strategies etc from attributes. The disadvantage is that strictly Get-Kube* cmdlets now return a completely different type than ConvertFrom-KubeYaml.
If we go with separate class hierarchies, we'll have the same problem of fragmented types...
Yep, they have a DOM for YAML
I believe the patch implementation shouldn't rely on the encoding being YAML though, so Dictionaries would be the better choice.
If we go with separate class hierarchies, we'll have the same problem of fragmented types
A fair point.
We could, however, change the client so that its methods take / return a JObject
/JToken
(or a YamlDocument
) instead of specific model types (i.e. no serialisation / deserialisation at all). Clients can then just use JObject.ToObject<TModel>()
/JObject.ToObject(modelType)
if required.
I'm pretty comfortable with JToken
and friends being passed around (it's a reasonably common way to pass untyped or semi-typed data around when that data was received in JSON format in the first place but the schema is not fixed); there's even reasonably high-fidelity conversion between YamlDocument
and JSON:
https://github.com/aaubry/YamlDotNet/wiki/Samples.ConvertYamlToJson
Seen this? https://github.com/wbish/jsondiffpatch.net
I wouldn't want to have PowerShell users needing to do that though...
I have this vision of cmdlets that can all take the same strongly typed Kubernetes model objects as input and all output the same strongly typed objects. Trying to find a solution that gets as close to that as possible 🤔
Seen this? https://github.com/wbish/jsondiffpatch.net
Yeah I've seen that, it's not JSON Patch (RFC 6902) though. Kubernetes API only supports RFC 6902 JSON Patch, RFC 7386 JSON Merge Patch and their custom Strategic Merge Patch. I therefor went with JSON Patch.
I agree that's the ideal way to go - but perhaps we might be able to simplify things a bit by keeping the 2 scenarios separate?
For 1, this is a moot point because all fields are already populated; by definition, only fields they change will be changed (and need patch operations generated for them).
For 2, we have a set of keys from the YAML so we know what patch operations are required.
Have I missed anything, or does that more or less cover both scenarios?
(so for 1, just do a property-level diff; all properties will have been populated before modifying)
(as for 2, I know JSON.NET has an operation called Populate
but I don't know if YamlDotNet does - populate simply deserialises on top of an existing object instance, only setting members for which serialised properties are present in the JSON - it's super useful)
It's also possible to wrap a given model instance in a PSObject
and augment / replace members that way (e.g. by adding change-tracking behaviour) although this may not be the simplest option...
What if ConvertFrom-KubeYaml
wrapped each deserialised resource model in a PSObject
with an added PSNoteProperty
called something like YamlKeys
or LoadedYamlMembers
? That way, you know which values were present in the Yaml to use as a hint for patching :-)
Mind you, if they load from YAML and then modify the resources before passing to Update-KubeResource
then that might not help so much :-(
(with the option above, the consumer always still gets real, strongly-typed models; it's just that ConvertFrom-Yaml
adds extra metadata to those deserialised instances to help guide Update-KubeResource
)
One big use case I have is actually loading YAML into an object, mutating it (in PowerShell), then applying it (sending a patch to the API).
For example this script in CI:
Get-ChildItem -Recurse -File *.yml | ConvertFrom-KubeYaml
image
fields in the PodSpecs to point to the tag of the currently built commit
| ForEach-Object { if ($_ is [DeploymentV1Beta]) { $_.Spec.Containers.Image = $commit } }
| Update-KubeResource
For writing this script, it would be tremendously helpful if I had IntelliSense on $_.Spec.Containers.Image
(i.e. it's strongly typed).
I couldn't find good examples of how to use PSObject
in C# to add the note property and also read it again, while still declaring types for parameters and output...
Here's a quick example (from memory; pretty sure it'll compile but not 100% certain):
class MyModel
{
public string Foo { get; set; }
}
class MyCmdlet
{
public override void ProcessRecord()
{
MyModel model = new MyModel
{
Foo = "bar"
};
PSObject result = new PSObject(model);
result.Properties.Add(
new PSNoteProperty("Name", "MyModel1")
);
WriteObject(result);
}
}
Intellisense will work on the result of the Cmdlet because it really is a MyModel
instance, just wrapped in a PSObject
(in fact, from memory all objects are wrapped in PSObject
under the covers but I might be misremembering that).
Additionally, if you decorate your Cmdlet class with[OutputType(typeof(MyModel))]
or [OutputType(typeof(MyModel), ParameterSet="MyParameterSet")]
then consumers will get intellisense before it's even run (e.g. when writing a pipeline).
I have an idea for attaching properties to arbitrary objects (CLR-level, not PSObject-level) - give me an hour or so and I'll post some code :)
Ok, forget that but how about this:
When deserialising from YAML, clone the deserialised model and store it as a PSNoteProperty
(called __YamlSource
or something like that) on the PSObject
-wrapped model written to the pipeline. That way you can compare the model's properties with the original to see which ones have been changed! Not certain but you might even be able to mark that property as hidden...
Can do this in C# to hide that property by default: http://ramblingcookiemonster.github.io/Decorating-Objects/
Yeah I think adding the noteproperty would work, but how would you read it from a cmdlet that expects KubeResouce
as a parameter?
Bah, didn't think of that :)
Maybe have it take a PSObject and use a custom validation attribute to verify the type?
Maybe add a parameter set that takes a different parameter of type PSObject from the pipeline?
You can still get at the wrapped model via the BaseObject
property.
That could work, we would just lose the type declaration in Get-Help
output with validation at runtime. That might be the trade-off we have to do.
So just to make sure I understand how this works:
What if when we loaded the resource spec from YAML we deserialised it on top of the spec loaded from K8s? Then comparing the 2 might be easier...
(props missing from YAML will be filled in from K8s and they make their changes to that model and when updating K8s all we need to do is load from K8s and compare).
Maybe if ConvertFrom-KubeYaml
had a switch to enable filling in of undeclared properties from current K8s state?)
Like this?
Get-Content -Raw deploy.yml | ConvertFrom-KubeYaml -OnTopOfServerState
So that would require ConvertFrom-KubeYaml
to have API access, and essentially would take the job of diffing and patching, because to deserialise "on top of" the server object it has to consider all the aspects of merge strategies etc. So really it just means merging cmdlets into one. It would be nice if we could keep some SoC
SoC?
How about a separate Cmdlet to do merge into model from K8s state?
I got diffing objects working well so far, but there is one problem: Several properties in Kubernetes objects are scalars with default values. For example:
int ProbeV1.FailureThreshold
defaults to3
string ServicePortV1.Protocol
defaults to"TCP"
and many more. These properties are often not declared in YAML configuration. But when deserializing the YAML, the properties get the C# type default value, instead of the server default value. I.e.FailureThreshold
is always0
,Protocol
is alwaysnull
. When diffing against the server, this will then generate a patch that sets these values to the C# default value (0
/null
).How should we handle this?
0
ornull
. Does the API have fields anywhere that can be set to the C#/Go default value? E.g. forFailureThreshold
the minimum is1
, and in other places where0
is valid I saw them use nullable/pointer types, e.g.ConfigMapVolumeSource.DefaultMode
. But there are a lot of booleans that are not nullable, and this would make it impossible to switch a boolean back tofalse
after it was set totrue
on the server.dynamic
instead of strongly typed objects for deserialisation. I would like to avoid this because then we lose the formatting and autocompletion in PowerShell, and it becomes way harder to look up the merge strategies to use for properties...