Open joestringer opened 1 year ago
Would it be reasonable to do this autodetection from the helm configuration level or using cilium-config
ConfigMap rather than executing into a Cilium pod to assess desired state from the agent?
yeah i was thinking we should just look at cilium-config config map 💡
FWIW, the PR adding the runtime feature detection (https://github.com/cilium/cilium-cli/pull/1022) has some context on why this is needed and why the information cannot easily be gathered from the ConfigMap:
Previously, we have been relying on the Cilium ConfigMap to detect the presence of features. Relying on the ConfigMap has however some downsides, as one has to know the default value of a certain feature if the option is absent from the ConfigMap. To make things worse, this default value is also version dependent in some cases.
/cc @gandro
that's a good point. i guess helm values will have the same issue? 💭
Yeah, I think we'll have the same issue with helm values. IIRC we don't specify defaults for them (in all cases).
struggle
There was also an issue with the default used for the tunneling feature, which tried to use the config-map value if available (https://github.com/cilium/cilium-cli/pull/1870/commits/da5c0c9eb9cff9e8baebabb0d9f16acd371c38db).
Therefore I tried to document the current guidelines in the code. I mostly copied the descriptions from the commits, bit I thought it makes sense to also have them in code (https://github.com/cilium/cilium-cli/pull/1870/commits/7c54c5fc7b20b26e4bc434a7b58635161d929b30).
I agree that relying on the agent-runtime-config.json
is a hack. But as already mentioned, we want to know the runtime configuration of the agent, not what has been configured via Helm or ConfigMap, since Cilium does a lot of auto-detection or runtime-dependent configuration of certain features. I opted to use the JSON file because it was already there and worked with existing versions of Cilium (something that the CLI also needs to consider).
I think one way answer here is to start exposing per-cell configurations in the Cilium API. We already have an API endpoint that exposes a subset the global runtime daemon configuration https://github.com/cilium/cilium/blob/95e25bfb132750e97304dfbe0f5770f4b6debd96/api/v1/openapi.yaml#L74-L85
The /config
endpoint is lacking per-cell configurations, but that can be solved by having cells exposing their configuration via API too. It could even be semi-automated, i.e. have a registry where cells register themselves as "having some configuration" which then automatically exposes that state via an API endpoint.
I wonder if we could put some infrastructure in place so that those configuration API endpoints are somehow tied to feature stability in the code. For instance, in the case from the issue description, the feature is brand new, so could be considered alpha. Maybe for alpha options we don't expose it in the API because it's subject to change. Then we wouldn't trigger this overall issue in the first place because Cilium-CLI wouldn't be relying on an alpha setting having a specific type or format. For other settings, we could graduate the features themselves to beta/GA and somewhere later in that process we start exposing the flags as a public API from Cilium.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
The following code uses a non-public API from https://github.com/cilium/cilium in order to extract configurations out from an individual agent:
https://github.com/cilium/cilium-cli/blob/e97bddd0e98139d70416371c50e3740625e9e4d7/connectivity/check/features.go#L274
There are a few issues with this:
connectivity test failed: unmarshaling cilium runtime config json: json: cannot unmarshal bool into Go struct field DaemonConfig.HubbleRedact of type []string
This code should probably unmarshal the configuration into a more liberal type to account for differences between Cilium versions.