Open adrianchiris opened 3 years ago
As Support for new features are added to sriov-network-device-plugin, these features can:
1. add and/or modify functionality of device plugin 2. introduce new user-facing APIs (and possibly deprecate an old user-facing API) 3. support new "south bound" facing APIs e.g kubelet, device-info-spec (this is similar to 1 IMO as the API is already defined and device plugin just implements a new functionality)
For these features it may be desired to:
1. Introduce feature gates to enable-disable support and functionality in device plugin
so this feature gate will be a new cli cmd option, and the corresponding new feature will only be evaluated when this feature gate is true?
2. Mark these features (and their APIs) as experimental as they are not fully mature and their user-facing APIs may change in subsequent releases or removed altogether.
Is this to mark it as experimental or deprecated in the doc or in the code (e.g. warning message that user can be notified when configuring the feature)?
Since we are targeting new API process, do we really need featureGate? or just use the API created for that feature for enablement or disablement (e.g. if API is defined in cli or configMap, then it is enabled by user, vice versa)?
so this feature gate will be a new cli cmd option, and the corresponding new feature will only be evaluated when this feature gate is true?
Yes, thats what i had in mind. similarly to what is done in K8s feature gates.
We can have the feature gates as part of configMap as well (eg a new featureGates
field) if we want and cli flags will override those.
Is this to mark it as experimental or deprecated in the doc or in the code (e.g. warning message that user can be notified when configuring the feature)?
Im thinking the following:
maintain a table in docs for features which states:
For deprecated features we shall also have a deprecation warning emitted in log. a deprecated feature will be removed 2 releases or 6 months after it was marked as deprecated (the later of the two)
Since we are targeting new API process, do we really need featureGate? or just use the API created for that feature for enablement or disablement (e.g. if API is defined in cli or configMap, then it is enabled by user, vice versa)?
A new feature may or may not have a user facing API, hence i think we should have the separation.
@adrianchiris I think this needs careful thought before introducing - How much work involving for introducing this. Do you know of any example project that implements feature gates like Kubernetes? Id like to understand the code changes needed to turn off and off existing features. Also, if we want to go down this route and we need a consensus on depreciating a feature, can we agree on full acceptance from all maintainers?
Do you know of any example project that implements feature gates like Kubernetes? Id like to understand the code changes needed to turn off and off existing features.
Kubernetes is one example :) . ovn-kubernetes has some knobs that disables/enables functionality which is introduced in new features as well as some form of feature gate: https://github.com/ovn-org/ovn-kubernetes/blob/0a2aad51333400e28fbf51c1017b3064d61186cc/go-controller/pkg/config/config.go#L218
Also, if we want to go down this route and we need a consensus on depreciating a feature, can we agree on full acceptance from all maintainers?
My 2-cents here:
Alpha feature - removal should be by majority of maintainers (not deprecation here as it may be removed at the subsequent release). Beta feature - deprecation should have full acceptance from all maintainers. GA feature - deprecation should have full acceptance from all maintainers, although i suspect these will be deprecated if the feature is irrelevant, has alternatives and next to none consumers.
Not every feature will require a feature gate. As an example:
so it may be on a per-enhancement basis if a feature gate is to be added.
Downside of that is that the list of features tracked in docs and their state may be disjoint from the feature gates, unless we say that we track a state (alpha, beta, GA, deprecated) of a feature only if it has a feature gate. And a feature will have a feature gate if it is "substantial" . Evaluated per-feature depending of the impact on the project (API, behavioral)
Overall, I think this brings a lot of benefits
Downside of that is that the list of features tracked in docs and their state may be disjoint from the feature gates, unless we say that we track a state (alpha, beta, GA, deprecated) of a feature only if it has a feature gate. And a feature will have a feature gate if it is "substantial" . Evaluated per-feature depending of the impact on the project (API, behavioral)
I think just tracking the features for which there is a feature gate is reasonable. Other features should be documented elsewhere anyhow.
Some thoughts implementation-wise:
I guess this should mean we add a global Config
struct that is accessible from all over the codebase?
Should it live in it's own sub-package to avoid adding inter-package dependencies or could we reuse e,g: factory
?
What kind of logic will end up in such package? I'm thinking at least the struct initialization based on cli and configMap options (including solving potential feature interdependency). But this should be devicetype
-agnostic, meaning the activation of a feature cannot depend on, say, the network-specific capabilities of the node. I cannot think of any feature that might need this, but just saying it out loud.
Also, the current implementation uses []ResourceConfig
to unmarshall the configMap content, we'll have to add a global config space, right? Or are you thinking of making the feature gates per-pool?
@amorenoz i think we have the same idea about implementation approach
exact details can be discussed over the PR if we have a general agreement on adding support for feature gates.
implementation points:
Config
structs in its own sub-package to avoid storing a reference in all relevant internal structs. im thinking to avoid use factory
.featureGates
top level member which will affect device plugin functionallity in a global manner. If a certain feature has a per resource knobs its fine, but the feature is enabled/disabled globally.
- global
Config
structs in its own sub-package to avoid storing a reference in all relevant internal structs
@adrianchiris Does this mean a singleton basically?
Also, do you think that this new config should be added to the existing configMap, or should it be a separate configMap (e.g. renaming existing configMap.yml to, let's say, discoveryConfig.yml and creating addtional featuresConfig.yml?
Additionally I am not sure about having both Config
and featureGates
. Couldn't all the relevant information be stored in featureGates
itself? Just to avoid duplicating the values.
@adrianchiris Does this mean a singleton basically?
Yep
Also, do you think that this new config should be added to the existing configMap, or should it be a separate configMap (e.g. > renaming existing configMap.yml to, let's say, discoveryConfig.yml and creating addtional featuresConfig.yml?
I think it should be in the same Map, later on we can discuss if we need to separate the resource configuration from device plugin configuration.
Additionally I am not sure about having both Config and featureGates. Couldn't all the relevant information be stored in featureGates itself? Just to avoid duplicating the values.
Well a featureGate enables or disables a feature. A feature may or may not be associated with a set of Config parameters. so i dont think you can roll all of it into featureGates.
Just created a PR on this for initial discussion.
@adrianchiris @zshi-redhat @amorenoz @martinkennelly
I was supposed to present this today, but we ran out of time. So, to speed things up, I am attaching this simple presentation here.
Also, the concerns that are specified in the last slide:
FeatureGate configuration precedence - should CLI args override ConfigMap parameters? When DP is running as a pod it seems to be easier to change ConfigMap than CLI args.
Deprecation message – should emitted deprecation message be generic (e.g. WARNING: feature ABC is deprecated), or should be feature-specific (provided in config for example)?
Should there be special idioms like e.g. ‘allAlphaFeatures’ that would disable/enable groups of features (in this case disable/anable all alpha features)?
Should there be a way to block the feature disabling/enabling (e.g. there is a GA feature that user should not be able to disable)?
As Support for new features are added to sriov-network-device-plugin, these features can:
For these features it may be desired to:
Ive created this issue to trigger an open discussion on the issues and see if we can come with a clean way to handle it.