spiffe / spire

The SPIFFE Runtime Environment
https://spiffe.io
Apache License 2.0

Improve testing processes for new features initially marked as experimental #4812

Open rturner3 opened 10 months ago

rturner3 commented 10 months ago

For larger features or core changes to the system that are deemed to be higher risk, the project has developed a high-level process for graduating these features from development to GA:

  1. Develop the changes behind feature flags using the feature flag framework so that the changes can be staggered across PRs/releases without any impact to users
  2. Once all code and tests have been merged, remove the feature flag from the code and allow users to enable the feature through a configuration flag included in the experimental configuration section
  3. Let the feature "bake" for a full minor release cycle to weed out issues, relying solely on testing signal from early adopters of the feature in the community
  4. Enable the feature by default in the following minor release and either:
    1. Remove the experimental configuration flag to make it the behavior for all users OR
    2. Invert the flag so that users can optionally disable the new feature for a time period

Over time, some challenges with this process have emerged:

Some early thoughts/questions on ways that we may improve this process:

Testing

Historically the project has not run integration tests exercising features that are planned for graduation from experimental to GA. This generally feels like a miss, since the maintainers' primary view of project stability is through CI builds (PR, nightly, release).

Questions

  1. If we ran experimental features in integration tests, in which build should we run them?
    1. Initial reaction is that the nightly build seems appropriate, since it doesn't disrupt developer flows and gives maintainers more frequent signal than release builds, which may help suss out race conditions / flaky behaviors
  2. What kind of tooling / code support do we need to manage enablement of experimental features in CI?
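To make the tooling question concrete, here is a minimal sketch of how a test harness could decide which experimental features to turn on per build type. It assumes the features to exercise arrive as a comma-separated list (for example from an environment variable set by the CI job) and that only the nightly build opts in. The function names, feature names, and build kinds are all hypothetical and not part of SPIRE today.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// parseExperimentalList turns a comma-separated list such as
// "feature_b, feature_a" into a sorted, de-duplicated slice of names.
func parseExperimentalList(raw string) []string {
	seen := make(map[string]struct{})
	for _, name := range strings.Split(raw, ",") {
		name = strings.TrimSpace(name)
		if name != "" {
			seen[name] = struct{}{}
		}
	}
	names := make([]string, 0, len(seen))
	for name := range seen {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// featuresForBuild returns the experimental features an integration suite
// should enable for a given build kind. Assumption: only "nightly" opts in,
// so PR and release builds keep exercising the default configuration.
func featuresForBuild(buildKind, raw string) []string {
	if buildKind != "nightly" {
		return nil
	}
	return parseExperimentalList(raw)
}

func main() {
	// Hypothetical feature names, as a CI job might pass them in.
	raw := "feature_b, feature_a,feature_b"
	for _, kind := range []string{"pr", "nightly", "release"} {
		fmt.Printf("%s: %v\n", kind, featuresForBuild(kind, raw))
	}
}
```

Keeping the decision in one small function would let the nightly job and local runs share the same logic, with the build kind as the only switch.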

Feature promotion process

Feature flags and experimental config flags are two disjoint means of disabling functionality by default. Working with both of them as a developer can be confusing and hard to reason about.

Questions

  1. Should we consolidate a subset of experimental config with the feature flag framework and consider it as a different "phase" of a feature flag?
    1. This may also have added benefits with testing support since one framework can be used to turn on "in-progress"/"alpha" features.
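As a rough illustration of the "phase" idea, a single registry could track each feature's phase and apply different enablement rules per phase. This is only a sketch under assumptions: the `Phase` values, the `Registry` type, and the `testOnly` switch are hypothetical names chosen for the example and do not reflect the existing feature flag framework's API.

```go
package main

import "fmt"

// Phase describes how far along a feature is. The names are illustrative.
type Phase int

const (
	InDevelopment Phase = iota // still landing across PRs; test-only
	Experimental               // complete, off by default, user opt-in
	GA                         // on by default for everyone
)

// Registry tracks each feature's phase and which features were opted in to.
type Registry struct {
	phases  map[string]Phase
	enabled map[string]bool
}

// NewRegistry builds a registry from a feature-name-to-phase table.
func NewRegistry(phases map[string]Phase) *Registry {
	return &Registry{phases: phases, enabled: make(map[string]bool)}
}

// Enable opts in to a feature. In-development features can only be enabled
// when testOnly is true, e.g. from an integration test harness.
func (r *Registry) Enable(name string, testOnly bool) error {
	phase, ok := r.phases[name]
	if !ok {
		return fmt.Errorf("unknown feature %q", name)
	}
	if phase == InDevelopment && !testOnly {
		return fmt.Errorf("feature %q is still in development", name)
	}
	r.enabled[name] = true
	return nil
}

// IsEnabled reports whether a feature is active: GA features always are,
// earlier phases only after a successful Enable call.
func (r *Registry) IsEnabled(name string) bool {
	if phase, ok := r.phases[name]; ok && phase == GA {
		return true
	}
	return r.enabled[name]
}

func main() {
	reg := NewRegistry(map[string]Phase{
		"feature_a": InDevelopment,
		"feature_b": Experimental,
		"feature_c": GA,
	})
	fmt.Println(reg.Enable("feature_a", false)) // rejected outside tests
	fmt.Println(reg.Enable("feature_b", false)) // user opt-in succeeds
	fmt.Println(reg.IsEnabled("feature_b"), reg.IsEnabled("feature_c"))
}
```

With this shape, promoting a feature would be a one-line phase change in the table rather than a move between two separate mechanisms, and tests could enable in-progress features through the same entry point.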

edwbuck commented 10 months ago

+1 for the merging of experimental config into the feature config, with logging / warnings that an "experimental feature" has been enabled.

Benefits of the approach:

Possible cons:

If we move in this direction, I would suggest adding an extra "--experimental" flag / config element that permits the process to continue running when an experimental feature is enabled. Without it, the server would shut down with an error stating that "feature X is enabled in a non-experimental deployment".
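A minimal sketch of that startup check, assuming the server knows which enabled features are experimental and whether the proposed "--experimental" option was set. The function name, parameters, and feature names are hypothetical; only the error wording comes from the suggestion above.

```go
package main

import "fmt"

// checkExperimentalGate inspects the enabled features at startup.
// If an experimental feature is enabled without the deployment being marked
// experimental, it returns an error so the caller can shut down. Otherwise
// it returns one warning per enabled experimental feature for logging.
func checkExperimentalGate(allowExperimental bool, enabled []string, experimental map[string]bool) ([]string, error) {
	var warnings []string
	for _, name := range enabled {
		if !experimental[name] {
			continue // stable features need no gate
		}
		if !allowExperimental {
			return nil, fmt.Errorf("feature %q is enabled in a non-experimental deployment", name)
		}
		warnings = append(warnings, fmt.Sprintf("experimental feature %q has been enabled", name))
	}
	return warnings, nil
}

func main() {
	// Hypothetical feature names for illustration.
	experimental := map[string]bool{"feature_x": true}
	enabled := []string{"stable_y", "feature_x"}

	// Without the opt-in, startup would be refused.
	if _, err := checkExperimentalGate(false, enabled, experimental); err != nil {
		fmt.Println("startup refused:", err)
	}

	// With the opt-in, the feature runs and a warning is surfaced.
	warnings, _ := checkExperimentalGate(true, enabled, experimental)
	for _, w := range warnings {
		fmt.Println("warning:", w)
	}
}
```

Returning warnings instead of logging directly keeps the check easy to unit test and leaves the choice of logger to the caller.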