Improve testing processes for new features initially marked as experimental

For larger features or core changes to the system that are deemed to be higher risk, the project has grown a high-level process for graduating these features from development to GA:

Develop the changes behind feature flags using the feature flag framework so that the changes can be staggered across PRs/releases without any impact to users
Once all code and tests have been merged, remove the feature flag from the code and allow users to enable the feature through a configuration flag in the experimental configuration section
Let the feature "bake" for a full minor release cycle to weed out issues, relying solely on testing signal from early adopters of the feature in the community
Enable the feature by default in the following minor release and either:
1. Remove the experimental configuration flag to make it the behavior for all users OR
2. Invert the flag so that users can optionally disable the new feature for a time period

Over time, some challenges with this process have emerged:

Low signal to maintainers about stability of experimental features since it depends only on community feedback from users who are interested/motivated to try out the upcoming changes
Confusion among developers in the community about the distinction between a feature flag and an experimental config flag, and about which mechanism suits which use case

Some early thoughts/questions on ways that we may improve this process:

Testing

Historically the project has not run integration tests exercising features that are planned for graduation from experimental to GA. This generally feels like a miss, since the maintainers' primary view of project stability is through CI builds (PR, nightly, release).

Questions

If we ran experimental features in integration tests, in which build should we run them?
1. Initial reaction is that the nightly build seems appropriate: it doesn't disrupt developer flows, and it gives maintainers more frequent signal than release builds, which may help suss out race conditions / flaky behaviors
What kind of tooling / code support do we need to manage enablement of experimental features in CI?
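
One possible shape for the CI tooling question is a single environment variable that the nightly build sets and the test harness reads. The sketch below is only an illustration: the variable name `EXPERIMENTAL_FEATURES` and the helper `parseFeatureList` are hypothetical and do not exist in the project.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// parseFeatureList splits a comma-separated list of feature names,
// trimming whitespace and dropping empty entries.
// Hypothetical helper, not part of any existing framework.
func parseFeatureList(raw string) []string {
	var names []string
	for _, part := range strings.Split(raw, ",") {
		name := strings.TrimSpace(part)
		if name != "" {
			names = append(names, name)
		}
	}
	return names
}

func main() {
	// EXPERIMENTAL_FEATURES is an assumed variable name that a nightly
	// CI job could set; PR builds would simply leave it unset.
	features := parseFeatureList(os.Getenv("EXPERIMENTAL_FEATURES"))
	fmt.Printf("experimental features enabled for this run: %v\n", features)
}
```

With this approach the PR build is unaffected when the variable is unset, and the nightly job opts in by exporting the list before running the integration suite.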

Feature promotion process

Feature flags and experimental config flags are two disjoint means of disabling functionality by default. Working with both of them as a developer can be confusing and hard to reason about.

Questions

Should we consolidate a subset of experimental config with the feature flag framework and consider it as a different "phase" of a feature flag?
1. This may also have added benefits with testing support since one framework can be used to turn on "in-progress"/"alpha" features.
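
To make the "phase" idea concrete, here is a minimal sketch of a feature flag that carries a lifecycle phase. All names (`Phase`, `Feature`, `Enabled`) are hypothetical and are not the existing feature flag framework's API; the phases simply mirror the graduation steps described at the top of this issue.

```go
package main

import "fmt"

// Phase is a hypothetical lifecycle stage for a feature.
type Phase int

const (
	Development  Phase = iota // code still landing across PRs; never user-visible
	Experimental              // complete, but off unless the user opts in
	GA                        // on by default; user may still opt out for a time
)

func (p Phase) String() string {
	switch p {
	case Development:
		return "development"
	case Experimental:
		return "experimental"
	case GA:
		return "ga"
	}
	return "unknown"
}

// Feature pairs a feature name with its current phase.
type Feature struct {
	Name  string
	Phase Phase
}

// Enabled resolves whether the feature is active. userSetting is nil when
// the user's configuration does not mention the feature.
func (f Feature) Enabled(userSetting *bool) bool {
	switch f.Phase {
	case Development:
		return false // user configuration cannot enable unfinished work
	case Experimental:
		return userSetting != nil && *userSetting
	default: // GA
		return userSetting == nil || *userSetting
	}
}

func main() {
	on := true
	f := Feature{Name: "example_feature", Phase: Experimental}
	fmt.Println(f.Name, f.Phase, f.Enabled(nil), f.Enabled(&on))
}
```

Under this model, promoting a feature means changing its `Phase` in one place, and the same user-facing setting keeps working before and after graduation.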

+1 for the merging of experimental config into the feature config, with logging / warnings that an "experimental feature" has been enabled.

Benefits of the approach:

Configuration files don't need to be rewritten when the experimental feature becomes a stable feature.
Less variety in configuration files enables an easier path to testing the feature, experimental or otherwise.
The experimental section is a bit free-form, with servers happily accepting experimental features that don't exist (a typo during previous db-event testing didn't raise any errors).
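
The last point could be addressed by checking configured names against a registry of known features at startup. The following is a rough sketch under that assumption; `knownFeatures`, `unknownFeatures`, and `validate` are hypothetical names, and the registry entries are placeholders.

```go
package main

import (
	"fmt"
	"sort"
)

// knownFeatures is a hypothetical registry of every feature name the
// binary understands. The entries here are placeholders.
var knownFeatures = map[string]bool{
	"feature_a": true,
	"feature_b": true,
}

// unknownFeatures returns, in sorted order, the requested names that are
// not present in the registry.
func unknownFeatures(requested []string) []string {
	var unknown []string
	for _, name := range requested {
		if !knownFeatures[name] {
			unknown = append(unknown, name)
		}
	}
	sort.Strings(unknown)
	return unknown
}

// validate fails when the configuration names a feature that does not
// exist, so a typo is reported instead of being silently accepted.
func validate(requested []string) error {
	if unknown := unknownFeatures(requested); len(unknown) > 0 {
		return fmt.Errorf("unknown feature(s) in configuration: %v", unknown)
	}
	return nil
}

func main() {
	// "featrue_b" is a deliberate typo.
	fmt.Println(validate([]string{"feature_a", "featrue_b"}))
}
```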

Possible cons:

Configuration for features will need a bit more forethought, as there's no second pass to completely rewrite the configuration values.

If we move in this direction, I would suggest that we add an extra "--experimental" flag / config element which permits the process to continue running if an experimental feature is enabled. Without it, the server shuts down with an error detailing that "feature X is enabled in a non-experimental deployment".
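
A minimal sketch of that startup gate, combined with the warning on enablement suggested earlier, might look like the following. The function `checkExperimental` and its parameters are hypothetical; `allowExperimental` stands in for the proposed "--experimental" flag / config element.

```go
package main

import "fmt"

// checkExperimental inspects the enabled features. If an experimental
// feature is enabled and allowExperimental is false, it returns an error
// so the caller can shut down. Otherwise it returns one warning per
// enabled experimental feature for the caller to log.
func checkExperimental(enabled []string, experimental map[string]bool, allowExperimental bool) ([]string, error) {
	var warnings []string
	for _, name := range enabled {
		if !experimental[name] {
			continue // stable feature, nothing to check
		}
		if !allowExperimental {
			return nil, fmt.Errorf("feature %q is enabled in a non-experimental deployment", name)
		}
		warnings = append(warnings, fmt.Sprintf("experimental feature %q has been enabled", name))
	}
	return warnings, nil
}

func main() {
	// Placeholder feature names for illustration only.
	experimental := map[string]bool{"new_thing": true}
	enabled := []string{"stable_thing", "new_thing"}

	// Without the opt-in, startup would be refused.
	_, err := checkExperimental(enabled, experimental, false)
	fmt.Println("without opt-in:", err)

	// With the opt-in, startup continues and warnings are logged.
	warnings, _ := checkExperimental(enabled, experimental, true)
	for _, w := range warnings {
		fmt.Println("warning:", w)
	}
}
```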
