opensearch-project / .github

Provides templates and resources for other OpenSearch project repositories.

Clarifying "experimental features" in OpenSearch [DISCUS][BUG] #178

Open CEHENKLE opened 11 months ago

CEHENKLE commented 11 months ago

What is the bug?

While we have a description of what an "experimental feature" can be used for in OpenSearch, we don't actually describe what criteria need to be met to graduate from experimental to general availability (GA).

Additionally, while we provide a general heuristic, we have plenty of features that launched without ever being experimental, including some that probably would have benefited from it. We leave the decision on both whether a feature should be marked as experimental and whether it's ready to graduate entirely up to the maintainers of the individual repos.

Finally, the OpenSearch Project does do security reviews for some features (you can see more information here). On the OpenSearch Project repos I've worked in, we (as maintainers) have considered a sign-off from AWS AppSec following this process to be a requirement for graduating out of experimental for some features. But like much of the process around experimental features, that requirement is not documented anywhere and is not universally adopted.

This has caused some confusion for contributors because the "rules of the road" have never been fully defined. So, I propose we define them :)

Here are a couple of questions to kick us off that are top of mind for me, but feel free to add your own:

0/ What purpose should experimental features serve? Do we still agree with this description?

1/ Should the OpenSearch Project have rules about a) which features need an "experimental" phase and b) what a feature needs in order to graduate to GA, and should those rules live at the project level, the repo level, or somewhere else?

Assuming we do want some kind of guidelines...

2/ Do we want criteria in place that, if your feature meets them, you must have an experimental phase?

3/ What criteria should we look at for graduation? Security? Scalability? Testability?

4/ Do we want to set a limit on how long a feature can be experimental?

5/ Who decides whether those criteria have been met?

Please note: I'd like to focus this discussion on what (if any) criteria the OpenSearch Project wants to have around experimental/GA features. Downstream providers and products, including AWS OpenSearch Service, will have their own criteria. We (the OpenSearch Project) can certainly craft our criteria to make a GA launch on any particular service easier, but they should still be considered separate processes. Additionally, while a downstream project may make decisions based on us (e.g. "we will not expose any experimental feature to our customers"), we should not take a dependency on them or their actions (e.g. "The OpenSearch Project will not mark a feature GA until XYZ Service agrees that it's GA").

WDYT?

elfisher commented 11 months ago

I personally like keeping it easy to introduce experimental functionality so that people can ship features at an early stage and get feedback. Right now the guidelines say experimental releases are used to gather early feedback on features.

For release criteria, I do think individual repo maintainers may have different criteria they care about, so there should be flexibility for the maintainers of each codebase to drive that definition.

peternied commented 11 months ago

> ...we (as maintainers) have considered a sign-off from AWS AppSec following this process to be a requirement for graduating out of experimental for some features. But like much of the process around experimental features, that requirement is not documented anywhere and is not universally adopted.

We have this process documented as security reviews [1] in this .github project. Inside the security plugin, the security review process is followed carefully, such as for OnBehalfOf tokens [2], where there is a task specifically waiting on the sign-off to be issued [3]. That said, I think we can do much better at documenting and following this for features (because of course a maintainer of the Security plugin thinks this :D).

peternied commented 11 months ago

I've just written and deleted a response to the "define experimental" question for the fourth time. Where I'm getting stuck is: what is the user problem? Why do we need the concept of experimental to have a consistent meaning?

nknize commented 11 months ago

Generally, in open source the leaner a process is, the better. Community members usually do not join meetings, so it's best to rely on good async habits (e.g., tldrs, scripts, lean issues, lite docs).

My experience and preference is to take it on a case-by-case basis, with a contributor first initiating a "proposal" vote to graduate/promote a feature from "experimental" to LTS. Then use lazy consensus for the vote, allowing valid vetoes (which require counter-proposals and/or technical arguments).

Perhaps the more interesting discussion is what to do if/when the vote passes. I'm not sure this needs heavy process either. I think this is also case by case, but in general it relies on good testing, benchmarking, and security review, which we should have for LTS features anyway.

dblock commented 11 months ago

tl;dr I think we should add recommendations, but I don't think we should try to enforce anything.

> 0/ What purpose should experimental features serve? Do we still agree with this description?

I'd keep it. I like that it has strong warnings around (not) using experimental features in any way other than "at your own risk".

> 1/ Should the OpenSearch Project have rules about a) which features need an "experimental" phase and b) what a feature needs in order to graduate to GA, and should those rules live at the project level, the repo level, or somewhere else?

I would limit ourselves to recommendations. I believe most features should start as experimental because it's faster, and because new APIs have to follow semver; some time as experimental gives us more room to get them right with user feedback.
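To make that concrete, here's a minimal sketch of one way a repo could gate an experimental code path behind an opt-in flag. The class, property name, and methods are hypothetical, not an existing OpenSearch API:

```java
// Hypothetical sketch, not an existing OpenSearch API: an experimental code
// path gated behind a JVM system property, so users must opt in explicitly
// and the GA behavior stays the default.
public final class ExperimentalFeatures {

    private ExperimentalFeatures() {}

    /** True only if the named experimental feature was enabled at startup. */
    public static boolean isEnabled(String featureName) {
        return Boolean.parseBoolean(System.getProperty(
            "opensearch.experimental.feature." + featureName + ".enabled", "false"));
    }

    // Example call site: callers fall back to the stable path by default.
    static String search(String query) {
        if (isEnabled("new_search_api")) {
            return experimentalSearch(query); // may change or be removed without notice
        }
        return stableSearch(query);
    }

    private static String experimentalSearch(String query) { return "experimental: " + query; }

    private static String stableSearch(String query) { return "stable: " + query; }
}
```

Defaulting the flag to "false" means reaching experimental behavior always requires an explicit action by the user, which fits the "at your own risk" framing above.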

> 2/ Do we want criteria in place that, if your feature meets them, you must have an experimental phase?

I would not, but would agree to have soft recommendations in that same doc.

> 3/ What criteria should we look at for graduation? Security? Scalability? Testability?

We already say that all new features need a security review. It goes without saying that any feature needs to be unit and integration tested; if we haven't spelled that out, I'd add those words. Scalability is a good question. I think we can add some recommendations, but I wouldn't hold back an experimental feature just because it only works on a single-node cluster, for example.
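As a hypothetical illustration of how that testing can stay opt-in while a feature is experimental, a JUnit 5 test can skip itself unless the flag is set (class and property names here are made up):

```java
// Hypothetical sketch: a test for an experimental code path that is skipped,
// not failed, unless the feature flag is set, so the default CI run only
// exercises GA behavior. Class and property names are illustrative.
import static org.junit.jupiter.api.Assertions.assertTrue;
import static org.junit.jupiter.api.Assumptions.assumeTrue;

import org.junit.jupiter.api.Test;

class ExperimentalFeatureTests {

    @Test
    void experimentalPathWorksWhenOptedIn() {
        // Boolean.getBoolean returns true only when the system property is "true".
        assumeTrue(Boolean.getBoolean("opensearch.experimental.feature.new_search_api.enabled"),
            "experimental flag not set, skipping");

        // ... exercise the experimental path and assert on its behavior ...
        assertTrue(true); // placeholder assertion for the sketch
    }
}
```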

> 4/ Do we want to set a limit on how long a feature can be experimental?

I would not, unless there are strong arguments to do so. It would force maintainers into removing experimental features, and I don't see any advantages, especially since maintainers are already free to do so at any time.

> 5/ Who decides whether those criteria have been met?

Maintainers of each repo.