opensearch-project / .github

Provides templates and resources for other OpenSearch project repositories.

[RFC][PROPOSAL] Building a public roadmap for OpenSearch #196

Closed Pallavi-AWS closed 1 month ago

Pallavi-AWS commented 6 months ago

Update 20240912:

We have published a Roadmap blog here, please take a look and provide us feedback! https://opensearch.org/blog/opensearch-project-roadmap-2024-2025/


Is your feature request related to a problem? Please describe

The goal of this proposal is to streamline the contribution process in OpenSearch so that we can build a forward-looking roadmap. A roadmap can serve as a strategic plan that gives the community visibility into the high-level direction the project is headed in and a way to provide feedback. A forward-looking open source roadmap will also provide a good avenue for collaboration with the community and allow prioritization decisions to be made with community input. It will help align contributor efforts towards the product areas that the community prioritizes. The roadmap we will develop includes existing projects that different open source teams are working on, as well as new projects they are planning for the future. I would love to have a discussion and get your feedback on this roadmap proposal.

Describe the solution you'd like

As part of this proposal, we can do the following:

1) All components/contributors will create RFCs for the roadmap items they plan to work on in 2024/25 and will tag them with roadmap:2024 or roadmap:2025 label. We can agree on a reasonable time window to accept comments on RFCs (say 2 weeks) before initiating a PROPOSAL with design details and a META issue with release milestones and execution plan. If a project is already in execution and we have a reasonable META issue to tie all features under the project, we can include the META directly in the roadmap.

2) We will finalize high level themes we want to club the investments under. These themes will translate into tags/labels. These themes are different product areas we want to highlight our innovation in and will help unify different features/projects under broader umbrellas.

3) We will build the OpenSearch roadmap by consolidating all RFCs and METAs that are tagged for 2024 and 2025 and organize them by the high level themes.

After the roadmap is consolidated from the RFCs or META issues grouped by themes, we will follow it up with a blog post with narrative around the roadmap that is sourced from maintainers and contributors.

Opening RFCs

The "RFC" (request for comments) process is intended to provide a path to propose ideas or significant changes in OpenSearch so that the community can be confident about the direction of the project. The template currently being proposed for OpenSearch RFCs is https://github.com/opensearch-project/.github/pull/192.

Do note that bug fixes and documentation improvements can be implemented and reviewed via the normal GitHub pull request workflow, and do not need RFCs. Before opening an RFC, you do not need to flesh out an end-to-end design; however, some thought needs to be put behind the idea so that the RFC does not get rejected by the community.

Lifecycle of a project in OpenSearch

The lifecycle of a project in OpenSearch can have 3 high level artifacts - RFCs (for WHAT and WHY), PROPOSALs (for HOW) and META (for WHEN).

For major features being proposed, you can get early feedback through other means, such as public Slack, before opening RFCs. Formulate the RFC by documenting the motivation behind the proposal, what problems it will solve, alternative approaches, and a high level design (only if you have one). We can agree on a certain timeframe to seek feedback on the RFCs (say 2 weeks). We can also evolve this process to include voting on RFCs to decide on project prioritization.

For deeper design discussions, the suggestion is to use PROPOSALs (https://github.com/opensearch-project/.github/blob/main/.github/ISSUE_TEMPLATE/PROPOSAL_TEMPLATE.md).

After a project moves into execution, the suggestion is to close the RFC and open a META issue covering the list of milestones and tasks, spread across releases (https://github.com/opensearch-project/OpenSearch/issues/11522).

High Level Themes/Product Areas

I am proposing the high level themes/categories below to start with; they will convert into tags in open source. We can adopt outcome-focused themes, with each theme spanning features across multiple components.

This is just an initial list, and it would be great to hear back from you on any new theme we should introduce. The tags for these themes will be different than the sub component labels we recently introduced to organize operations in open source.

Creating the OpenSearch roadmap

Once we have the RFCs for planned projects in the future or META issues for projects that are already in execution tagged with 2024 or 2025, we can create the OpenSearch roadmap by consolidating all RFCs/META issues. We can create a default view of the roadmap using the high level themes described above (Sample roadmap format https://github.com/orgs/opensearch-project/projects/200/). Once we have consolidated the roadmap at the Project level, we can follow it up with a blog post containing narrative around the key themes the OpenSearch Project is innovating in.
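The consolidation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the issue records are hand-written dictionaries rather than GitHub API responses, the helper name `build_roadmap` is hypothetical, and the theme names `Performance` and `Ease of use` are placeholders, since the final theme labels are still to be agreed in this thread.

```python
from collections import defaultdict

# Year labels as named in the proposal (roadmap:2024 / roadmap:2025).
YEAR_LABELS = {"roadmap:2024", "roadmap:2025"}


def build_roadmap(issues, themes):
    """Group roadmap-tagged issues by theme label.

    `issues` is a list of dicts with "title" and "labels" keys (an assumed
    shape for illustration). An issue carrying several theme labels is
    listed under each of them.
    """
    roadmap = defaultdict(list)
    for issue in issues:
        labels = set(issue["labels"])
        if not labels & YEAR_LABELS:
            continue  # not tagged for the roadmap, skip it
        for theme in themes:
            if theme in labels:
                roadmap[theme].append(issue["title"])
    return dict(roadmap)


# Made-up example data, not real issues.
issues = [
    {"title": "[RFC] Example feature A",
     "labels": ["roadmap:2024", "Performance"]},
    {"title": "[META] Example feature B",
     "labels": ["roadmap:2025", "Performance", "Ease of use"]},
    {"title": "[BUG] Not a roadmap item", "labels": ["bug"]},
]

for theme, titles in build_roadmap(issues, ["Performance", "Ease of use"]).items():
    print(theme, "->", titles)
```

Grouping by label rather than by repository is what lets RFCs and METAs opened in different repos show up in one themed view.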

peternied commented 6 months ago

@Pallavi-AWS Thanks for creating this RFC

Where do roadmap RFCs live?

All components/contributors will create RFCs [...]

I like this idea of being able to spot roadmap related RFCs - tactical question for you, where will these RFCs be created? Repos under the OpenSearch-Project could be viable; however I feel like that would also make it easy to lose RFCs that aren't tagged correctly and would prevent repositories that are outside the GitHub organization from participating directly.

I suggest they are created in a single repository, so that the roadmap RFCs can be browsed and monitored in a single location.

Roadmap item acceptance / rejection process

We will finalize high level themes [...] We will build the OpenSearch roadmap ...

I'm not sure who 'We' is in this proposal, or how they come to a decision. I believe this is the leadership committee, do I have that correct?

I think it would be useful to document workflow of this decision making process. As a developer I'm partial to using pull requests to document decisions - I would propose that decisions about roadmap items are codified by merged pull requests in a repository to include them with references.

This approach would make it easier to track what is currently slated on the roadmap(s) and facilitate discussion about its relevance and placement. If these document(s) representing the roadmap are in markdown it seems like they would be good candidates to publish directly on the project website.

Is this level of process visibility aligned with what you were thinking?

Roadmap format

I'm not sure who the persona viewing the project roadmap is. As someone that is deeply connected to OpenSearch's core repository, I imagine I would easily get overwhelmed with information that is off the screen or condensed too tightly. I am not a fan of using project boards for scenarios other than daily stand ups.

I think that, given the care and consideration they deserve, our roadmap items would be better conveyed with prose that is as short or as long as it needs to be to get the message across. By using a format such as prose written in markdown, we can use graphs or visuals to help illustrate features related to Dashboards.

To step back for a moment; if the audience of the roadmap is largely the folks that are curating the roadmap then they should use the format easiest to use among themselves. If the audience is prospective contributors or adopters of OpenSearch it might be useful to make it as approachable as possible.

Maybe you could cite some roadmaps for other products that communicate the kind of detail you would like to see for OpenSearch Project?

Pallavi-AWS commented 6 months ago

@peternied this is a sample roadmap from Jenkins, which aggregates key initiatives by themes. The audience for the roadmap is not just the folks curating the roadmap, but also contributors interested in the long-term direction the project is heading in and in finding opportunities to contribute or give feedback. RFCs/META issues can be opened at the individual repo level, and the OpenSearch release team will consolidate all RFCs/METAs from individual repositories to create the unified view of the roadmap. Let's start by curating all projects that OpenSearch contributors are already working on, or have planned in the near future. This will allow us to create a roadmap view, which is pretty difficult to gather today as we have not standardized on the process. In the next phase, we can get to a model where we prioritize based on a method we select for community voting.

anastead commented 6 months ago

Thanks Pallavi for driving this important discussion. I do have feedback on product areas: can we make them less granular? I was thinking of Observability, Storage, Indexing, and Query Engine as product areas, with Performance and Security as themes.

Pallavi-AWS commented 6 months ago

Thanks @anastead for the suggestion. We recently introduced new labels in OpenSearch to create components - search, indexing, storage, cluster manager, release & build etc. This was done to create component areas in core so that contributors and maintainers can triage issues in different areas they are experts in.

For the roadmap themes, my intention was to decouple from the newly created components and to create themes that reflect strategic objectives for the OpenSearch project (e.g. improving performance, scaling horizontally, ease of use, improving reliability, etc.). Each theme will contain features that span across multiple components. We can definitely revisit the themes to make them less granular before we create the labels.

peternied commented 6 months ago

@Pallavi-AWS Thanks for referencing the Jenkins roadmap - it looks like it also had a strong backing document [1] explaining the details of the roadmap process. This looks like a great model to emulate.

What kinds of feedback are you looking for on this RFC? In the absence of specific areas where you'd like to pursue consensus, I'd recommend creating a pull request of the roadmap guide to make the discussion concrete. What do you think?

elfisher commented 6 months ago

I'm glad we are exploring improving the roadmap and related processes, thanks @Pallavi-AWS! I do think there are a few things we want to account for.

  1. Roadmap items should focus on features and enhancements and not bugs, otherwise it will be too much information to follow. This was covered when we originally announced the roadmap https://opensearch.org/blog/opensearch-roadmap-announcement/
  2. I think we should separate "backlogs" and "the roadmap." Items in the roadmap should be things someone actually intends to build, whereas backlogs can be where we capture the longer tail of features and enhancements people want to see in the project.
  3. We can probably utilize our existing proposal template for RFCs. If someone wants to create an RFC without the proposed solution that's okay and similarly if they want to go all the way to proposing a solution that's also okay.
  4. I'd propose we keep the number of themes smaller to begin with and then add more later.
    1. Performance, Resiliency, and Scale
    2. Ease of use/Next gen Dashboards
    3. Observability/Log Analytics
    4. Security Analytics
    5. Search and ML
    6. Security

anastead commented 6 months ago

+1 on Eli's suggestions on up-leveling the items published and overall a few areas and themes.

samuel-oci commented 6 months ago

+1 on Eli's suggestions on up-leveling the items published and overall a few areas and themes.

another +1 :)

I suggest they are created in a single repository, so that the roadmap RFCs can be browsed and monitored in a single location.

This is an interesting suggestion; there were a few times when I also struggled to figure out the right place to publish an RFC that touched on more than a single plugin.

I think it would be useful to document workflow of this decision making process. As a developer I'm partial to using pull requests to document decisions - I would propose that decisions about roadmap items are codified by merged pull requests in a repository to include them with references.

@peternied There are the meeting minutes of the leadership committee that are published, are you suggesting to put those into PRs and merge as the best way of publication? Or are you referring to the blog that explains the reasoning to be codified in GitHub instead?

AmiStrn commented 6 months ago

@Pallavi-AWS thank you for this proposal. It tackles many issues with an elegant solution. I would like to comment regarding @peternied's reference -

We will finalize high level themes [...] We will build the OpenSearch roadmap ...

I'm not sure who 'We' is in this proposal, or how they come to a decision. I believe this is the leadership committee, do I have that correct?

It has been discussed (before this proposal) in the Leadership Committee that perhaps we (the LC) should take a role in building the roadmap, designate North Star(s) for the project, or provide strategic direction to the project. However, after reviewing @Pallavi-AWS's proposal and @samuel-oci's proposal I have come to the conclusion that the two, working together, provide 99.99% of the requirements. I agree with @Pallavi-AWS's reply that building the roadmap out of the RFCs that are being worked on is correctly placed in the hands of those in charge of the release process. I believe that the project maintainers' coverage of correctly curated RFCs is what completes the picture.

For example, if someone wanted to add a feature that is esoteric to most users - who are we (the LC) to block them? After all, if someone is willing to pick up their keyboard (if you aren't imagining a key-tar you are now) and contribute something we should always encourage them. Any form of gatekeeping from the governance side is harmful, whereas @samuel-oci's proposal suggests technical parameters (prescribed by repo maintainers) as the bar of entry for new code, which makes perfect sense.

Then there is the question of the roadmap - what gets to be in the upcoming version? What is pushed to the next? These are both technical questions in 99% of the cases we could think of, and in 100% of the actual cases we could gather from the past. Usually, the answer is semver (is the feature making breaking changes or not?).

We thought maybe about the case where Product managers are thinking about some new feature and they are thinking of a strategy to roll it out and perhaps this will be in the "wrong direction" for the project. To that, I say - anyone can sit in a cave and make plans about the project's future, however, it only really matters to the project when they exit the cave and submit an RFC for the feature.

navneet1v commented 6 months ago

I'd propose we keep the number of themes smaller to begin with and then add more later.

  1. Performance, Resiliency, and Scale
  2. Ease of use/Next gen Dashboards
  3. Observability/Log Analytics
  4. Security Analytics
  5. Search and ML
  6. Security

@elfisher and @Pallavi-AWS I would like to add Vector Search as one of the themes here. Here is my thought process: as I see it, Vector Search does not fit into any of the above themes. ML is too generic for Vector Search, and Vector Search scales and works very differently from normal text search, so putting it under Search is also not fair, as the challenges are completely different.

Another thing is that OpenSearch is positioned as a vector database, and I see enough meat there that it deserves a separate theme of its own.

Also @Pallavi-AWS, I see that your original proposal has an Advanced Search Applications theme with three items under it (Hybrid Search, Neural Search, Vector Search). We could probably rename it to Search Relevancy (which would include Hybrid Search, Semantic Search, Neural Search, etc.). A couple of years back, Vector Search may have been an advanced search application, but with the new wave of LLMs it no longer is; it deserves its own separate item.

cc: @vamshin

anastead commented 6 months ago

@navneet1v I am voting for keeping vector search in the Search and ML high level theme. We want to make sure we are not proliferating too many themes.

anastead commented 6 months ago

@AmiStrn thank you for your thoughtful insights. +1

Pallavi-AWS commented 6 months ago

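Thanks @AmiStrn @samuel-oci @anastead @elfisher @navneet1v @peternied for your review and suggestions. Can we agree on the following new roadmap theme labels in OpenSearch? I modified the themes logically to club key areas such as availability and modularization:

  • StAR (Stability, Availability, Resiliency)
  • CoPS (Cost, Performance, Scale)
  • Ease of use (this will include Next gen Dashboards, Clients, Migrations)
  • Observability/Log Analytics
  • Search and ML
  • Security
  • Security Analytics
  • Modular Architecture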

Cross cutting RFCs that span multiple repos can be opened at the OpenSearch project level or at the core - the roadmap curation process will pull the RFCs/metas from all repos and at the broader project level.

If we agree on the above, I will go ahead and create these theme based labels and work with the maintainers to have the RFCs tagged with the new labels. Very soon we can have our forward looking roadmap that is completely sourced from the contributions in the community and categorized by high level themes. Thanks.

anastead commented 6 months ago

I like these high level themes. I would combine modular architecture into the ease of use section and rename Advanced Search and ML Applications to Search and ML Applications.

AmiStrn commented 6 months ago

Regarding the themes, I think they are a good grouping. I agree with @anastead's comment about Modular Architecture being in ease of use, and anything that is not there would fall into performance or stability.

Regarding Cost, I think that better performance means better cost only if you choose to scale down because performance is good enough with fewer/smaller machines. Judged on their own, the tasks are normally performance improvements. Were there RFCs targeting cost on its own? For example, using S3 as remote storage has benefits for cost, but these rely on query pattern assumptions. CoPS as an acronym works nicely, and I have no objection to it, but it did stand out as an odd theme.

Regarding Security - how do we differentiate project security from the plugin? Where do CVEs land?

What is the purpose of the high-level themes? If it is for an executive view of the project roadmap then they are fine, but for automating actions you should consider that the themes are not disjoint groups of features. Some features may be both CoPS and StAR. This may lead to people having discussions about where a feature should be. I would like us to avoid these empty discussions, as they would spawn from an administrative constraint and not a technical one related to the project.

samuel-oci commented 6 months ago

I like these high level themes. I would combine modular architecture into the ease of use section and rename Advanced Search and ML Applications to Search and ML Applications.

I like the themes as well :) Regarding the suggestion to combine ease of use and modular architecture/extendability, I think it makes sense to keep those separate for now, since it sounds like they serve different personas:

  • Ease of use - users (e.g. dashboard)
  • Modular architecture, extendability - developers

AmiStrn commented 6 months ago

Ease of use - users (e.g. dashboard); modular architecture, extendability - developers

I'm convinced, +1 for that

Pallavi-AWS commented 6 months ago

@AmiStrn @samuel-oci - I will keep modular architecture/extensibility separate from ease of use as it potentially targets different personas (developer vs. user).

@AmiStrn - we will have RFCs that explicitly target cost reduction, e.g. a new auto-tiering storage experience based on segrep/remote store/data lake integration. The CoPS theme can include all price-performance-scale innovations.

Security will encompass default security posture hardening (e.g. scoped down credentials to avoid cross cluster impact), CVEs cutting across components, and advanced security features (e.g. SAML enhancements) that will have changes in the security plugin, in addition to other core components. Themes need not be disjoint, and some RFCs can end up in more than one theme.

Thanks a lot for your passion and interest in streamlining the OpenSearch roadmap process @AmiStrn and @samuel-oci, really appreciate it. I will have the theme labels created.

samuel-oci commented 6 months ago

Thank you @Pallavi-AWS! I don't know how to sign an RFC description, but in the absence of such a button in GitHub consider this message my signature.

mch2 commented 6 months ago

Thanks @AmiStrn @samuel-oci @anastead @elfisher @navneet1v @peternied for your review and suggestions. Can we agree on the following new roadmap theme labels in OpenSearch? I modified the themes logically to club key areas such as availability and modularization:

  • StAR (Stability, Availability, Resiliency)
  • CoPS (Cost, Performance, Scale)
  • Ease of use (this will include Next gen Dashboards, Clients, Migrations)
  • Observability/Log Analytics
  • Search and ML
  • Security
  • Security Analytics
  • Modular Architecture

Cross cutting RFCs that span multiple repos can be opened at the OpenSearch project level or at the core - the roadmap curation process will pull the RFCs/metas from all repos and at the broader project level.

If we agree on the above, I will go ahead and create these theme based labels and work with the maintainers to have the RFCs tagged with the new labels. Very soon we can have our forward looking roadmap that is completely sourced from the contributions in the community and categorized by high level themes. Thanks.

Have applied this list to the core repo - https://github.com/opensearch-project/OpenSearch/labels?q=Roadmap. I don't have permissions to apply it across all repos. @bbarani, is this something you can help with? I can provide the scripts.

anastead commented 6 months ago

+1 agreed @mch2

Pallavi-AWS commented 6 months ago

Thanks @mch2 for creating the new roadmap labels in core. @dblock, @rishabh6788, @prudhvigodithi, @gaiksaya or @peterzhuamazon do you have permission to apply across all repos? These are the base set of roadmap labels that will apply across all repos (not opt-in).

dblock commented 6 months ago

Tagging @opensearch-project/admin, we usually create labels across all repos using this

prudhvigodithi commented 6 months ago

Let me take care of this, I will post here once the labels are created.

prudhvigodithi commented 6 months ago

Let me take care of this, I will post here once the labels are created.

The labels are created for all the existing non archived (archived:false) repos. @Pallavi-AWS @dblock @bbarani
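For readers wondering what applying the labels across repos involves, here is a minimal Python sketch. It is not the script actually used in this thread. The theme names come from the list above; the `Roadmap:` prefix, the colour, the helper name `plan_label_requests`, and the repo records are assumptions for illustration. The sketch only builds the requests and sends nothing; the URL and payload follow the shape of GitHub's REST "create a label" endpoint (`POST /repos/{owner}/{repo}/labels`).

```python
# Theme names as agreed in this thread.
THEMES = [
    "StAR (Stability, Availability, Resiliency)",
    "CoPS (Cost, Performance, Scale)",
    "Ease of use",
    "Observability/Log Analytics",
    "Search and ML",
    "Security",
    "Security Analytics",
    "Modular Architecture",
]


def plan_label_requests(org, repos, themes, prefix="Roadmap:", color="ededed"):
    """Return (url, payload) pairs, one per theme per non-archived repo.

    `repos` is a list of dicts with "name" and "archived" keys (assumed
    shape). The prefix and colour are hypothetical defaults.
    """
    planned = []
    for repo in repos:
        if repo["archived"]:
            continue  # mirrors the archived:false filter mentioned above
        url = f"https://api.github.com/repos/{org}/{repo['name']}/labels"
        for theme in themes:
            planned.append((url, {"name": prefix + theme, "color": color}))
    return planned


# Made-up repo records for demonstration.
repos = [
    {"name": "example-active-repo", "archived": False},
    {"name": "example-archived-repo", "archived": True},
]

for url, payload in plan_label_requests("opensearch-project", repos, THEMES):
    print(url, payload["name"])
```

Separating planning from sending makes it easy to review the full list of labels and target repos before anything is changed.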

reta commented 5 months ago

Thanks a lot for formalizing the roadmap, @Pallavi-AWS, it looks great! One question I have though: in the previous discussions, we agreed that RFCs, once the discussions are over and the agreement to move forward has been reached, are going to be closed in favour of META issues (or just issues in case the changes are small enough). What would be the target for the roadmap in this case: RFC (closing it is not a deliverable) or META/issue (the actual deliverables)? Thanks again!

Pallavi-AWS commented 5 months ago

Hi @reta, after an agreement to move forward with an RFC has been reached, we should have a META issue related to the RFC with milestones/execution plan. Roadmap will be a combination of META issues (i.e. RFCs in execution) and RFCs that are still being discussed as ideas.

Pallavi-AWS commented 3 months ago

8 Roadmap Labels have been introduced across all repos now:

  1. Search and ML
  2. Observability/Log Analytics
  3. StAR (Stability, Availability, Resiliency)
  4. CoPS (Cost, Performance, Scale)
  5. Ease of use
  6. Security
  7. Security Analytics
  8. Modular Architecture

The next action item is to tag RFCs/METAs/Feature Proposals with the new labels in order to pull them into the Roadmap Dashboard.
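As a rough illustration of how tagged items could be pulled together, the Python sketch below builds a GitHub issue-search query for one theme label. The helper name `roadmap_search_query` is hypothetical, and the exact label text is an assumption: the labels actually created may carry a prefix or different wording. The qualifiers used (`org:`, `is:issue`, `is:open`, `archived:false`, `label:`) are standard GitHub search syntax.

```python
def roadmap_search_query(org, theme_label):
    """Build a GitHub search query for open issues carrying one label."""
    return f'org:{org} is:issue is:open archived:false label:"{theme_label}"'


# Label names here are assumed; check the real label text before use.
for theme in ["Security", "Ease of use"]:
    print(roadmap_search_query("opensearch-project", theme))
```

The resulting string can be pasted into the GitHub search box or passed as the `q` parameter of the issue search API.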

Pallavi-AWS commented 2 months ago

We'll break out Search and Vector Database/GenAI as separate top level themes for the roadmap.

1) Vector Database/GenAI
2) Search
3) Observability and Log Analytics
4) Ease of use
5) Cost, performance, scale
6) Stability, availability, resiliency
7) Security
8) Security Analytics
9) Modular architecture

getsaurabh02 commented 2 months ago

Thanks @Pallavi-AWS, it looks great! One question I had was whether Hybrid Search items should be tracked under "Vector Database/GenAI" or "Search", given the overlap.

reta commented 1 month ago

@Pallavi-AWS I think this issue could be closed since we released the roadmap to the public [1]?

[1] https://opensearch.org/blog/opensearch-project-roadmap-2024-2025/