opensearch-project / OpenSearch

🔎 Open source distributed and RESTful search engine.
https://opensearch.org/docs/latest/opensearch/index/
Apache License 2.0
9.44k stars 1.73k forks source link

[RFC] Enhanced Collaboration Process through Architecture Decision Records (ADR) #12153

Closed kiranprakash154 closed 6 months ago

kiranprakash154 commented 7 months ago

Abstract

This proposal introduces a structured process for enhancing collaboration and decision-making within the project's repository by integrating Architecture Decision Records (ADR). The current challenge lies in the limitations of GitHub Issues for in-depth discussions, specifically the absence of inline commenting which restricts seamless conversation flow on specific topics. This RFC proposes a workflow leveraging GitHub's pull request (PR) mechanism to facilitate detailed, context-rich discussions akin to collaborative platforms like Quip or Google Docs, without introducing external tooling.

Problem Statement

GitHub Issues, while effective for tracking tasks and bugs, falls short in supporting nuanced, threaded discussions on architectural decisions or design proposals. The key limitations include:

These limitations hinder effective collaboration, resulting in discussions that are less engaging and harder to navigate. The proposed solution aims to address these issues by integrating ADRs into the project's repository, leveraging the PR review process for richer collaboration.

Proposed Solution

The solution involves creating a dedicated folder within the project's repository, named Architecture Decision Records (ADR), to host architectural and design proposals. The process will unfold as follows:

  1. Proposal Submission: Authors create a detailed proposal using a predefined template and submit it as a Markdown file within the ADR folder through a pull request. Proposals include design rationales, alternatives considered, and visual aids (e.g., diagrams or images) as necessary.

  2. Community Review: Once the PR is open, the community can review the proposal document. GitHub's PR review feature allows for inline comments, enabling precise feedback and discussions directly linked to specific parts of the proposal. This simulates the seamless, context-rich discussion environment found in platforms like Quip or Google Docs.

  3. Labeling for Organization: Relevant labels are applied to the PR to categorize and facilitate easier navigation through different architectural discussions. Labels can indicate the proposal's status (e.g., "under review", "accepted"), topic area, or any other relevant classification.

  4. Iterative Refinement: Authors address the feedback by making revisions to the proposal document. The community continues to provide input until a consensus is reached or the proposal is deemed satisfactory by the project maintainers.

  5. Decision Acceptance: Once all concerns are addressed and the proposal meets the project's standards and requirements, a project maintainer merges the PR. This signifies the official acceptance of the decision, integrating it into the repository's ADR folder as a permanent record.

  6. Status Tracking: The merged ADR includes a section for tracking its implementation status, ensuring the decision is not only recorded but also acted upon.

Status Categories

To facilitate clear communication and tracking of each ADR's lifecycle, the following status categories are defined:

Benefits

Conclusion

By adopting this ADR integration process, we can significantly improve the way architectural decisions are discussed, reviewed, and documented within the project. This approach not only enhances collaboration efficiency but also ensures that important decisions are transparently recorded and accessible, fostering a more inclusive and informed community.

kiranprakash154 commented 7 months ago

@msfroh @dblock @nknize @Bukhtawar @sohami thoughts ?

nknize commented 7 months ago

I'd keep RFC documents out of the source dev repo. I'd also bias towards a simple process a newb can figure out w/o reading a doc (i.e. link a google doc to a github issue).

msfroh commented 7 months ago

GitHub Issues, while effective for tracking tasks and bugs, falls short in supporting nuanced, threaded discussions on architectural decisions or design proposals.

I don't think that's the real problem we face. I don't think templates are the problem or the solution, I think the approach that people take is wrong. It's a culture issue and we need to fix the culture.

From my perspective, the problem is that developers (and while we shouldn't generally be pointing fingers, I'm going to be vocally self-critical and call out that it's developers from Amazon) dump massive fully-fleshed docs with sequence diagrams and defined milestones as "RFCs". Nobody is requesting comment with something like that. That's not how you conduct a collaborative design process.

When I'm faced with a giant wall of text in a GitHub issue, I have no way of understanding the implications or whether I should care. Generally, the first time that I seriously consider an idea is when I'm reviewing a PR, so I need to consider the impact to the codebase and the project.

If maintainers and other community stakeholders are involved in developing the ideas in the first place (rather than being given a massive "Hey, here's a fully fleshed plan that already went through internal design review, so we're gonna do it" doc, we can talk through those issues earlier.

harshavamsi commented 7 months ago

When I'm faced with a giant wall of text in a GitHub issue, I have no way of understanding the implications or whether I should care. Generally, the first time that I seriously consider an idea is when I'm reviewing a PR, so I need to consider the impact to the codebase and the project.

I would also like to point out the scenario when the opposite happens. There are issues created that are "tasks" as part of a meta issue. These issues usually have a sentence of two describing not really an issue but a "task" that a developer wants to prioritize. This is also painful imo since I have no idea what the dev wants to accomplish with the task and reviewing any change as part of the issue requires me to spend a lot of time trying to gauge the code to understand what is being accomplished.

I understand there might be issues that are fast follow fixes or issues that don't really require a lot of description, but when this happens for a vast majority issues, its a pain.

dblock commented 7 months ago

@kiranprakash154 What's an example that you've faced in which this process would have improved the outcome?

reta commented 7 months ago

I am with @msfroh & @nknize here, the ADR is by no means a bad idea but (to me):

nknize commented 7 months ago

If maintainers and other community stakeholders are involved in developing the ideas in the first place...

This is great in theory but not always the case. I don't expect the community to be involved in everything. That just leads to micromanagement. I'm okay with a company paying people to write and upstream a feature or change to core and I don't expect to be involved in that process. They're not paying me to be involved. I'm even okay with a lengthy proposal and corresponding PR(s) opened within minutes of each other (hoarding the code). This happens on Lucene all the time when Elastic, Solr, and Amazon developers code on their company's dollar. The open source mechanisms we have (or put) in place should prevent destructive or bad faith changes - not how those changes are designed or contributed. For example, if a lengthy design doc is opened, I'm not reading it. I will look at a PR, however. If that PR is 2K+ lines, I'm immediately requesting changes that the PR be split into smaller incremental changes so maintainers can digest the impact and/or consequences. Then I'm wearing my OSS hat when considering the contribution and not biasing toward anything other than the project itself (not any one company need). This is the culture that needs to be created w/ a good system of checks and balances.

Problems, however, do stem when maintainers blind approve (rubber stamp) a coworkers PR because of a deadline (re:invent). I've already seen this happen more than once.

msfroh commented 7 months ago

If maintainers and other community stakeholders are involved in developing the ideas in the first place...

This is great in theory but not always the case. I don't expect the community to be involved in everything. That just leads to micromanagement. I'm okay with a company paying people to write and upstream a feature or change to core and I don't expect to be involved in that process.

That's fair and that was poorly worded on my part. People can (and should) push their own ideas -- but preferably incrementally, recognizing that there may changes along the way.

If someone proposes an idea, as "Hey, I'm thinking of doing a thing" or even "Hey, I've already done a thing in my day job and my bosses are cool with me open sourcing it", and they get no feedback one way or the other, then I see no issue with taking it as a lazy consensus and moving forward. On the other hand, it also doesn't guarantee that the community will be comfortable with taking ownership of a big, hairy code change going forward.

Ownership, in my opinion, is the crux of the underlying problem. I think the big tech corporate world has a very different ownership model (where I'm using "ownership" essentially as a synonym for "responsibility") relative to open source.

As has been well-documented in books over the years, Amazon engineering is mostly made up of 2-pizza teams that "own" their software from end to end. You are arguably empowered to build what you want, but you also accept that if it breaks at 3am, you're going to be waking up to fix it. While that instills a certain level of responsibility, as someone who's been working in it for 12 years, I feel like it also permits taking risks since it's your risk. Similarly, if you cut some corners and take on some tech debt, it's your tech debt. (Or at worst, it's your team's risk/debt.) Nobody is going to stop you.

If we turn around and look at an open source project like OpenSearch, I feel like merging a PR means accepting it on behalf of a broader community. People you don't know will be downloading and installing this software. Developers outside your team and your company need to work in the codebase that you've contributed to for better or worse. I feel like maintainers should be thinking about what's best for the project, the contributors, and the users, which means scrutinizing and pushing back or at least asking questions about a given PR.

What scares me about the idea of an architecture decision record is that it feels like a club that a developer (frankly, an Amazon developer, since this really doesn't feel community-oriented) could wield to say, "Look -- there's a record that my design was approved by someone. You must accept it or the contract has been violated."

kiranprakash154 commented 7 months ago

Thanks for the comments, I think I clearly understand everyone's points of view, that is - to keep the barrier of entry as low as possible and flexible and to keep the process non-enterprise like.

andrross commented 7 months ago

I'm in agreement with @nknize, @msfroh, and @reta here and don't have a whole lot to add to the feedback they've already given.

Benefits Enhanced Collaboration: Inline commenting facilitates detailed, focused discussions, improving the quality of feedback . and engagement. Centralized Decision Record: Housing ADRs within the project repository ensures that architectural decisions are easily accessible and linked to their implementation. Streamlined Process: Utilizing the existing GitHub infrastructure simplifies the process, avoiding the need for additional tools and keeping the project's documentation cohesive.

@kiranprakash154 The benefits you listed do point to a real problem, or at least areas for improvement. I think we have agreement that Architecture Decisions Records are not the right solution to the problem, and you've gotten some feedback that the problem statement as you described may not be quite the right framing. But please don't get completely discouraged at pursuing this issue! I think improving and streamlining collaboration in the open and doing better at documenting the code and architecture of the project are all things we can improve!

kiranprakash154 commented 7 months ago

Thanks @andrross ! Yep, my main problem areas to solve was for in-line commenting and for all the architectural discussions to be retained in the repo and not in the silos of other 3rd party tools like Google Docs, Word, Quip etc.

But everyone here made valid comments on ADR increasing the barrier of entry and would add an enterprise like setup which is not well suited for an opensource community.

msfroh commented 7 months ago

I would like to echo @andrross's sentiments. Thank you @kiranprakash154 for raising this issue!

While I disagree with the specific approach proposed, I really appreciate you raising the issue and IMO it's been a good way to stir up what I think has been a productive conversation.

peternied commented 6 months ago

@kiranprakash154 Thanks for spending the time to right this up. For your consideration, while as OpenSearch - ADRs don't seem like they will become the new way from this issue - you can always use your template to build a feature and see what people think. Maybe after seeing your demonstration you'll pickup more advocates or refine its goals.


In case this sparks something in what you've been thinking, I built a proof of concept feature with a design doc inside it [1]. While it didn't really get the kind of review discussion I had idealized. It did work well to introduction the feature as part of a synchronous meeting.


I'm always going to suggest more mermaid diagrams in pull request / issues. Don't like learning mermaid? Use your favorite GenAI to make a diagram and tweak!

kiranprakash154 commented 6 months ago

I like what you did there @peternied, was your plan to merge the PR with that .md file or just keep that open only for PR reviews ?

kiranprakash154 commented 6 months ago

@msfroh @reta @nknize I think we need to enable Github Discussions in the Opensearch repo. Discussions seems to be built ground up for RFC and collaboration.

https://resources.github.com/devops/process/planning/discussions/

reta commented 6 months ago

@kiranprakash154 I think we discussed this a few times already but there are some hesitations since we have https://forum.opensearch.org/ already. Why we need both?

kiranprakash154 commented 6 months ago

To be honest, I was unaware of the existence of forum.opensearch.org. Given that Discussions is integrated with GitHub, I anticipate it might attract more participation. However, if we have thoroughly evaluated Discussions and deliberately chosen not to utilize it, I'm fine with that decision.

peternied commented 6 months ago

I like what you did there @peternied, was your plan to merge the PR with that .md file or just keep that open only for PR reviews ?

@kiranprakash154 Thanks! I did not merge that md file in the final PR. Unfortunately design documentation can quickly go out of date and leave behind more questions than answers if it seems like its providing guidance on what the product current state. Where I have merged design documents is more in the architectural space of a repository such as in the Security Plugin [1] where architectural changes will be rare and require much deliberation that justifies the time and effort to maintain.

peternied commented 6 months ago

[Triage - attendees 1 2 3 4 5] @kiranprakash154 It looks like this issue was reopened to discuss enabling GitHub Discussion, we already have the forums - I am closing out this issue. Please feel free to create a new issue for other topics in this space.