libp2p / go-libp2p

libp2p implementation in Go
MIT License

Proposal: consolidating issue tracking from 50+ repos to top-level repo #676

Closed raulk closed 1 year ago

raulk commented 5 years ago

Context and problem statement

  1. The libp2p core team is looking to improve our workflows and tooling for work structuring, planning and project management. 🙌

  2. The result will be increased transparency, accountability, clarity, diligence, towards (and within) the libp2p community and ecosystem. 🔎

  3. We're evaluating Zenhub. It's an overlay on top of Github, is OSS-friendly, and allows everyone to view the pipelines/workspaces publicly.

  4. Whichever tool we use, it needs to be snappy, otherwise we’ll all get frustrated quickly, adherence will plummet, and we’ll abandon it. 🏃💨

  5. Unfortunately, go-libp2p issue tracking is scattered across 50+ inner repos, which makes tooling extremely slow, and outright impracticable where a tool imposes a hard limit on the number of repos (like GitHub Projects – max. 25).

  6. In general, this creates friction and makes the project hard to approach and navigate. 🌊

    • To file an issue, you need to know in which repo to file it first. This harms UX.
    • In some cases, people file issues in go-libp2p, in other cases in the right repo. This creates ambiguity. Now an issue can be in two different places.
    • Searching is difficult; people frequently have to resort to Google (including me).
  7. Eventually, we'd also like to track work for cross-implementation epics (e.g. multistream 2.0, NAT hole punching, etc.) in a single workspace easily. This will enable the core team to drive alignment across Go, JS, Rust, Python, jvm, cpp, etc. implementations more efficaciously. 🚢

What we're going to do

Consolidation: We're planning to run an experiment to consolidate all issue tracking for go-libp2p-* inner repos under the top-level project (go-libp2p).

Labelling: We'll adopt a well-organised and clear labelling taxonomy to categorise issues. Inspiration: Kubernetes, Rust.
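
To make the labelling idea concrete, here is a minimal sketch of how an issue migrated from an inner repo could be given a component label derived from the repo name. The `area/` prefix and the `needs-triage` fallback are assumptions in the style of Kubernetes labels; the actual taxonomy is not defined in this thread, and `areaLabel` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// areaLabel derives a hypothetical "area/<component>" label from an inner
// repo name. The "area/" prefix is an assumption borrowed from
// Kubernetes-style labels; this thread does not define the taxonomy.
func areaLabel(repo string) string {
	const prefix = "go-libp2p-"
	if strings.HasPrefix(repo, prefix) && len(repo) > len(prefix) {
		return "area/" + strings.TrimPrefix(repo, prefix)
	}
	// Issues filed against the top-level repo, or against anything
	// unrecognised, are left for manual triage (assumed fallback label).
	return "needs-triage"
}

func main() {
	for _, repo := range []string{"go-libp2p-quic-transport", "go-libp2p-tls", "go-libp2p"} {
		fmt.Printf("%s -> %s\n", repo, areaLabel(repo))
	}
}
```

Deriving the label from the repo name keeps the origin of each migrated issue visible after consolidation, so filtering by component stays possible in a single tracker.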

Migration: The “transfer issue” feature of GitHub has now graduated from beta, so migrating issues from inner repos to their top-level counterparts should be straightforward. Alas, it's not available via the API for a batch migration, so we'll have to do this manually ⛏

Sunsetting the child issue trackers: We can then (a) disable the issue tracker in inner repos, or (b) keep it open with a pinned issue serving as a NOTICE forwarding to the appropriate top-level repo. I prefer (b) because it provides better navigation and less surprise. If we find users keep disregarding the NOTICE and opening unwanted issues, we can automate the transfer to the appropriate top-level repo via a bot.

Email notifications: The only concrete "regression" that has been pointed out to me is that some people only subscribe to GitHub notifications for specific repos.

Closing issues via keywords cross-repository: supported. See https://help.github.com/en/articles/closing-issues-using-keywords#closing-an-issue-in-a-different-repository.


By a show of emoji, please signal what you think about revamping our issue management in the manner outlined here.

If you’re opposed on any level, please refer to arguments rooted in evidence, facts and projections, and offer an alternative solution. Mere preference signalling is counterproductive here. Solving the issue/task overview and unification stands in the critical path of bringing more order, structure and clarity to libp2p project/product management (both technical and non-technical).

marten-seemann commented 5 years ago

Before commenting on the issue of consolidating the issues, which I will call the issue-mono-repo for this discussion, I'd like to ask the question that is underpinning this whole discussion.

Why do we keep code in separate repositories?

We do that because we believe that each of those repositories is a separate entity of code, that, while it can be used in conjunction with other libp2p packages, can also be used on its own (or with a small subset of other packages).

I'm aware that the question of separate repos vs. mono-repos is a discussion that is probably almost as old as software engineering itself, and has been discussed a bunch of times inside of Protocol Labs. People tend to have very strong opinions in either direction. The main reason for a mono-repo seems to be that it helps developer (and user) usability to have all the code in one place. The main argument for separate repos is that it makes sense to split software into small, independent sub-parts that can be used, tested and improved upon independently of the rest of the code base. I can see the arguments in both directions, and for the time being, I'm feeling agnostic towards this question. I'm happy to go with whatever we decide is best for libp2p.

Setting aside this controversial discussion, I believe that issues should live where the code lives:

The counter-argument to this of course is that there are users who're using libp2p as a whole, and that we can't expect them to dig through our code base to find the correct repository to open an issue in. I have a lot of sympathy for this argument, but don't think this requires us to consolidate all issues under go-libp2p. Instead, we need to create and communicate a workflow that encourages users to report issues. One way that comes to mind is telling users that it's ok to report issues at go-libp2p if they don't know the root cause of the problem. We can then use Github's new "transfer issue" feature to transfer the issue to the correct place (note that this is no more overhead than correctly categorizing and tagging the issue, which would be necessary in the issue-mono-repo proposal).

Thoughts on the Notification Problem

Aside from these conceptual considerations, subscribing and unsubscribing to individual repositories is one of the main features for me. For the most part, I'm currently only interested in the notifications of go-libp2p-quic-transport and go-libp2p-tls, and am loosely following what's going on in some other repositories. I unsubscribe from notifications from other repos as soon as they create too much noise in my inbox and distract me from focusing on my work. Creating an issue-mono-repo would leave me with no choice but to unsubscribe from that repo in order to escape the flood of notifications that are irrelevant for my work. This would also cut me off from notifications from the two repositories that I am actually maintaining, and where I'm trying to respond to issues, review PRs etc. in a timely manner.

fabioberger commented 5 years ago

Another small drawback is that you won't be able to use Github's handy fixes: keyword in PR descriptions and have it automatically close the issue on merge.

raulk commented 5 years ago

@fabioberger actually, GitHub does support cross-repository keyword triggers: https://help.github.com/en/articles/closing-issues-using-keywords#closing-an-issue-in-a-different-repository. I should’ve noted it in the body! Will edit.
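
For illustration, the cross-repository form puts the owner and repo in front of the issue number, as in `Fixes OWNER/REPO#NUMBER`. The sketch below just builds that string for a PR description; `closingReference` is a hypothetical helper, and this issue's own number is used purely as an example.

```go
package main

import "fmt"

// closingReference builds the text to put in a PR description so that
// merging the PR closes an issue tracked in a different repository.
// The format is: KEYWORD OWNER/REPO#NUMBER.
func closingReference(keyword, owner, repo string, issue int) string {
	return fmt.Sprintf("%s %s/%s#%d", keyword, owner, repo, issue)
}

func main() {
	// A PR in an inner repo referencing an issue in the top-level repo.
	// The issue number is only an example.
	fmt.Println(closingReference("Fixes", "libp2p", "go-libp2p", 676))
}
```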

yusefnapora commented 5 years ago

I like @marten-seemann's suggestion to direct new users to file issues to go-libp2p and then move them to the "leaf repos". But I don't think it addresses @raulk's motivation for the proposal, which (as I read it) is about getting a "bird's eye" or project-level view of issues across repos / modules. Since the tools available seem to fall over with a large number of repos, getting the bird's eye view seems to require consolidation...

I agree that having separate code repos with an issue-mono-repo seems like a bit of an awkward hybrid setup. Are we viewing this proposal as a kind of "trial balloon" for consolidation, to see if a mono-repo might work for code as well? I'm kind of into the mono-repo idea personally, but I definitely see the other view and see why it's contentious.

raulk commented 5 years ago

🚨 I specifically wanted to stay away from the mono-repo debate. 🚨

This discussion is about task/issue/bug/feature management, and facilitating workflows as a team. Code architecture will remain as-is within the scope of this debate. We tend to get very philosophical when it comes to modularity. And with good reason: modularity and composability are key principles of libp2p that will stay intact. However, this issue is about practicality of management, prioritisation, and workflows.

@marten-seemann, just a short overarching remark though. I believe you’re mixing modularity/composability/pluggability and component independence/“standalone-ness”.

Modularity and composability can be realised with a monorepo — have a look at Apache Camel as an example. It contains 200+ modules, all of which are released at the same time (they are not independent), and the user only depends on core + the modules they pick. Same with “baptised” Linux kernel modules. Modularity and composability are about APIs, not about physical code layout.

Conversely, none of the go-libp2p-* components are truly independent/standalone. They cannot be used outside of libp2p. They are designed to be plugged into libp2p via composition. Even the higher level protocols, e.g. Kad DHT, pubsub, etc. depend on all of the libp2p machinery. Despite hard-depending only on core abstractions, those abstractions have to be fulfilled at runtime by the rest of the libp2p stack.

Once again, this is not the place for the mono-repo discussion, but I did want to pull some strands apart that tend to get interweaved, at times tangling discussions that hit architectural topics even if tangentially.

marten-seemann commented 5 years ago

@yusefnapora Yeah, that's right, I don't know a lot about third-party Github tools around, so I can't comment on the features and shortcomings of any particular tools out there. I've never used Waffle, and in fact, I'm quite happy that it's gone now, because I found the auto-assignment of issues and PRs quite distracting, since it caused a bunch of email notifications that I couldn't unsubscribe from.

I realize that this probably doesn't apply to all people working on libp2p, but I for my part am happy with the Github tooling as it is, and am not planning to adopt any new tools in my workflow. As I'm unable to suggest any alternative tools for people who're unhappy with what Github is providing, I'm probably not the right one to say this, but to me it feels a bit weird to change the way we're organizing our issues in response to the shortcomings of one particular third-party tool.

@raulk I fully agree, and my intention is not to restart the mono-repo discussion here. What I tried to bring across in my previous post is that code organization and issue organization are not orthogonal problems, and to me it makes little sense to have a multi-repo for code and a mono-repo for issues (or vice versa).

raulk commented 5 years ago

We need tooling that works for our community, ecosystem, stakeholders, engineers and users, and that allows us to:

These elements are just as much part of making the libp2p project successful as is the code itself.

We are pampered by GitHub automatically creating an issue tracker for each repo. But for the better part of history, large projects were not managed by co-locating code with ticket/issue management (Linux, Chromium, Firefox, etc).

I believe this Github default is biasing people, wrongly making us believe that code needs to be colocated with the management tools. It is not the case.

I’m open to other options, but whatever we pick needs to facilitate technical and project management. Keeping things as they are does not. Just look at the number of times we’ve tried to triage the massive backlog and failed in the attempt.

yusefnapora commented 5 years ago

Just wanted to go on record that I'm for the proposal, btw. I'd rather have one project-level view that's possible to filter than many separate views that are (practically) impossible to aggregate. As long as it's not JIRA, I can definitely live with it 😄

raulk commented 5 years ago

Lol. The outlook of adopting JIRA is a great forcing function here.

@marten-seemann: I do think you bring a legitimate use case for any new management workflow: email notifications for scoped contributors. I’m thinking we can trivially put together a bot that pings specific users when issues are labelled with their labels of interest.
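
A minimal sketch of the routing logic such a bot could use, assuming a static map from labels to interested users. The label names, the handles `alice` and `bob`, and the function `mentionsFor` are all hypothetical; how the bot receives label events and posts the comment is left out.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// subscriptions maps a label to the GitHub handles that want to be pinged
// when an issue receives that label. All entries are hypothetical.
var subscriptions = map[string][]string{
	"area/quic-transport": {"alice", "bob"},
	"area/tls":            {"alice"},
}

// mentionsFor returns the de-duplicated, sorted @-mentions for the labels
// on an issue, ready to be posted as a comment by the bot.
func mentionsFor(labels []string, subs map[string][]string) string {
	seen := map[string]bool{}
	var users []string
	for _, label := range labels {
		for _, user := range subs[label] {
			if !seen[user] {
				seen[user] = true
				users = append(users, "@"+user)
			}
		}
	}
	sort.Strings(users)
	return strings.Join(users, " ")
}

func main() {
	labels := []string{"area/tls", "area/quic-transport", "kind/bug"}
	fmt.Println(mentionsFor(labels, subscriptions))
}
```

Labels with no subscribers produce an empty string, so the bot can skip commenting on issues nobody has asked to follow.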

lanzafame commented 5 years ago

So I have read your initial post @raulk, and to me (correct me if I miss the mark) there are two separate problems that the consolidation of issues into one repo is attempting to solve:

  1. Users reporting issues don't know where they need to create the issue because of all the repos. (user)
  2. Current tooling doesn't handle consolidating the hundreds of repos we have. (maintainer)

Both of these could be solved by consolidating issues into a single repo but I personally think this calls for two separate tools. There are other tools out there other than Zenhub that meet the requirements of the maintainer, i.e. https://github.com/marketplace/azure-boards, and I am sure there are more. As such, I think a decent tool analysis should be done before we undertake a change in how our issue repos are structured.

And there are other solutions to the single entry point problem for users reporting issues, like the many bug reporting/support tools that are out there.

My gut instinct on this entire issue is that GitHub gives us a hammer aka a repo, and makes us try to do mental gymnastics to see everything as a nail. I agree that what you outline is a problem, two problems to be specific, and I believe we need to do some more research to determine whether there are no other options available to us before taking the mono-issue-repo route.

EDIT: there is actually a third group, which is open source contributors who are very used to the GitHub model of open source (aka the colocation of issues and code). Changing that is a breaking change for them and their expectations. I don't mind doing this but it is something that we should keep in mind.

raulk commented 5 years ago

@lanzafame

Both of these could be solved by consolidating issues into a single repo but I personally think this calls for two separate tools.

Whichever tool we use, it needs to be a seamless, snappy, lightweight, opt-in overlay on top of GitHub. GitHub is our source of truth, and the team shuns duplicating work across tools, clunky integrations, or questionable UIs. Zenhub does a pretty good job here.

There are other tools out there other than Zenhub that meet the requirements of the maintainer, i.e. github.com/marketplace/azure-boards

  1. Who is the maintainer in this context? Most maintainers at PL are favourable to this approach; users are ACK'ing in this issue too. So far only @marten-seemann has pointed out a regression that affects his workflow, which is not a blocker and totally reconcilable in various ways.

  2. Aside from Zenhub, I've analysed these options: JIRA, Trello, Clubhouse, Asana. They don't work for various reasons which aren't relevant now.

i.e. github.com/marketplace/azure-boards

How does this compare to Zenhub, especially in terms of repo limitations/speed?

lanzafame commented 5 years ago

Whichever tool we use, it needs to be a seamless, snappy, lightweight, opt-in overlay on top of GitHub. GitHub is our source of truth, and the team shuns duplicating work across tools, clunky integrations, or questionable UIs. Zenhub does a pretty good job here.

I am not disagreeing with any of these, except that for Zenhub to remain 'snappy' it requires that we consolidate all repos' issues into a single repo, which to me suggests that it is not fit for purpose.

Most maintainers at PL are favourable to this approach; users are ACK'ing in this issue too.

My comment was neither for nor against; my point was that by splitting the problem we may be able to find tooling that supports both use cases without the upheaval to the project's issue trackers.

How does this compare to Zenhub, especially in terms of repo limitations/speed?

I can't test the speed due to lack of privileges, but it allows 100 repos to be connected to one project board. That said, I wasn't suggesting it as a surefire solution; it was just the result of a quick Google search for a tool that supports many repos.

My main point is that I think a proper requirements and tool analysis should be done before making such drastic changes to so many repos. If there is nothing that meets the needs of the different stakeholders, then I am all for this.

marten-seemann commented 5 years ago

@raulk My main point was not the issue with my workflow. Even if the notification issue was resolved, I'd still be opposed to the mono-issue-repo proposal. By avoiding the discussion about keeping our code in a mono-repo, in my opinion, we end up with the worst of both worlds in terms of architecture: we still have the dependency graph / code testing issues of the multi-repo approach, while at the same time giving up on the modularity that multi-repos are supposed to provide us.

raulk commented 5 years ago

Thanks for your input @marten-seemann @lanzafame. This discussion can go on indefinitely and we risk falling into analysis paralysis. So in the interest of making progress, I'll make my final remarks and move on.

lanzafame commented 5 years ago

I'm not sure where the assumption that we haven't comes from

Fairly simple, it wasn't communicated that you had...

So in the interest of making progress, I'll make my final remarks and move on.

@raulk Not sure why this is a proposal, it should just be an announcement as you have already made a decision. 👍

EDIT: I should mention that I don't really care if you have made the decision already, if you think it is best for everyone involved in the project, then go for it but don't make out as if there was any chance of swaying the decision, it just leads to frustration.

raulk commented 5 years ago

(self-quote) So in the interest of making progress, I'll make my final remarks and move on.

@lanzafame My bad; that came out a bit abrupt. What I intended to say is that I heard your arguments and I believe I addressed them on various fronts, up to the point that we'd just be going around in circles unproductively, and the majority in the libp2p team at PL and on this issue is favourable to revamping our workflows and tooling. So I'd much rather focus on making forward progress at this point, because this is time-critical for the libp2p team.

Since your concern was basically "have we evaluated other tools?" and the answer was yes, I'll just make my notes public:

lanzafame commented 5 years ago

ZenHub: overlay on GitHub (GH is source of truth); label management; Chrome extension; Kanban or other workflows; useful reporting; used by lots of big names; free for OSS; multiple workspaces.

Great, why does it require a mono-issue-repo?

Since your concern was basically "have we evaluated other tools?"

No! My concern is the creation of a mono-issue-repo. If the tooling is forcing that choice, then let's look at other tooling, hence the "have we evaluated other tools?".

raulk commented 5 years ago

It’s all stated in the description of this issue and follow-up comments.

ghost commented 5 years ago

Continuing https://github.com/libp2p/go-libp2p/issues/676#issuecomment-511406624 👍


On the main topic:

I'm a +1 on moving forward with *something* even if we don't have 100% agreement on it. The points made in this discussion seem reasonable to me, but ultimately we're going to have to pick an approach and some people are going to be unhappy or need to modify their workflows. It'll be for the greater good, though.

Warchant commented 5 years ago

Consider also https://zube.io - it has multirepo projects

raulk commented 5 years ago

Consider also zube.io - it has multirepo projects

Most of the tools we evaluated support multirepo projects (including ZenHub). They just grind to a halt when adding all of our repos, or they have maximum caps.

BigLep commented 1 year ago

@p-shahi and @marten-seemann: can this be closed now in light of finishing the monorepo work per https://github.com/libp2p/go-libp2p/issues/1556 ?

p-shahi commented 1 year ago

Sounds good, thank you for spotting this stale issue.