Rust-for-Linux / linux

Adding support for the Rust language to the Linux kernel.
https://rust-for-linux.com

Automate Kernel list email from GitHub CI #1107

Open nyurik opened 2 weeks ago

nyurik commented 2 weeks ago

I was just browsing #1106 and realized once again that the kernel mailing list is by far the hardest part of joining this amazing project. Being a dev, this brings up an obvious question: can this be automated?

I would like to propose some (yet to be decided) automated method of emailing the kernel list using some GitHub Actions magic set up specifically for this repo. This would keep kernel devs happy while also attracting new Rust talent to this project, without requiring what seems like an insurmountable (paperwork) barrier to entry.

Ideas welcome.

workingjubilee commented 2 weeks ago

Note that GitGitGadget implements effectively-this for the Git project, so there is a model to follow if someone wanted to implement this (and assuming devs were receptive).

tgross35 commented 2 weeks ago

GitGitGadget does look really interesting. Figured it doesn't hurt to ask so I mentioned the idea of enabling it for regular kernel development at https://github.com/gitgitgadget/gitgitgadget/issues/1695.

I think the philosophical reason why the kernel doesn't use better tools is because GH is more "push new commits when you make changes, squash at the end" and LKML is definitely "always keep your commits atomic, resend the whole patchset when you make changes". Which isn't impossible on GitHub, it just does a less nice job keeping track of things across force pushes. I think that tools like Gerrit or Phabricator are better designed for this, but that's probably worse than mail flow.

If you are interested in getting involved in any way, I think it's easiest to just subscribe to the list but automatically filter messages, so you get everything and can reply to it but don't get 20 notifications per day. To subscribe, use https://subspace.kernel.org/vger.kernel.org.html (sending an email to majordomo@vger.kernel.org with the text subscribe rust-for-linux looks like it's deprecated). Then I have a filter like to:(rust-for-linux@vger.kernel.org -your.email@foo.com) that just archives everything unless I'm explicitly mentioned.
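For anyone whose mail client has no such search syntax, the rule behind that filter can be sketched in a few lines of Python. This is only an illustration of the logic: `should_archive` is a made-up helper, the address is the placeholder from the comment above, and treating To and Cc as one recipient list is a simplifying assumption.

```python
LIST_ADDRESS = "rust-for-linux@vger.kernel.org"


def should_archive(recipients, my_address):
    """Return True when a message went to the list but not directly to me."""
    normalized = {addr.strip().lower() for addr in recipients}
    return LIST_ADDRESS in normalized and my_address.lower() not in normalized


# Placeholder address taken from the filter example, not a real account.
me = "your.email@foo.com"

print(should_archive([LIST_ADDRESS], me))      # list-only traffic
print(should_archive([LIST_ADDRESS, me], me))  # addressed to me as well
```

List-only traffic is archived, and anything that also names you directly stays in the inbox.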

Reviews of everything are always welcome. It's easiest to find something on the archives https://lore.kernel.org/rust-for-linux/ then search it in your client and reply from there. Just reply all, enable plain text mode (it's in the triple dots of the compose box if you use gmail), add your responses inline (so type immediately below the relevant bit rather than at the default top of the email), and snip irrelevant sections.
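To make those conventions concrete, here is a rough sketch of what such a reply looks like when built with Python's standard `email` package. The helper name `build_inline_reply`, the addresses, and the `[...]` snip marker are all illustrative assumptions, and a real mail client also handles details this skips, such as carrying over the full References chain.

```python
from email.message import EmailMessage


def build_inline_reply(original, sender, responses):
    """Build a plain-text reply with responses placed inline.

    `responses` maps a line of the original message to the reply text that
    should appear right below it; runs of other lines are snipped.
    """
    body_lines = []
    snipped = False
    for line in original.get_content().splitlines():
        if line in responses:
            body_lines += ["> " + line, "", responses[line], ""]
            snipped = False
        elif not snipped:
            body_lines += ["[...]", ""]  # snip irrelevant sections once
            snipped = True

    reply = EmailMessage()
    reply["From"] = sender
    reply["To"] = str(original["From"])
    reply["Cc"] = str(original["To"])  # "reply all": keep the list copied
    subject = str(original["Subject"])
    reply["Subject"] = subject if subject.lower().startswith("re:") else "Re: " + subject
    # Threading headers; simplified to the direct parent only.
    reply["In-Reply-To"] = str(original["Message-ID"])
    reply["References"] = str(original["Message-ID"])
    reply.set_content("\n".join(body_lines))  # text/plain only, no HTML part
    return reply


# Made-up message standing in for something found on the archives.
patch_mail = EmailMessage()
patch_mail["From"] = "author@example.org"
patch_mail["To"] = "rust-for-linux@vger.kernel.org"
patch_mail["Subject"] = "[PATCH] rust: fix typo"
patch_mail["Message-ID"] = "<123@example.org>"
patch_mail.set_content("First hunk\nSecond hunk\n")

reply = build_inline_reply(patch_mail, "me@example.org", {"Second hunk": "Looks good."})
print(reply)
```

The point is only the shape of the result: plain text, quoted context directly above each response, and everything else snipped.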

^ I know this is still comparatively a lot of overhead and isn't at all what is being asked, but it's also not that bad to just get started with the default flow if you have an expectation for knowing what to do. Sending your own patches is another thing but that is a bridge that we can help you cross whenever it comes up (basically have to use git send-email rather than anything from this century, but it works reasonably well).

PRs are also absolutely welcome here to get some initial review. They just won't get picked up unless they go through the list - but again, we can help here when needed. Also there is Zulip if you have any kind of questions https://rust-for-linux.zulipchat.com/#narrow/stream/293929-Announcements/topic/LWN.20articles.20and.20posts.

nyurik commented 2 weeks ago

@tgross35 thx for the nice write up! I think sending a PR is the main issue, not the subscribing/monitoring bit. Going from a regular "submit PR from my fork" workflow to an obscure git send-email was a surprisingly high entry barrier when I tried to make a trivial change. Thus, if some-magically-how a PR gets automatically converted to an email, it seems like we can get the best of both worlds:

bjorn3 commented 2 weeks ago

> if PR is updated with a change, github action squashes all changes together and sends a single patch (possibly using the PR's description as the comment)

That will not work if your PR contains changes that should be split into multiple PRs.

I believe gitgitgadget allows you to freely push to your PR branch and only sends an email when you explicitly ask the bot. That gives you the ability to push your changes as new commits and only squash right before you are done changing things and want to send another revision.
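The "only send when explicitly asked" behaviour can be sketched as a small decision function. Everything here is hypothetical: the event labels, the function names, and the `/submit` trigger string (which, as far as I know, mirrors the command GitGitGadget uses) are assumptions for illustration, not an existing bot's API.

```python
def wants_submission(comment_body, trigger="/submit"):
    """Return True if any line of a PR comment is exactly the trigger command."""
    return any(line.strip() == trigger for line in comment_body.splitlines())


def plan_action(event, comment_body=""):
    """Decide what a hypothetical bridge bot does for a PR event.

    `event` is an illustrative label: "push" for new commits on the PR
    branch, "comment" for a new PR comment.
    """
    if event == "comment" and wants_submission(comment_body):
        return "send-patch-series"
    return "do-nothing"  # plain pushes never trigger an email


print(plan_action("push"))
print(plan_action("comment", "Ready for another round.\n/submit"))
```

Pushes stay silent, so contributors can iterate with fixup commits and only generate list traffic once the series is squashed and ready.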

nyurik commented 2 weeks ago

thx @bjorn3 - that could also work - as long as the email sending is automated (e.g. with a trigger comment), I feel the new developers could be on-boarded much faster. Moreover, it looks like PRs are never actually merge-closed here anyway, so the workflow could be:

ojeda commented 2 weeks ago

In the past, when we were out-of-tree, we did development in GitHub because it was convenient for what we were doing at the time. However, now we are in-tree and, for better or worse, Linux uses an email/patch-based workflow, so it is best to follow that workflow.

A one- or two-way bridge would especially help contributors that only want/need to send a couple small patches here and there. It is not the first time it has been discussed (as well as using forges in general), both inside Rust for Linux and in the kernel community in general, so it may happen eventually. Nowadays, I recommend using B4 (https://b4.docs.kernel.org), maintained by the kernel.org team, which simplifies some of the technicalities and offers an option for those without SMTP access.

For other contributors, i.e. active kernel developers, they would need to learn the actual workflow to get involved with other subsystems, maintainers, lists, trees, their rules, etc. For Rust in particular, there are some patches that only pertain to the Rust subsystem, but Rust is a kernel-wide effort, and thus in many cases one needs to interact with other subsystems anyway. We have some more details at https://rust-for-linux.com/contributing#the-kernel-development-process and https://rust-for-linux.com/contributing#the-rust-subsystem.

In order to get accustomed to the patch-based workflow, from time to time we add "good first issues" here.

fbq commented 1 week ago

> • once there are no more feedback, either the author or a maintainer adds a magical comment to the PR to auto-push it to the mailing lists

This is something b4 can already do for an individual, so why we need extra infrastructure to do it is a bit questionable.

> • [optional] bot could monitor mailing list and post relevant replies directly to PR as individual comments

How should contributors respond to that feedback from the list? With another GitHub comment? That means another round of syncing, which may complicate the magical system proposed here.

Overall, I'm not 100% sure that the PR + merge workflow is better than the email workflow in every aspect. If the main workflow in the Linux kernel is still email-based, then it makes more sense to spend time on helping newcomers get familiar with that workflow, because if they are looking into long-term contribution, that will be a necessary skill.

workingjubilee commented 1 week ago

I'm gonna be honest, looking at b4 briefly, I don't see the advantage of "this arcane CLI tool" over "another arcane CLI tool", so what's the actual advantage?

fbq commented 1 week ago

> I'm gonna be honest, looking at b4 briefly, I don't see the advantage of "this arcane CLI tool" over "another arcane CLI tool", so what's the actual advantage?

b4 has a web endpoint feature, so you don't need to set up an SMTP client locally.

fbq commented 1 week ago

Maybe you could list some of your pain points with the email workflow.

nyurik commented 1 week ago

I think the disagreement is not about the specific workflow, but about the size of the entry barrier for the gen-Gs - "the GitHub generation", i.e. how many volunteers will be deterred from participating because of unfamiliar workflow.

We tend to evaluate complexity as related to ourselves, but this is not a good metric. I see 895 PRs at torvalds/linux and 784 PRs in this repo. Most PRs are closed without merging. This is an insanely large number of PRs that likely were too small to warrant learning the workflow, but combined would have considerably improved the codebase. Maintainers might have better idea of all these PRs though.

In software, we create compatibility layers to support older systems and APIs, because a compatibility layer is cheaper than rewriting them. With volunteers, it is the same thing - you cannot expect to re-educate an insanely large gen-G population to use an unfamiliar workflow. At best, you will educate a few, while the vast majority will go elsewhere.

So if the conversion rate is low (as I suspect it is), and the maintainer average age keeps growing, I think there is a problem for the sustainability of the project. Granted that this can be counter-balanced with large cash infusions, i.e. people being paid to work on it, but the cost will continue increasing until it goes into a Cobol maintenance mode... :)

fbq commented 1 week ago

> I think the disagreement is not about the specific workflow, but about the size of the entry barrier for the gen-Gs - "the GitHub generation", i.e. how many volunteers will be deterred from participating because of unfamiliar workflow.

> We tend to evaluate complexity as related to ourselves, but this is not a good metric. I see 895 PRs at torvalds/linux and 784 PRs in this repo. Most PRs are closed without merging. This is an insanely large number of PRs that likely were too small to warrant learning the workflow, but combined would have considerably improved the codebase. Maintainers might have better idea of all these PRs though.

So I looked through the 895 PRs in torvalds/linux (by sampling), and it seems most of them are one-commit PRs. Let's say we lost ~1000 commits since 2011. However, there are 14561 commits between Linux v6.9 and v6.10, and that's roughly just two months. So although I would feel bad if we lost a single talent, in terms of commit numbers the impact of these PRs seems unobservable based on this metric.
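A back-of-the-envelope version of that comparison, using only the numbers quoted above. The 60-day cycle length is an assumption standing in for "roughly two months", and the ~1000 figure is the rough upper bound from the PR count, not a measured value.

```python
lost_commits = 1000     # rough upper bound, from the ~895 mostly one-commit PRs
cycle_commits = 14561   # commits between v6.9 and v6.10
cycle_days = 60         # assumption: "roughly two months"

share_of_one_cycle = lost_commits / cycle_commits
commits_per_day = cycle_commits / cycle_days
days_of_work = lost_commits / commits_per_day

print(f"{share_of_one_cycle:.1%} of a single release cycle")
print(f"about {days_of_work:.1f} days of mainline commit volume")
```

Under these assumptions, everything potentially lost since 2011 adds up to a few percent of one release cycle, or a few days of mainline activity.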

> In software, we create compatibility layers to support older systems and APIs - because compatibility layer is cheaper than to re-writing it. With volunteers, it is the same thing - you cannot expect to re-educate insanely large gen-G population to use unfamiliar workflow. At best, you will educate a few, while the vast majority will go elsewhere.

Not sure I can agree on this.

I'm not here to say that "the old way is always better", it is really:

> So if the conversion rate is low (as I suspect it is), and the maintainer average age keeps growing, I think there is a problem for the sustainability of the project. Granted that this can be counter-balanced with large cash infusions, i.e. people being paid to work on it, but the cost will continue increasing until it goes into a Cobol maintenance mode... :)

ojeda commented 1 week ago

> Most PRs are closed without merging.

This is false, at least for Rust for Linux. You are probably looking at the first few pages. Those PRs were closed because they were applied as patches, via the usual patch workflow, not via GitHub. For the older pages, when we used GitHub, most PRs are in fact merged.

In summary, what one can see in GitHub PRs has little to do with the actual development that is going on nowadays in mainline. Please see https://rust-for-linux.com/contributing.

vincenzopalazzo commented 1 week ago

> This is false, at least for Rust for Linux. You are probably looking at the first few pages. Those PRs were closed because they were applied as patches, via the usual patch workflow, not via GitHub. For the older pages, when we used GitHub, most PRs were merged.

Confirming this: I personally helped upstream, via the ML, some PRs that are open on GitHub, and @ojeda also ensured that all Linux guidelines were followed.