nodejs / github-bot

@nodejs-github-bot's heart and soul
MIT License
262 stars 112 forks source link

Modernizing the project #264

Open mmarchini opened 4 years ago

mmarchini commented 4 years ago

First of, I love our GitHub bot. It's an essential piece of our infrastructure, and it can fill the gaps where GitHub Actions don't work well (the whole read-only, no secrets situation with forked PRs). With that being said, I have been trying to contribute to the project and the entry barrier seems somewhat high:

1) Getting the development environment working on my machine took between one and two hours 2) There's no way (afaik) to properly test Jenkins hooks 3) Some of the dependencies we use are deprecate (github, request, there might be others too) 1) github is not only deprecated, it lacks documentation. To find out which method to use, developers need to look into the TypeScript definition and then figure out which GitHub API is being called to read the documentation on https://developer.github.com/v3/ 4) Some parts of the code have hardcoded nodejs as org and node as repository, which makes it impossible for users to test without finding these conditionals and changing them/commenting those out

The project is also heavily callback-oriented, some parts would probably benefit from changing to an async/await-oriented implementation (I know this is more controversial, so I won't push too much on it).

Based on the points above, I have some suggestions to modernize the project:

1) To improve the development experience, we could set up our own relay, which would essentially be a multiplexer receiving webhook calls from nodejs/node-auto-test. Anyone who wants to collaborate would connect to that relay, and they would receive the appropriate tokens needed for testing. We could add checks so that only certain teams are allowed to request access. We can also have a separate user with permissions limited to nodejs/node-auto-test, to prevent folks from mistakenly affecting other repositories. The same relay could be used for Jenkins hooks. 2) To avoid hardcoded repositories and orgs, we could move some of the logic to the repositories. For example, labeling is only enabled on nodejs/node, if we moved the labels definition to nodejs/node (so that the bot loads the definition when needed), it would be easier for collaborators to include/remove definitions for new files, and it would also allow the bot to detect if that repository supports labeling or not. 3) Replacing deprecated dependencies and potentially moving more towards async/await API might be harder. We could either start a new branch from scratch, or try to modernize different pieces individually. Either way it will need some coordination efforts as well as a lot of work to get it working properly.

As an alternative, we could turn the github-bot into an Actions relay: it would receive events from GitHub and Jenkins, and would forward those events to the repository dispatch API. This way, we could define everything as Actions on the respective repositories, circumventing the gaps with Actions on forked PRs and being able to define Actions for Jenkins events.

What do folks think? cc @nodejs/github-bot

mmarchini commented 4 years ago

FYI I'm trying the alternative approach on a personal repository with a local Jenkins instance. Proof of concept will be comment on pr -> triggers Jenkins CI -> publish status, with all the logic implemented in Actions. I'll report back if it works and how complex the workflow and implementations are.

phillipj commented 4 years ago

Hooray! Thanks for getting involved, sharing your initial pain and suggestions 👍

No doubt the github-bot age has started to become obvious, and lots has changed since its inception, e.g. GitHub Actions like you mention. And the pain of setting it up and even grasping how it works, hasn't been great for new comers, hence the low activity in the project.

To be honest, I've questioned the actual value of the github-bot as is, when we now got GitHub Actions at our disposal. Not having an express server running on a server somewhere, would it in self be a success in my eyes. But I've mostly used Actions for trivial things like running tests and whatnot, so its shortcomings that you're referring, isn't something I'm familiar with yet.

Worth mentioning that I'm not opinionated about keeping the github-bot as is, refactor or even replaced. As long as we find an approach that serves the collaborator community what it needs, I'm all in.

In short I'm +1 to all your three concrete suggestions.

Regarding your point about deprecated dependencies and callback vs async/await, https://github.com/nodejs/github-bot/pull/258 is relevant. Those changes are driven by the need to update one of the deprecated packages you mention, and it introduces Promises | async/await throughout a lot of the code base which you third suggestion is all about.

Your suggestion of transitioning into an Action relay sounds really interesting and isn't something that has struck my mind! I like the idea of moving stuff into the projects themselfs, instead of centralising it into this github-bot project. From a glance it sounds like something that could ease the pain of understanding how things work, tho maybe a bit harder to test properly? Having automated tests verifying things still works changes are pushed, has been one of the pros by having the project as is IMO.

What's your gut feeling and favourite amongst the suggestions you're proposing?

mmarchini commented 4 years ago

My gut feeling is that if we can make everything work with actions, and we don't start hitting Actions limits all the time, it should be easier to maintain and it would be easier for folks to contribute.

Testing is a good point, although some of the features (labeling for example) are not crucial enough IMO for tests to outweigh the lower contributors poll. Actions are testable too, so we could test the ones that are more critical to our workflow.

phillipj commented 4 years ago

Testing is a good point, although some of the features (labeling for example) are not crucial enough IMO for tests to outweigh the lower contributors poll.

Fair point. Also sure we'd find a creative way to perform some kind of automated tests if we really need to, meaning things blow up more often than we'd like too..

As said in #258, those changes are planned to land ASAP unless anyone objects.

But say we want to go in the Actions relay direction long term, and you don't find any obvious showstoppers in the proof of concept you're working on, will you let us know when others can join the fun and contribute? Would be cool to hear your thoughts on the way forward; new endpoint in the existing github-bot project (followed by deleting lots of code) or a new project altogether etc?

mmarchini commented 4 years ago

As soon as I get the proof of concept working I'll share it on nodejs/build. For now I'm rewriting it from scratch because the relay idea doesn't require much on our side (a super simple, with no authentication relay of Jenkins to GitHub is less than 50 LOC, and token-based authentication with a third-party module is probably only a couple lines only). The flow I have in mind on our server side is accept requests -> validate authentication token -> send a repository dispatch call to GitHub with appropriate payload (usually just redirect the payload it received, this will trigger a repository_dispatch action on the target repository with the payload we sent, and from there we can perform any tasks we want.

phillipj commented 4 years ago

Cool, looking forward to seeing it in practise!

phillipj commented 3 years ago

In our quest of delegating more to the actual repositories w/GitHub Actions and experimenting with making the bot an Actions relay, it has also dawned on me that our current infrastructure situation feels overkill and clunky.

Although running it on the Node.js org infrastructure works, there's definitively a bus factor involved and it's not trivial to fully grasp. Maintenance is also time consuming and rarely prioritised.

Except for the attempt-backport feature, that I've suggested be removed in https://github.com/nodejs/github-bot/pull/288, there's little reason for the github-bot to have a dedicated host and in practise be a full fledged webserver.

From a distance it feels like something more nimble like Netlify functions and the like, would have made quite a difference and significantly reduced the bus factor. As always, the grass always feels greener on the other side, but I'm having trouble seeing the obvious downsides to something like that compared to what we've got at the moment.

mmarchini commented 3 years ago

As long as the bot doesn't have access to security releases, I think that's fine? It's something that should be brought up to nodejs/build, if we move it should be to a Node.js Netlify (or w/e service we decide to use) account.

phillipj commented 3 years ago

Myeah, sounds like a good idea 👍