expressjs / express

Fast, unopinionated, minimalist web framework for node.
https://expressjs.com
MIT License
64.66k stars, 15.28k forks

feature: add a GitHub action to quell spam PRs #5449

Closed CBID2 closed 5 months ago

CBID2 commented 6 months ago

Problem

I was scrolling through Twitter and someone posted about the spam PR.

Possible Solution

Implement this GitHub action. It'll lessen the workload for maintainers.

wesleytodd commented 6 months ago

@ADTC is any of this on a public link? I don't have an Instagram account but would be interested in reading links if they are available. Screenshots just don't seem to provide enough context to interpret what she is saying in that post.

ADTC commented 6 months ago

@wesleytodd there isn't much on the channel related to this topic. She posted a copy of the tweet, made her comment and started the poll. That's all.

Here's the screenshot with the tweet.

You can't access the channel without an Instagram account. Here's a link though: https://ig.me/j/AbagxilRJ7KwLyaD/

PS: A last-ditch effort would be to create a new repo and lock this repo as read-only. Post a notice signposting the new repo.

wesleytodd commented 6 months ago

Thanks for the help. We reached out via more formal methods and have not heard anything back afaik, so I just wanted to check if this had any added context on the situation.

I am going to take a pass pretty soon on this thread to mark a lot of the comments in here as "off topic", so please don't take offense, but this is all really off topic conversation.

nicholasgriffintn commented 6 months ago

Really, I think you could probably just reject anything with "update: ''" as a PR title. I don't think that would be too disruptive for normal PRs.

SakuraBlossomTree commented 6 months ago

Maybe we can add a PR template for when people want to open a PR.

It would probably help, as these people won't know how to change even basic things in the PR template.

ADTC commented 6 months ago

Most of them can be auto-closed if they meet all of the below criteria:

  1. PR has only one commit.
  2. Commit changes only one file.
  3. Commit has a message subject matching regex Update [^ ]*\.[^ ]+ (match is for all file names generally).
  4. Commit has no message body.

This should auto-close almost all of them. Perhaps 1 or 2 every month will not match, but it's much less work to manually check and close those.

PS: The regex match should be on commit subject, not PR title because the newbie spammers are more likely to change the title.
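The four criteria above can be sketched as a single check. This is only an illustration: the `Commit` type and the `matches_spam_criteria` name are hypothetical, the commit data is assumed to have been fetched already, and requiring the regex to match the whole subject line is an assumption the comment does not state.

```python
import re
from dataclasses import dataclass
from typing import List

# Regex quoted from the criteria above: GitHub's default
# "Update <filename>" subject for a single-file web edit.
DEFAULT_SUBJECT = re.compile(r"Update [^ ]*\.[^ ]+")


@dataclass
class Commit:
    """Hypothetical container for the commit data a bot would fetch."""
    message: str              # full message: subject line plus optional body
    changed_files: List[str]  # paths touched by the commit


def matches_spam_criteria(commits: List[Commit]) -> bool:
    """Return True only if all four criteria hold for the PR's commits."""
    if len(commits) != 1:                       # 1. PR has only one commit
        return False
    commit = commits[0]
    if len(commit.changed_files) != 1:          # 2. commit changes one file
        return False
    subject, _, body = commit.message.partition("\n")
    # 3. subject is the default one (whole-line match is an assumption)
    if not DEFAULT_SUBJECT.fullmatch(subject.strip()):
        return False
    return body.strip() == ""                   # 4. no message body
```

Because every criterion must hold, a PR with a hand-written commit body or a second changed file falls through to manual review rather than being closed.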

For fun: Close them with a cheeky comment: Congrats! You are now an open source spam contributor. Now learn how to make real contributions and your training will be complete. [Add a Hindi translation of the same.]

Kamleshpaul commented 6 months ago

I think we can get the author's (who made the PR) total lifetime PR count, and if it is more than 10 or whatever, we can prevent that?

like this

name: Spam Detect

on:
  pull_request:
    types:
      - opened

jobs:
  check_author_pr_count:
    runs-on: ubuntu-latest

    steps:
    - name: Checkout code
      uses: actions/checkout@v2

    - name: Set up Node.js
      uses: actions/setup-node@v4
      with:
        node-version: '14'

    - name: Install dependencies
      run: npm install github-api

    - name: Count Merged PRs
      id: pr_count
      env:
        # The author login is already in the event payload, so no extra
        # (unauthenticated, rate-limited) API call is needed to look it up.
        AUTHOR: ${{ github.event.pull_request.user.login }}
        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      run: |
        # Count the author's merged PRs across GitHub via the search API.
        MERGED_PR_COUNT=$(curl -sSL -H "Authorization: Bearer $GITHUB_TOKEN" "https://api.github.com/search/issues?q=is:pr+author:$AUTHOR+is:merged" | jq -r '.total_count')

        echo "merged_pr_count=$MERGED_PR_COUNT" >> "$GITHUB_OUTPUT"

    - name: Check PR Count
      run: |
        if [[ "${{ steps.pr_count.outputs.merged_pr_count }}" -gt 20 ]]; then
          echo "PR count is greater than 20. Proceed with the workflow."
        else
          echo "PR count is not greater than 20. Blocking the PR."
          exit 1  
        fi

Here is its test: https://github.com/Kamleshpaul/github-spam-action-test/pulls

SakuraBlossomTree commented 6 months ago

I think we can get the author's (who made the PR) total lifetime PR count, and if it is more than 10 or whatever, we can prevent that?

They already made a github action which takes care of this

ServerDeveloper9447 commented 6 months ago

Well, it looks like the issue will be solved when the pr is merged

wesleytodd commented 6 months ago

Hey folks, just to update you: We have a thread open with GitHub support where they are passing along our requests to the various product teams on how to improve the moderation features. I personally am still 👎 on an action which auto closes PRs as it is both a maintenance burden and something best handled by features on GitHub's side. We are focused right now on landing some governance changes and will be following up with re-constituting the triage team. I think ideally this decision is left up to the triage team members. So likely there will not be action on this or the PR until we can get that team organized again. If you are interested in helping that please follow the instructions to get involved there (as some of you already have, thank you very much for that).

dougwilson commented 6 months ago

They have slowed down too for the time being; the last one is already two days old.

nkroker commented 6 months ago

unless there is an action which does more than that one (like spam detection with ai or something)

Not sure about AI spam detection, but almost all of these PRs update the readme, have default name (Update filename), change only a single line and have no description, so it should be rather easy to close them automatically.

Most authors of these PRs have a repository called "localrepo", so that's another rule that could be used to detect spam from this source (Apna College).
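As a rough sketch, the signals listed above could be collected per PR and left to a maintainer or a threshold to act on. The `PullRequest` type, its field names, and the signal labels are all hypothetical, and the PR data is assumed to have been fetched elsewhere.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from typing import List


@dataclass
class PullRequest:
    """Hypothetical snapshot of the PR fields the rules above look at."""
    title: str
    body: str
    changed_files: List[str]
    lines_changed: int
    author_repos: List[str] = field(default_factory=list)


def spam_signals(pr: PullRequest) -> List[str]:
    """Return the names of the heuristics from the comment that this PR trips."""
    signals = []
    single_file = len(pr.changed_files) == 1
    name = PurePosixPath(pr.changed_files[0]).name if single_file else ""
    if single_file and name.lower().startswith("readme"):
        signals.append("readme-only")       # only the readme is touched
    if single_file and pr.title == f"Update {name}":
        signals.append("default-title")     # default "Update filename" name
    if pr.lines_changed == 1:
        signals.append("single-line")       # a single line is changed
    if not pr.body.strip():
        signals.append("no-description")    # description left empty
    if "localrepo" in pr.author_repos:
        signals.append("localrepo")         # author owns a repo named "localrepo"
    return signals
```

Returning the matched signals instead of a yes/no answer keeps the decision (label, comment, or close) separate from the detection.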

The flood of pull requests (PRs) occurred due to individuals learning about Git and GitHub from a specific YouTube video:

https://youtu.be/Ez8F0nW6S-w?t=4323

In the video, the presenter demonstrated how to contribute to open-source projects using the Express repository as an example. While the video emphasized creating PRs for meaningful changes only, the majority of the audience consisted of beginners unfamiliar with GitHub and community guidelines. Consequently, many blindly followed the steps outlined in the tutorial, resulting in an overwhelming influx of PRs.

wesleytodd commented 6 months ago

Thanks @nkroker, please read above as that topic has been covered quite a few times. I will mark this as off topic but thanks for the well intentioned help.

jonchurch commented 5 months ago

Closing as not planned.

Many words have been written now in this issue and others about auto-closing spam PRs, and we have come to the same conclusion every time. It is trivial to implement, but is a sledgehammer approach for what is now a mosquito-level annoyance. Closing and locking a handful of spam issues/PRs now and then is the easiest thing a maintainer will be doing that day, so automating it doesn't save much labor.

There is very little benefit in having an auto-close action. A spam PR will still spam people's GitHub notifications whether it's closed and locked at the speed of a human or a machine. The only thing we'd prevent there is people commenting disparagingly on the PR before it gets locked by a human, generating more notifications. We haven't had any of that in the past couple weeks (we did get some hostile comments when this topic was trending originally, which it no longer is).

The PRs are still coming, but intermittently. The storm has passed; this can be reevaluated if another flood happens in the future from the previous source or any other.

Thank you all for your eagerness to help. If you want an ear to the ground on things that the project is prioritizing and possibly looking for help with, peruse the Discussions repo https://github.com/expressjs/discussions/issues

not7cd commented 4 months ago

This can be reconsidered with the use of label-action. By labelling those PRs this action can create a meaningful message to the spammer to inform them why it's getting closed.

See an example from Renovate:

  1. configuration
  2. in action

wesleytodd commented 4 months ago

Hey @not7cd,

I love the idea of "create a meaningful message to the spammer to inform them why it's getting closed". Would you be willing to sign up to own this and help participate in the project to maintain the tooling? The main thing here is that closing a few spam PRs is a simple but short term thing. Working on setting up automation means time taken away from much more important project priorities we have right now. So it is mostly a prioritization thing.

So far the folks who showed up to make noise about ways to "solve" this problem have not shown up to reply with this sort of context in the spam PRs. If anyone in the threads wants to start posting these prewritten messages to the authors then we (the maintainers) can stop with our current approach of closing them as spam. This would save us a lot of time and help those users learn, but it is work and we are volunteers.

For everyone in this thread: please, we need your help:

  1. Start helping those users by posting nice responses telling them why this is bad
  2. Start participating in the project, which could mean following activity in the repos and replying to issues (that is the road to being a triager) or opening PRs with helpful things