github / contributors

GitHub Action that, given an organization or repository, produces information about the contributors over the specified time period.
https://github.blog/2023-10-23-how-to-gain-insight-into-your-project-contributors/
MIT License

Contribution count in Org report is not limited to date range #143

Closed: kontu closed this issue 2 weeks ago

kontu commented 2 months ago

### Describe the bug

When generating the report against an Organization using start_date and end_date, the Contribution Count equals the all-time contributions for the repos in the report.

In the generated report, the top contributor has contributed to one repo in the time span. Based on the screenshots, I would expect their Contribution Count to be ~8 in the report, not 476.

### To Reproduce

permissions:
  contents: read

jobs:
  contributor_report:
    name: contributor report
    runs-on: ubuntu-latest
    permissions:
      issues: write

    steps:
      - name: Get dates for last week
        shell: bash
        run: |
          # Calculate the first day of the previous week
          start_date=$(date -d "last-sunday" +%Y-%m-%d)

          # Calculate the last day of the previous week
          end_date=$(date -d "$start_date +1 week -1 day" +%Y-%m-%d)

          # Set an environment variable with the date range
          echo "START_DATE=$start_date" >> "$GITHUB_ENV"
          echo "START_DATE=$start_date"
          echo "END_DATE=$end_date" >> "$GITHUB_ENV"
          echo "END_DATE=$end_date"

      - name: Run contributor action
        uses: github/contributors@v1.4.3
        env:
          GH_APP_ID: ${{ secrets.GH_APP_ID }}
          GH_APP_INSTALLATION_ID: ${{ secrets.GH_APP_INSTALLATION_ID }}
          GH_APP_PRIVATE_KEY: ${{ secrets.GH_APP_PRIVATE_KEY }}
          START_DATE: ${{ env.START_DATE }}
          END_DATE: ${{ env.END_DATE }}
          ORGANIZATION: "XXXX"
          SPONSOR_INFO: "false"
      - name: Create issue
        uses: peter-evans/create-issue-from-file@v5
        with:
          title: Contributor report for ${{ env.START_DATE }} - ${{ env.END_DATE }}
          content-filepath: ./contributors.md
          assignees: xxxxxxxxx
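
For reference, the intent stated in the first step's comments (a start and end date for the previous week) can be sketched in Python. This is only an illustration: it assumes weeks run Sunday through Saturday and that the wanted range is the most recent fully completed week. `previous_week_range` is a hypothetical helper, not part of the action.

```python
from datetime import date, timedelta


def previous_week_range(today: date) -> tuple[date, date]:
    """Return (Sunday, Saturday) of the most recent fully completed week.

    Assumption for this sketch: a week starts on Sunday and ends on Saturday.
    """
    # date.weekday() returns Monday == 0 ... Sunday == 6.
    days_since_sunday = (today.weekday() + 1) % 7
    this_week_sunday = today - timedelta(days=days_since_sunday)
    start = this_week_sunday - timedelta(days=7)
    end = start + timedelta(days=6)
    return start, end


if __name__ == "__main__":
    start, end = previous_week_range(date.today())
    print(f"START_DATE={start.isoformat()}")
    print(f"END_DATE={end.isoformat()}")
```

The two printed values have the same `YYYY-MM-DD` shape as the `START_DATE` and `END_DATE` variables used in the workflow.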


### Expected behavior

- User has contributed in the date range
- Contribution Count reflects contributions performed only during the date range

### Screenshots

Generated report for 1 week: ![image](https://github.com/github/contributors/assets/5705697/a1a1063b-36ca-4016-9435-a8ce3f118feb)
![image](https://github.com/github/contributors/assets/5705697/2a43f7ac-bc4b-4be8-b1de-5eb93b503f7d)

### Additional context

Referencing reports in this repo to narrow down when this changed:
- https://github.com/github/contributors/issues/56 seems to correctly count only contributions from the time period
- https://github.com/github/contributors/issues/61 is the first report I can find in this repo that shows the total contribution count instead of the count for the time period
- This makes me suspect it may have been introduced around release [v1.1.2](https://github.com/github/contributors/releases/tag/v1.1.2); however, as I have installed it as a GitHub App, I cannot downgrade to test at this time

zkoppert commented 3 weeks ago

I believe what you are seeing here is that some things count as contributions that you wouldn't expect, i.e., more than code commits.

Here is the GitHub API endpoint we are using. Notice that the API response example says "contributions: 32". That is the number we are relaying into this action.

Several kinds of activity increase that contribution count:

  • Committing to a repository's default branch or gh-pages branch
  • Creating a branch
  • Opening an issue
  • Opening a discussion
  • Answering a discussion
  • Proposing a pull request
  • Submitting a pull request review

kontu commented 3 weeks ago

I do not believe that explains the difference I am seeing. In my example screenshot showing 476 contributions for the top contributor, it is impossible that these other items make up the difference. That repo is rarely used and has only 2 people who commit to it. There are rarely PRs, and never issues, discussions, or other items. There were certainly not 468 other actions in the one-week timespan defined. The user has made roughly 450 commits to the repo since its inception some years back, though, which is more in line with the value returned.

zkoppert commented 3 weeks ago

Hmm, interesting! Let me take a look in a few other places to see what I can find.

zkoppert commented 3 weeks ago

I don't see anywhere in the code where the contributions are filtered down by the dates given.

contributor = contributor_stats.ContributorStats(
    user.login,
    False,
    user.avatar_url,
    user.contributions_count,
    commit_url,
    "",
)

And user.contributions_count is set directly by the GitHub Python API wrapper, without considering the start and end date variables.
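
To illustrate why no date filtering can happen at this point, here is a minimal sketch. It assumes the data comes from the REST API's "List repository contributors" endpoint, whose entries carry a `login` and a single `contributions` total with no timestamps. The sample payload and the `lifetime_totals` helper are invented for illustration.

```python
import json

# Hypothetical sample shaped like a "List repository contributors" response.
SAMPLE_RESPONSE = """
[
  {"login": "octocat", "avatar_url": "https://example.com/a.png", "contributions": 476},
  {"login": "hubot", "avatar_url": "https://example.com/b.png", "contributions": 12}
]
"""


def lifetime_totals(payload: str) -> dict[str, int]:
    """Map each login to its total; the payload has no dates to filter on."""
    return {entry["login"]: entry["contributions"] for entry in json.loads(payload)}


if __name__ == "__main__":
    print(lifetime_totals(SAMPLE_RESPONSE))
```

Because each entry holds only one number per user, a start or end date has nothing to act on unless the counts are rebuilt from individual commits.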

Our options are either to switch to commit counts or relabel the data to all time contributions.
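
The first option could look roughly like the following. This is a sketch, not the change that was made: it assumes commits have already been fetched as (author login, commit date) pairs, for example from the REST API's "List commits" endpoint, which accepts `since` and `until` parameters. `count_commits_in_range` and the sample data are hypothetical.

```python
from collections import Counter
from datetime import date


def count_commits_in_range(
    commits: list[tuple[str, date]], start: date, end: date
) -> Counter:
    """Count commits per author whose date falls within [start, end]."""
    return Counter(
        author for author, committed_on in commits if start <= committed_on <= end
    )


if __name__ == "__main__":
    sample = [
        ("octocat", date(2024, 6, 3)),
        ("octocat", date(2024, 6, 5)),
        ("octocat", date(2023, 1, 10)),  # outside the range, not counted
        ("hubot", date(2024, 6, 4)),
    ]
    counts = count_commits_in_range(sample, date(2024, 6, 2), date(2024, 6, 8))
    for author, n in counts.most_common():
        print(f"{author}: {n}")
```

Counting this way reports commits only, so the result would be narrower than a general "contributions" figure.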

kontu commented 2 weeks ago

Looks awesome, can't wait to see it in a release!

zkoppert commented 2 weeks ago

Just released!!

kontu commented 2 weeks ago

image

Ran using @1.5.0; it doesn't seem to have quite fixed the issue. I should have under 20 contributions, and 528 matches the result from the action at contributors@1.4.3.

zkoppert commented 2 weeks ago

Yeah, we changed the column headers to reflect that these are all-time commits. We couldn't fix the commit numbers due to limitations in the GitHub API.