jstrieb / github-stats

Better GitHub statistics images for your profile, with stats from private repos too
GNU General Public License v3.0
2.97k stars 624 forks

Considering only commits made by myself #101

Open zaaarf opened 1 year ago

zaaarf commented 1 year ago

Would it be possible to consider only commits where I am the commit author? As I understand it, the tool currently computes its statistics over entire repositories, but I'm in a number of orgs that work in languages I'm completely unfamiliar with, so the result isn't a fair representation of my actual skills. Manually ignoring those orgs or repos, on the other hand, would also cut out a significant part of the work I actually did.

I'd implement this myself, but despite what the stats on my profile would have you believe, I'm not that proficient with Python.

(I'm aware what I'm using on my profile is a fork of this, but the problem exists on this one as well)

jstrieb commented 1 year ago

Hey @zaaarf, thanks for checking out the project!

Yours is a reasonable question, and something I've thought a bit about. The short answer is that it seems to be more trouble than it's worth.

It's already fairly annoying to get a list of all commits to a repo associated with a user (complicated by the fact that many users' Git commit emails don't correspond to their GitHub accounts). Extracting meaningful data about the files changed in each commit on top of that is significant additional complexity. It could probably be done by just using the GitHub API to get a list of repos, cloning them individually, checking out the relevant commits associated with a user, and using the GitHub Linguist tool to guess at languages. But to me, that's more trouble than it's worth for data of dubious accuracy.

All that to say that it's definitely possible, just outside the scope of this little pair of scripts.
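For illustration, the clone-and-inspect approach described above might look roughly like the sketch below. It is not part of github-stats. It assumes the repository has already been cloned locally, uses `git log --author` (which matches on the commit's author name or email, so it inherits the email-mismatch problem mentioned above), and substitutes a tiny hypothetical extension table for GitHub Linguist. The function names and the `EXTENSION_LANGUAGES` mapping are made up for this example.

```python
import subprocess
from collections import Counter
from pathlib import PurePosixPath

# Hypothetical stand-in for GitHub Linguist: a real tool would use far richer heuristics.
EXTENSION_LANGUAGES = {".py": "Python", ".js": "JavaScript", ".rs": "Rust", ".c": "C"}


def tally_numstat(numstat_output: str) -> Counter:
    """Sum added lines per language from `git log --numstat` output."""
    totals = Counter()
    for line in numstat_output.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # blank separator lines between commits
        added, _deleted, path = parts
        if added == "-":
            continue  # git reports "-" for binary files
        language = EXTENSION_LANGUAGES.get(PurePosixPath(path).suffix.lower())
        if language is not None:
            totals[language] += int(added)
    return totals


def lines_added_by_author(repo_dir: str, author: str) -> Counter:
    """Run git in an existing local clone and tally one author's added lines."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "log", f"--author={author}",
         "--numstat", "--no-renames", "--format="],
        capture_output=True, text=True, check=True,
    )
    return tally_numstat(result.stdout)
```

Calling `lines_added_by_author("path/to/clone", "name-or-email")` on each cloned repository and summing the counters would give a per-language total of added lines. That number is still only a rough proxy, which is the accuracy concern raised above.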

zaaarf commented 1 year ago

I see. Would it be more feasible, instead, to count only repositories where I have authored at least one commit? That would still count each repository in full, but it would automatically filter out the ones I haven't worked in.
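A minimal sketch of that filter, assuming the GitHub REST endpoint `GET /repos/{owner}/{repo}/commits` with its `author` query parameter, could look like this. It is not code from github-stats. The helper names (`commits_url`, `fetch_json`, `repos_with_own_commits`) are hypothetical, and the check only sees commits reachable from each repository's default branch that GitHub has linked to the given login.

```python
import json
import urllib.error
import urllib.parse
import urllib.request
from typing import Callable, Iterable, List


def commits_url(full_name: str, login: str) -> str:
    """Build a URL asking for at most one commit by `login` in `owner/repo`."""
    query = urllib.parse.urlencode({"author": login, "per_page": 1})
    return f"https://api.github.com/repos/{full_name}/commits?{query}"


def fetch_json(url: str, token: str = ""):
    """Fetch and decode a JSON response; pass a token to reach private repos."""
    request = urllib.request.Request(url, headers={"Accept": "application/vnd.github+json"})
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    try:
        with urllib.request.urlopen(request) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 409:  # assumption: GitHub answers 409 for an empty repository
            return []
        raise


def repos_with_own_commits(
    full_names: Iterable[str], login: str, fetch: Callable = fetch_json
) -> List[str]:
    """Keep only the repositories where `login` has authored at least one commit."""
    return [name for name in full_names if fetch(commits_url(name, login))]
```

The `fetch` parameter is there so the filter can be tried against canned responses without touching the network. In real use this costs one extra API request per repository, which counts against the rate limit.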