all-contributors / cli

Tool to help automate adding contributor acknowledgements according to the all-contributors specification ✨
https://allcontributors.org

Fetch or auto-discover contributors (auto-generate) #117

Open mrchief opened 5 years ago

mrchief commented 5 years ago

Problem description:

Currently, add lets you manually add contributors. However, for existing projects, it may be tedious and error-prone to follow this manual approach, thereby creating a high barrier to entry for anyone who wants to start implementing the spec.

Suggested solution:

It'd be great to have a way of auto-discovering this information. E.g.

Github's contributors API can help discover the first item; while issues API can take care of the remaining two.

If there is interest in adding this, I can help send a PR.

kentcdodds commented 5 years ago

Unless I'm mistaken, you can use check for this

mrchief commented 5 years ago

Can check discover the ones not listed in the rc file?

Berkmann18 commented 5 years ago

@mrchief check will compare what you have in your .all-contributorsrc file with the list of contributors on the repo and then it will tell you who's missing.
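As a rough illustration of that comparison (not the CLI's actual implementation), the check boils down to a set difference between the logins already in .all-contributorsrc and the logins the repo reports. The function name and the sample logins below are made up for the example:

```javascript
// Hypothetical helper: not the CLI's actual implementation.
function findMissingContributors(rcLogins, repoLogins) {
  // GitHub logins are case-insensitive, so compare in lower case.
  const known = new Set(rcLogins.map((login) => login.toLowerCase()))
  return repoLogins.filter((login) => !known.has(login.toLowerCase()))
}

const missing = findMissingContributors(
  ['mrchief', 'Berkmann18'],
  ['mrchief', 'berkmann18', 'jakebolam'],
)
console.log(missing) // [ 'jakebolam' ]
```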

As far as I'm aware there's no way to add multiple people in one go (@jakebolam ?) but it's an interesting feature to have.

mrchief commented 5 years ago

@Berkmann18 Right, that was my understanding too. Adding multiple people (either manually or via discovery) lowers the bar of entry for existing packages. In a fairly active repo, adding everyone one by one manually can be daunting (first time setup).

Berkmann18 commented 5 years ago

@mrchief It's perfectly understandable, would you like to submit a PR for this?

mrchief commented 5 years ago

Sure I'll see what I can do. Progress may be a bit slow as I'm swamped at work right now.

Berkmann18 commented 5 years ago

@mrchief No problem.

mrchief commented 5 years ago

@Berkmann18 Any preference on how to structure this? I was thinking either:

And then surface them like other commands. Sound about right?

mrchief commented 5 years ago

Also, I see that you already have getContributors which is used in check flow. I think this is what @kentcdodds was alluding to?

If so, I can use it in the new discover command to get the list of code committers. And then all that is left is to fetch issue list, filter out PR authors and you'll have your bug contributors.
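A minimal sketch of that filtering step, assuming `issues` is an array shaped like the items GitHub's REST "list repository issues" endpoint returns (that endpoint also returns pull requests, which carry a `pull_request` key). The function name and sample data are hypothetical:

```javascript
// Sketch only: `issues` is assumed to be an array shaped like the items
// returned by GitHub's REST "list repository issues" endpoint.
function getBugReporters(issues) {
  const logins = issues
    // That endpoint also returns pull requests; they carry a `pull_request` key.
    .filter((issue) => !issue.pull_request)
    .map((issue) => issue.user.login)
  // Keep each login once, in first-seen order.
  return [...new Set(logins)]
}

const sample = [
  {number: 1, user: {login: 'alice'}},
  {number: 2, user: {login: 'bob'}, pull_request: {}},
  {number: 3, user: {login: 'alice'}},
]
console.log(getBugReporters(sample)) // [ 'alice' ]
```

Not every issue is a bug report, so in practice this would probably need an extra filter on labels.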

Am I missing anything?

jakebolam commented 5 years ago

Yes I believe so!

You may also start running into rate limiting #121 #53. We may want to consider something like #69 (and then recommending #122) for large projects' first setup.

jakebolam commented 5 years ago

Any ideas on how the flow would work for the bot? https://github.com/all-contributors/all-contributors-bot

mrchief commented 5 years ago

@jakebolam Yeah, I'm well aware of rate limiting as I have run into it in the past. I'm using the same approach as getContributorsPage so I guess that should handle it. In fact, it's almost the same function for the most part (and so there is an opportunity to DRY it up a bit). We can discuss more after I push my code, it'll be easier that way.
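One way the shared paging logic could be DRYed up is a small helper that takes the page-fetching function as a parameter. This is only a sketch: `fetchAllPages` and `fetchPage` are hypothetical names, not existing functions in the CLI, and the helper does not deal with rate limits by itself.

```javascript
// Hypothetical sketch: `fetchPage(page)` stands in for whatever performs the
// real request and resolves to an array of items (empty when exhausted).
async function fetchAllPages(fetchPage, maxPages = 100) {
  const items = []
  for (let page = 1; page <= maxPages; page++) {
    const batch = await fetchPage(page)
    if (batch.length === 0) break // no more results
    items.push(...batch)
  }
  return items
}

// Usage with an in-memory stand-in for the API.
const fakePages = [['a', 'b'], ['c'], []]
fetchAllPages(async (page) => fakePages[page - 1] || []).then((all) => {
  console.log(all) // [ 'a', 'b', 'c' ]
})
```

The `maxPages` cap is an assumption added to keep the number of requests bounded on very large repos.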

As far as the bot goes, I think it can leverage the same discovery features and can be activated with something like please add all new contributors. Frankly, I wasn't aware of the bot until today so I'm not all too familiar with it.

jakebolam commented 5 years ago

Sounds good to me!

Yes that's a great idea for the bot 👍

Definitely challenging to auto-setup for all contributors. But this would be a great base where projects can start from and branch out into.

Berkmann18 commented 5 years ago

@mrchief

or add a new folder, say, discover and then add the new methods in github.js (and later gitlab.js)

I think that would be better but I'm open to the other option.

@jakebolam We could always try this out on the CLI for packages with existing contributors and when it works fine, the bot could get this functionality so we could add this in a regression style (considering the bot might have more factors in play).

Related all-contributors#18.

mrchief commented 5 years ago

or add a new folder, say, discover and then add the new methods in github.js (and later gitlab.js)

@Berkmann18, that was my first approach too. But in that case, I see a lot of code duplication between different files (basically request setup and paging logic will be duped).

I settled on repo/github.js as it felt like that should be the single place to do anything with the repo. In future, it could be refactored into its own folder with different files (for various API endpoints) and all. Just thinking out loud here.

Berkmann18 commented 5 years ago

@Berkmann18, that was my first approach too. But in that case, I see a lot of code duplication between different files (basically request setup and paging logic will be duped).

That can be dealt with by refactoring and a well-structured codebase, so it shouldn't be an issue.

mrchief commented 5 years ago

If only github had a way to keep track of issues I'm contributing to... This got dropped from my radar:)

I've been swamped lately and haven't made any progress on this beyond the first few iterations. I'll try to get back to this as soon as possible (hopefully that'll be weeks and not months).

Berkmann18 commented 5 years ago

@mrchief You can still get access to the ones you read you know? Anyway, nice to see you back at it.

mrchief commented 5 years ago

Yes, it's a tedious process checking them and making sure I didn't miss any commitments. I didn't have this problem (of keeping track) until now so didn't put much thought into it. I've started a todo list now. :)

mrchief commented 5 years ago

@Berkmann18 @jakebolam I got something working. I ran a crude test against all-contributors/all-contributors-bot repo (which has these nice labels):

image

and I got this:

image

I'm gonna tidy things up a bit and send a PR soon.

protoEvangelion commented 4 years ago

@mrchief can you commit the code you used to achieve this so we can see 😻

protoEvangelion commented 4 years ago

For those who would like a work around, I hacked a workflow together:

To add everyone who committed code in one swoop:

npx name-your-contributors --wipe-cache --full -u user -r repo > combined-out.json

Grab that list and save it to a file file.txt with format:

username1 code
username2 code

Then run: cat file.txt | xargs -I % sh -c 'all-contributors add %;'

To get users who opened bug reports, I used Octokit:

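    // `GitHub` is an Octokit instance and `payload` holds the search
    // parameters; both are defined elsewhere.
    const data = await GitHub.paginate(
        GitHub.search.issuesAndPullRequests.endpoint(payload)
    )

    // Collect the unique logins of the users who opened the matching issues.
    const users = [...new Set(data.map(({ user }) => user.login))]

    console.log('users', users)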

and added them to a text file like the one below, then ran it like above

username1 bug
username2 bug

Add multiple users and contribution types at once

Add them to the text file with comma separated contribution types and run like above

username1 bug,code,security
username2 bug,doc

Berkmann18 commented 4 years ago

@protoEvangelion That looks nice, the problem with this is that there's no auto-fetching and is limited to bug/code contributions.

protoEvangelion commented 4 years ago

True! It's definitely not ideal ;) Works decent if you label issues well though.

mrchief commented 4 years ago

@protoEvangelion This hasn't escaped my mind. I've been awfully busy lately. I'll try to send it across soon.

ericclemmons commented 4 years ago

I ended up copy/pasting the output of yarn all-contributors check into a file (e.g. add-contributors.sh), split on ,, then made each line read like:

# add-contributors.sh
yarn all-contributors add name1 code
yarn all-contributors add name2 code

Then running:

bash add-contributors.sh

Rate limits apply, so using PRIVATE_TOKEN is necessary:

https://allcontributors.org/docs/en/cli/usage#github-users

mrchief commented 4 years ago

@protoEvangelion Been a while so took me a while to piece things together. So it seems I did create a PR for this https://github.com/all-contributors/all-contributors-cli/pull/184 which got superseded by the work @Berkmann18 was doing in #196.

In case you're interested, my code lives here: https://github.com/mrchief/all-contributors-cli/tree/feat/discover.

@Berkmann18 it seems like it's been a while since this got any movement. Do you want me to pick this up again?

Berkmann18 commented 4 years ago

@mrchief Yes, it's been a while due to a variety of stuff including one issue (#187) which blocked progress and I haven't got around to resolving that (hopefully the break would have been helpful). As I mentioned to @protoEvangelion, any help is welcome as I was pretty much the only one looking after most of the AC repos lately (+ other projects).

I'll go through the code again tomorrow (or on the weekend) and try to get the PR moving.

In the meantime, I'll be more than happy if anyone would look into what we have so far.

mrchief commented 4 years ago

@Berkmann18 Sounds good. I can look it over this weekend. Could you add me to dl branch?

Berkmann18 commented 4 years ago

@jakebolam @kentcdodds Could either of you do that? I don't seem to have access to the settings for this repo.

@mrchief In the meantime, I've added you to a fork so you can work on it as soon as you can.

jdalrymple commented 4 years ago

For those who would like a work around, I hacked a workflow together:

To add everyone who committed code in one swoop:

npx name-your-contributors --wipe-cache --full -u user -r repo > combined-out.json

Grab that list and save it to a file file.txt with format:

username1 code
username2 code

Then run: cat file.txt | xargs -I % sh -c 'all-contributors add %;'

To get users who opened bug reports, I used Octokit:

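    // `GitHub` is an Octokit instance and `payload` holds the search
    // parameters; both are defined elsewhere.
    const data = await GitHub.paginate(
        GitHub.search.issuesAndPullRequests.endpoint(payload)
    )

    // Collect the unique logins of the users who opened the matching issues.
    const users = [...new Set(data.map(({ user }) => user.login))]

    console.log('users', users)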

and added them to a text file like the one below, then ran it like above

username1 bug
username2 bug

Add multiple users and contribution types at once

Add them to the text file with comma separated contribution types and run like above

username1 bug,code,security
username2 bug,doc

Would be nice if we could convert the output from name-your-contributors to all-contributors format with some basic config to map contribution types :'(
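Until such a converter exists, the separate "code" and "bug" lists from the workaround can at least be merged so that each user is added once with all of their contribution types. The sketch below assumes the "username type1,type2" line format described above; `mergeContributionLines` is a made-up name and it does not parse name-your-contributors output itself:

```javascript
// Sketch only: merges lines of the form "username type1,type2" so each user
// appears once with all of their contribution types.
function mergeContributionLines(lines) {
  const byUser = new Map()
  for (const line of lines) {
    const [username, types = ''] = line.trim().split(/\s+/)
    if (!username) continue // skip blank lines
    const set = byUser.get(username) || new Set()
    types.split(',').filter(Boolean).forEach((type) => set.add(type))
    byUser.set(username, set)
  }
  return [...byUser].map(([username, set]) => `${username} ${[...set].join(',')}`)
}

console.log(
  mergeContributionLines(['username1 code', 'username2 code', 'username1 bug']),
)
// [ 'username1 code,bug', 'username2 code' ]
```

Each resulting line can then be fed to the same xargs command as before.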

Berkmann18 commented 4 years ago

@jdalrymple That's essentially automatically done in #196 but I haven't had the time to troubleshoot some issues.

jdalrymple commented 4 years ago

@Berkmann18 Could you add the issues you are still having to the PR? I'd be happy to take a look :smile:

EDIT: is this the problem?

For some reason the data read by fs.readFileSync(configPath, 'utf-8') in ./util/config-file.js sometimes ends up being 0 which screws up the adding process.

Berkmann18 commented 4 years ago

@jdalrymple

EDIT: is this the problem?

For some reason the data read by fs.readFileSync(configPath, 'utf-8') in ./util/config-file.js sometimes ends up being 0 which screws up the adding process.

Yup.
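If the symptom is that the read occasionally returns empty content (an assumption; the root cause isn't established in this thread), one possible stop-gap is to fail loudly before the empty string reaches the JSON parser. `readConfigStrict` is a hypothetical helper, not the code in ./util/config-file.js:

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Hypothetical guard, not the CLI's actual code: fail loudly on an empty read
// instead of passing an empty string on to JSON.parse.
function readConfigStrict(configPath) {
  const raw = fs.readFileSync(configPath, 'utf-8')
  if (raw.trim().length === 0) {
    throw new Error(`Config file is empty: ${configPath}`)
  }
  return JSON.parse(raw)
}

// Usage against a temporary file.
const tmpFile = path.join(os.tmpdir(), `all-contributorsrc-${process.pid}.json`)
fs.writeFileSync(tmpFile, JSON.stringify({contributors: []}))
console.log(readConfigStrict(tmpFile)) // { contributors: [] }
fs.unlinkSync(tmpFile)
```

This only surfaces the problem earlier; it doesn't explain why the file would be empty in the first place.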

jdalrymple commented 4 years ago

I'll see what I can figure out :angel:

JoshuaKGoldberg commented 1 year ago

If this project is still maintained, I'd be happy to donate the code in https://github.com/JoshuaKGoldberg/all-contributors-for-repository for this feature. ❤️

Berkmann18 commented 1 year ago

@JoshuaKGoldberg It still is, although most of the maintainers (at least myself and Greg) have been busy with life commitments (families, work and such). I don't know if you read all of the comments and saw the WIP PR I made (which was improved by fellow coders); it's a shame I've not had the chance to retest the solution, but hopefully I'll get around to doing that and see if it's decent enough to merge (I'd rather make sure the model is reliable enough than ship the feature with assigned categories that aren't always correct). If you want to join forces or have a better solution than what I came up with, then I'll be happy to consider it and get this long-overdue feature released.

JoshuaKGoldberg commented 1 year ago

Heh, such is life in open source... if you all would like some help, I'd be happy to pitch in this summer! This project is an excellent point for the industry and I really appreciate all the work you all have put into it! ❤️

To be honest I mostly skimmed the PR and then forgot about it 😅. I'm also about to go off for part-work, part-vacation travel for 3 weeks so probably can't effectively collaborate on it till mid-June... but if you're still free then, I'd love to pitch in however would be helpful!

Berkmann18 commented 1 year ago

I recently got back. I went through https://github.com/all-contributors/cli/pull/196/files (the PR I mentioned) and the diagram I made outlining how all the contribution categories could be recognised AllContributorsCategoryClassification drawio

Although I'm currently more active due to an injury that has prevented me from doing what sponsored players would do (like competing and practising), I'll try to block time every week for AC repos (or GH projects in general) so I should be more active from now on.

Here are the contribution categories found by the PR (marked "AI" where they're found by the AI model, so not fully accurate yet) and what your repo handles. Note: the package I used in the PR for the non-AI stuff doesn't get labels from issues/PRs, so more of the AI-suggested categories could be fetched more accurately if we get labels.

Category Handled by the PR Handled by all-contributors-for-repository
audio
a11y Not quite
bug Yes (AI) Yes
blog
business
code Yes Yes
content
data Not quite
doc Yes (AI) Yes
design Yes (AI)
example Not quite
eventOrganizing
financial
fundingFinding
ideas Yes (AI)
infra Yes (AI) Yes
maintenance Yes (AI) Yes
mentoring
platform Yes (AI)
plugin Yes (AI)
projectManagement
promotion
question Yes (AI)
research
review Yes Yes
security Yes (AI)
tool Yes (AI) Yes
translation
test Yes (AI) Yes
tutorial
talk
userTesting
video

Another note: all the categories found by the AI model may not be based on enough data fed to the model (cf. https://github.com/all-contributors/ac-learn/issues/37 for more info) or ones that could reliably be assigned based on repo data alone. And the PR I have doesn't fully take advantage of what NYC returns.

Berkmann18 commented 1 year ago

Just an update on the feature. I've got something that works (there's room for improvement, especially with the contributions assigned to users, but I don't want to delay this feature more just because of that) on #196.

@jdalrymple @JoshuaKGoldberg @mrchief Feel free to review the PR and test locally with some repos and let me know if you see any glaring issues. Apologies for the massive delay on this.