probot / ideas

Share ideas for new GitHub Apps built with Probot

Background Check Project for GSOC #43

Open · itaditya opened 6 years ago

itaditya commented 6 years ago

Idea: each time someone comments in a GitHub repository, the GitHub App checks whether the user is new to the project. If they are, the bot loads their most recent ~100 comments and runs a sentiment analysis on them. If any of the comments stand out as aggressive, it creates an issue in a private repository (or a discussion for a configurable team) pointing the maintainers to the new user's comment and to their comments on other projects that might show they have been hostile before.

I would like to work on this idea as part of GSOC 2018
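
To make the idea concrete, here is a minimal Probot sketch of the flow described above. The `analyzeSentiment` stub, the 0.8 threshold, and the `community-audit` private repo name are placeholders, not part of any real app:

```ts
// Minimal sketch of the proposed flow, assuming Probot v11+.
import { Probot } from "probot";

// Placeholder: plug in a real classifier here (e.g. a hosted sentiment API).
// Returns a score in [0, 1]; higher means more likely toxic.
async function analyzeSentiment(text: string): Promise<number> {
  return 0;
}

export default (app: Probot) => {
  app.on("issue_comment.created", async (context) => {
    const commenter = context.payload.comment.user.login;

    // Rough "is this user new here?" heuristic: no prior association
    // (owner/member/contributor) with the repository.
    if (context.payload.comment.author_association !== "NONE") return;

    // The public events API only covers recent activity, so "~100 recent
    // comments" is a cap, not a guarantee.
    const events = await context.octokit.activity.listPublicEventsForUser({
      username: commenter,
      per_page: 100,
    });

    const comments = events.data
      .filter((event) => event.type === "IssueCommentEvent")
      .map((event) => (event.payload as any).comment.body as string);

    const flagged: string[] = [];
    for (const body of comments) {
      if ((await analyzeSentiment(body)) > 0.8) flagged.push(body);
    }

    if (flagged.length > 0) {
      // Open an issue in a private repo for maintainers to review.
      await context.octokit.issues.create({
        owner: context.payload.repository.owner.login,
        repo: "community-audit", // hypothetical private repo
        title: `Background check: @${commenter}`,
        body: flagged.map((c) => `> ${c}`).join("\n\n"),
      });
    }
  });
};
```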

gr2m commented 6 years ago

Thanks, I meant to make an issue for that 👍

SamithDB commented 6 years ago

Interesting idea... I would also like to work on this. @itaditya @gr2m

copperwiring commented 6 years ago

Hi @gr2m and @itaditya. I would like to claim this issue as my project direction for RGSoC 2018. I have familiarity with GitHub webhooks and have always wanted to build on top of it.

gr2m commented 6 years ago

@SamithDB we reserve the issues with the Summer of Code label for participants in Google Summer of Code & Rails Girls Summer of Code. Everyone can use it for their application. Of course you can work on it, just don’t share any code if possible until April when the application deadline has passed.

@copperwiring good luck :) If your team gets accepted we’d be happy to work on it with you during the summer.

iamgrawal commented 6 years ago

@gr2m Could you tell me which sentiment analysis libraries are recommended for this project idea?
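
One option would be Google's Perspective API, which returns a probability that a comment will be perceived as toxic. A minimal sketch, assuming Node 18+ (for the global `fetch`) and a `PERSPECTIVE_API_KEY` environment variable:

```ts
// Score a comment's toxicity with the Perspective API.
const PERSPECTIVE_URL =
  "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze";

export async function toxicityScore(text: string): Promise<number> {
  const res = await fetch(
    `${PERSPECTIVE_URL}?key=${process.env.PERSPECTIVE_API_KEY}`,
    {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        comment: { text },
        requestedAttributes: { TOXICITY: {} },
      }),
    }
  );
  const data = await res.json();
  // Score is in [0, 1]; higher means more likely to be perceived as toxic.
  return data.attributeScores.TOXICITY.summaryScore.value;
}
```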

itaditya commented 6 years ago

@gr2m I was thinking we could maintain a list of users who have commented on the org's repos along with their sentiment scores. Now, when a new user comments on any issue, we check if he has shown any aggressive behaviour in the past. If he has, we open an issue in a private repo for the maintainers to discuss how to interact with that user, and also store his aggressive comments in the db for future reference.

Is this fine, or am I missing something?

itaditya commented 6 years ago

Also, if the aggressive user comments again, the bot will comment in the existing private repo's issue thread with what the user has commented now.
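
A sketch of that follow-up behaviour. The in-memory `Map` and the `community-audit` repo name are illustrative; a real app would need persistent storage:

```ts
// Remember which private issue tracks a flagged user, and append to that
// issue's thread whenever they comment again.
import { Probot } from "probot";

const trackedUsers = new Map<string, number>(); // login -> private issue number

export default (app: Probot) => {
  app.on("issue_comment.created", async (context) => {
    const login = context.payload.comment.user.login;
    const issueNumber = trackedUsers.get(login);
    if (issueNumber === undefined) return; // user was never flagged

    // Append the new comment to the existing maintainer-only thread.
    await context.octokit.issues.createComment({
      owner: context.payload.repository.owner.login,
      repo: "community-audit", // hypothetical private repo
      issue_number: issueNumber,
      body: `@${login} commented again:\n\n> ${context.payload.comment.body}`,
    });
  });
};
```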

j-f1 commented 6 years ago

@itaditya I think the idea for the bot was to look through the user’s comments on other repos to see how they’ve behaved in the past.

itaditya commented 6 years ago

@j-f1 yes, you are right. I'm looking at the past comments to see if he is aggressive, but I wanted to discuss what to do after that.

See:

> we check if he has shown any aggressive behaviour in the past

j-f1 commented 6 years ago

Sorry about that. I misread your comment. Your ideas sound good 👍

itaditya commented 6 years ago

Thanks @j-f1

itaditya commented 6 years ago

@gr2m I was thinking we should focus on providing the service to orgs only. An individual repo owner won't have enough time to look into how to reply to each user.

gr2m commented 6 years ago

To clarify, I thought that GitHub Apps could not create repositories on user accounts. But if that works, cool; I'd just require the app to be installed on the entire account/org, not just on selected repositories.
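
For what it's worth, installation webhooks carry a `repository_selection` field, so an app could at least detect partial installs and ask for org-wide access. A minimal sketch:

```ts
// Warn when the app is installed on selected repositories only, based on the
// repository_selection field ("all" or "selected") in installation webhooks.
import { Probot } from "probot";

export default (app: Probot) => {
  app.on("installation.created", async (context) => {
    const installation = context.payload.installation;
    if (installation.repository_selection !== "all") {
      app.log.warn(
        `Installation ${installation.id} only covers selected repositories; ` +
          "this app expects access to the entire account/org."
      );
    }
  });
};
```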

dessant commented 6 years ago

This may be interesting, but creating an aggregate score of recent comments may be better than "if any of the comments stand out as aggressive then create an issue in a private repository".

Though part of me disagrees with the entire premise of warning maintainers about past actions despite the lack of current negative behavior. I'd consider judging their current activity at face value to be healthier than encouraging the creation of negative bias towards new users.

> we could maintain a list of users who have commented on the org's repos along with their sentiment scores. Now, when a new user comments on any issue, we check if he has shown any aggressive behaviour in the past.

> store his aggressive comments in the db for future reference.

Case building is again not something we should encourage and normalize, nor is maintaining user scores. Checking the score of recent comments when a toxic one is made could be useful as part of a maintainer notification.

There is an excellent bot for toxic comments; perhaps it could be extended to notify maintainers when such a comment is made, and to support a temporary automated block (until a human reviews the activity) of users who have made a new toxic comment in the same discussion thread after they were already warned once. https://probot.github.io/apps/sentiment-bot/ https://github.com/behaviorbot/sentiment-bot/issues/6
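
A hedged sketch of that warn-then-block flow, assuming the app has been granted the organization blocking permission. The 0.8 threshold and the imported `toxicityScore` helper (from the Perspective sketch earlier in this thread) are illustrative:

```ts
// Warn on the first toxic comment in a thread; block on the second,
// pending human review.
import { Probot } from "probot";
import { toxicityScore } from "./toxicity"; // hypothetical local module

const warnedInThread = new Set<string>(); // "login#issueNumber" keys

export default (app: Probot) => {
  app.on("issue_comment.created", async (context) => {
    const { comment, issue, repository } = context.payload;
    if ((await toxicityScore(comment.body)) < 0.8) return;

    const key = `${comment.user.login}#${issue.number}`;
    if (warnedInThread.has(key)) {
      // Second toxic comment in the same thread after a warning: block
      // until a human reviews (and later calls orgs.unblockUser).
      await context.octokit.orgs.blockUser({
        org: repository.owner.login,
        username: comment.user.login,
      });
    } else {
      warnedInThread.add(key);
      await context.octokit.issues.createComment({
        owner: repository.owner.login,
        repo: repository.name,
        issue_number: issue.number,
        body: "Please keep the discussion respectful; see our code of conduct.",
      });
    }
  });
};
```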

abhijeetps commented 5 years ago

Hi there! This idea has been implemented by @itaditya here: https://github.com/probot/background-check

You can install the application here: https://github.com/apps/background-check

itaditya commented 5 years ago

Hi @dessant, as mentioned by @abhijeetps, I have built a bot for background checks. Right now it starts the whole process whenever the repo has a new commenter.

I think you are suggesting that we should instead start the background check only once a person makes a toxic comment. I have mixed feelings about this. If we go ahead with what you suggest, then:

  1. We won't judge a person before they do something wrong in our community. Pros: a person might have done something wrong in the past, but they may have changed now. Cons: we will be giving a person who has written toxic comments in the past a chance to do the same in our community.

For these reasons the GitHub App doesn't take any decisions itself. Its job is to better inform the maintainers of the community; it's up to them to examine the matter. This way we keep a human in the loop. Maintainers can discuss among themselves, and they can also contact the person. There is also the case where the app wrongly flags a comment as toxic, so maintainer review is essential.

  2. The app will start the process only when a person makes a toxic comment. Pros: the app can bail early if the new commenter's comment is not toxic (see the sketch after this comment).

I think the app, as of now, doesn't promote case building. It is there to provide more info to maintainers.
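
A sketch of option 2's ordering, where the cheap toxicity check gates the expensive history lookup. `toxicityScore` is the Perspective sketch from earlier, and `backgroundCheck` is a hypothetical helper wrapping the lookup described above:

```ts
// Bail early on non-toxic comments; only toxic ones trigger a history fetch.
import { Probot } from "probot";
import { toxicityScore } from "./toxicity";
import { backgroundCheck } from "./background-check"; // hypothetical

export default (app: Probot) => {
  app.on("issue_comment.created", async (context) => {
    // Most comments are fine, so most events never trigger a history fetch.
    if ((await toxicityScore(context.payload.comment.body)) < 0.8) return;
    await backgroundCheck(context, context.payload.comment.user.login);
  });
};
```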

dessant commented 5 years ago

@itaditya, I believe it encourages negative bias by dragging a user's history into any new community they join.

As for the one pro you list for the current state of things, I don't see how that's an advantage. The damage has already been done once you let maintainers preemptively learn about the user's history.

I've talked to about five people over the past few months about this and showed them the app and how it works. Their first reaction was mostly a comparison to social scoring systems.

gr2m commented 5 years ago

As a project maintainer, I see it as my responsibility to protect the community against hostile people. I don't think a person's contributions in one project can be separated from their behaviour in other projects. The app written by @itaditya is a useful tool in a maintainer's belt to help keep a community a safe and welcoming place. It only informs a maintainer about aggressive comments in the past; it's still up to me to act upon it or not.

@dessant I welcome your reservations though. I gave it quite some thought myself and also reached out to other folks with the question of whether this bot could be used to harm people in any way. It wasn't conclusive, so I decided I'll have to give it a try first in some of my projects.

Did you give it a try yourself?

dessant commented 5 years ago

I did not install it for the communities I maintain, because I do not agree with the premise that I should be aware of, or take action on, someone's behavior in other places just because they have interacted with my project. Once we're made aware of an unflattering comment from someone's past, we may view their future behavior through a different lens, no matter how unbiased we try to be.

@itaditya has summarized the idea nicely in "We won't judge a person before they do something wrong in our community".

A background check can be useful when someone is causing trouble, but doing it preemptively seems misguided.

I see no value in this data being delivered to maintainers, because it would be unreasonable to take any kind of action against someone who is not causing trouble in a community. It seems like a useless distraction, and it makes maintainers more trigger-happy.

It's also not a good sign when there is no conclusive answer about whether a GitHub App harms people or not, though maybe during its use it will be easier to identify what makes people wary of it.

gr2m commented 5 years ago

> We won't judge a person before they do something wrong in our community

That has been a big problem in many projects in the past, one that I think we fail to address in our communities. It's the same as saying "well, XYZ has always been nice to us" when someone reports being harassed by XYZ in another environment. There are lots of bad actors in our communities who are still being celebrated on many occasions because of the great work they do. I know a lot of people who left projects because of that, usually people who are not white guys like us.

This app helps to protect people who are more likely to be harassed than you and I. It's not perfect, but it sure is a great step toward making our communities safer and more inclusive.

dessant commented 5 years ago

If someone reports being harassed anywhere, they should not be dismissed with "Well XYZ has always been nice to us". Still, an online community can be kept healthy without maintaining behavioral scores for users.

itaditya commented 5 years ago

Hi @dessant, I really appreciate the input that you have given.

Regarding the app being a social scoring system, there are details of the app that make it different:

  1. Only the latest ~100 comments are analysed, so it is not the case that we are collecting their entire history (see the sketch after this list).

  2. The data which is analysed is public. The commenter willingly provided that data for the entire world to see. It's not a case of analysing private messages or user activity to target anyone.

  3. A human is treated as a human and not as a data point in a giant database. The responsibility for taking action is left to the community maintainers. If they just take the analysis score and, without proper thought, build a case against a person, then to some degree it's their fault, and regardless of how they use the app, the community they maintain will be affected.

  4. The app doesn't display all the public comments of a person; only the comments that are toxic are shown, along with very little metadata. Also, the analysis is kept private and only visible to maintainers. The intention is not to shame someone for their past comments.
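
A sketch of how points 1 and 4 could look in code: cap the lookup at the latest ~100 public comments and surface only the toxic ones with minimal metadata. The names and the 0.8 threshold are illustrative, and `toxicityScore` is the Perspective sketch from earlier:

```ts
// Collect only recent, public, toxic comments, keeping minimal metadata.
import { Octokit } from "@octokit/rest";
import { toxicityScore } from "./toxicity";

interface FlaggedComment {
  url: string; // link back to the public comment
  excerpt: string; // the toxic comment itself, nothing more
}

export async function recentToxicComments(
  octokit: Octokit,
  username: string
): Promise<FlaggedComment[]> {
  // The events API returns recent public activity only; per_page=100 caps
  // the window, so no full history is ever collected.
  const events = await octokit.activity.listPublicEventsForUser({
    username,
    per_page: 100,
  });

  const flagged: FlaggedComment[] = [];
  for (const event of events.data) {
    if (event.type !== "IssueCommentEvent") continue;
    const comment = (event.payload as any).comment;
    if ((await toxicityScore(comment.body)) > 0.8) {
      flagged.push({ url: comment.html_url, excerpt: comment.body });
    }
  }
  return flagged;
}
```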