PHPCSStandards / PHPCSExtra

A collection of code standards for use with PHP_CodeSniffer
GNU Lesser General Public License v3.0

NonInclusiveLanguage sniff #59

Open jrfnl opened 4 years ago

jrfnl commented 4 years ago

A sniff to examine code and comments for the use of non-inclusive language and throw a warning when found.

Specifically, the sniff should look for sexist, racist, ableist or ethnocentric language, which can contribute to a hostile work environment.

Initial word list

| Search for | Alternatives to suggest | Notes |
| --- | --- | --- |
| whitelist, blacklist | allowlist/safelist/acceptlist, denylist/blocklist/rejectlist | |
| master, slave | primary/main, secondary/replica | |
| he, she, him, her, his, himself, herself | they, them, their, themself | may need to limit this search to comments |
| crazy | peculiar, baffling | |
| dummy | placeholder | |
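
As a rough illustration of how the initial word list above could be represented and matched, here is a minimal Python sketch. It is not PHP_CodeSniffer code and not the proposed implementation; the names `WORD_LIST` and `find_flagged_terms` are hypothetical. The pronoun row is left out, since the notes say that search may need to be limited to comments. Matching is on whole words only, so that, for example, "master" is not reported inside "mastering".

```python
import re

# Hypothetical data structure mirroring the initial word list above:
# flagged term -> suggested alternatives.
WORD_LIST = {
    "whitelist": ["allowlist", "safelist", "acceptlist"],
    "blacklist": ["denylist", "blocklist", "rejectlist"],
    "master": ["primary", "main"],
    "slave": ["secondary", "replica"],
    "crazy": ["peculiar", "baffling"],
    "dummy": ["placeholder"],
}

# One case-insensitive pattern; \b keeps matches on whole words only.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in WORD_LIST) + r")\b",
    re.IGNORECASE,
)


def find_flagged_terms(text):
    """Return (term, alternatives) pairs for each flagged word in text."""
    return [
        (match.group(1).lower(), WORD_LIST[match.group(1).lower()])
        for match in _PATTERN.finditer(text)
    ]
```

Note that identifiers such as `whitelist_ips` or `isBlacklisted` are not matched by this whole-word pattern; splitting identifiers into separate words first would be needed for that and is left out of the sketch.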

Input requested and very welcome !!!

Particularly on:

What to examine:

Search for these in:

For constructs, report on these only when the construct is declared, not when used, as usage cannot be changed until the declaration has been changed.

Additional notes:

External references:

joemcgill commented 4 years ago

This is a great initiative. Thank you for taking it up! I've found this list of disability terms with negative connotations a helpful resource.

jrfnl commented 4 years ago

@joemcgill Thanks, though the credit should also go to @jdevalk.

Thanks for the link. I've had a look through the list, and there are only a few words there which I can imagine people would ever use in a code context. Maybe I'm wrong, though? Please tell me if I am!

The only ones which jumped out at me from that list (other than those already listed above) were:

And possibly

Specific words to search for with suggestions for alternatives are most helpful to get this off the ground.

@vavroom would you care to comment ?

vavroom commented 4 years ago

Very happy y'all are looking at this kind of thing :)

On the term disabled, I wouldn't be too worried. While for a very long time there's been a push to use "person with a disability", instead of "disabled", there's also been a massive push for using just disabled, by disabled folks. If you look on twitter for the #SayTheWord hashtag, you'll get a feel for it.

Also, the idea of ableist language is using a medical/disability related word in a negative context. A disabled button is pushing that envelope a bit. I'd not be too worried about it. Then again, I would hesitate to use disabled buttons but that's a story for another day :D

@jrfnl points out correctly that dimming the screen is very different from calling someone dim. Again, I wouldn't worry about it.

I'd be curious to know what blind folks think of "double-blind testing". I personally don't view it as objectionable, but then I'm not the target market of that kind of potentially ableist language.

jrfnl commented 4 years ago

@vavroom Thanks for taking the time to give feedback. Much appreciated.

maccath commented 4 years ago

This is a great initiative!

I was also wondering about the terms disabled/enabled.

It feels unnecessary when there are terms like inactive/off/deactivated/restricted that do the job just as well... But I'm not disabled, so I don't think I can speak with any authority. Thanks for your input @vavroom

tomjn commented 4 years ago

I think there should be room to add words that don't have suggested replacements: brazenly, outright inappropriate words, such as the N word, or other derogatory terms, such as calling people with Down's syndrome the M word, or the P word.

ChrisWiegman commented 4 years ago

Really glad to see this. Thank you!

benlk commented 4 years ago

This is a good idea!

In code there is no alternative to that string, much like when spellcheck trips over referer. Even limiting the scope of the sniff for these terms to comments will probably cause a tiring number of false positives.

jrfnl commented 4 years ago

@tomjn Good idea and those words which really shouldn't be used, should probably be an error. I'd be very surprised to ever come across those in code in the first place, but you're right: may as well check for them.

Is it ok if I approach you privately to verify that I interpret the letters you mention correctly ? Or ping you to review the sniff to make sure I have added the right ones ?
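
The warning/error split mentioned above could be sketched as follows. This is only an illustration in Python under stated assumptions, not PHP_CodeSniffer code: the `Term` class and the `severity_for` and `message_for` helpers are hypothetical names, and the rule "no suggested alternatives means error" is one possible reading of the suggestion.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """A flagged term; an empty alternatives list means 'never acceptable'."""
    word: str
    alternatives: list = field(default_factory=list)


def severity_for(term):
    """Warn when a replacement can be suggested, report an error otherwise."""
    return "warning" if term.alternatives else "error"


def message_for(term):
    """Build the report message for a flagged term."""
    if term.alternatives:
        return (f'Found "{term.word}"; consider: '
                + ", ".join(term.alternatives))
    return f'Found "{term.word}"; this word should not be used.'
```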

tomjn commented 4 years ago

Sure, but it's hardly an exhaustive list, and the P/M words might be more used in the UK than internationally. Happy to review

benlk commented 4 years ago

If the sniffer is going to sniff for a list of naughty words, https://github.com/LDNOOBW/List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words aims to have comprehensive lists.

maccath commented 4 years ago

I think there's a difference between 'naughty' words and exclusionary language. For example, I can see a bunch of anatomical and sex related words on those lists which aren't necessarily used in a demeaning and derogatory way; it's less clear cut - and could end up being exclusionary in and of itself.

jdevalk commented 4 years ago

Honestly we should probably make that sniff a separate issue and just go for the main goal here.

Jurigag commented 4 years ago

I will add my honest opinion, which will get a lot of downvotes I guess: people and their behaviors are racist, ethnocentric, ableist etc., not the words themselves when used in a totally different context.

But yes, if it won't be enabled by default in this package or ever included in core, then why not? If someone wants to use it, feel free, I guess. Apart from that, the idea is pretty cool, because I understand that whitelist/blacklist is not really great naming and allowlist/denylist is much more self-explanatory. But creating it as NonInclusiveLanguage and justifying it by racism is just wrong, because those words are not racist themselves; people using them in the wrong context are.

You could combine the word "black" with many other nouns which, if said in the wrong context, can be racist and offensive as well, not only blacklist. If we are really going this way, we should ban the word "black" in combination with anything else, just to make sure that it's bulletproof for the future.

At this moment i am offended by this issue and description of it because of this:

Specifically, the sniff should look for sexist, racist, ableist or ethnocentric language, which can contribute to a hostile work environment.

So you are telling me that if I currently use blacklist/whitelist I am racist, and you suggest that I may have a hostile work environment? The idea for this sniff is great, but the explanation of why it's needed is wrong. Honestly, this description should be changed.

vavroom commented 4 years ago

@Jurigag The thing with racist words, or ableist words, is that it's about the people who are on the receiving end of those words. For example, I run https://ableist.is, a site to help make people aware of their own ableist language. I sometimes point people to that site. I am regularly told things like "I didn't mean crazy in a bad way, I'm not ableist". And it doesn't matter at all what they meant. What matters is that there are a lot of people for whom that word evokes really bad stuff.

The fact that you are offended that the language used in projects could create hostile work environments indicates that you are not likely part of one of the groups that are routinely discriminated against.

The words you use, the actions you take, do not mean that you are racist (or sexist, or ableist). But it's not about you. It's about people that these words hurt. With all due respect, your intentions mean very little. I did not intend to drop a glass but I did and my wife stepped on broken glass bare foot and got hurt. My intentions there mean very little. End result is my wife got hurt. It's a similar thing with racist words, or ableist words, or other words.

And when words like that are used in projects, people may feel, consciously or not, that it creates a hostile environment. Every time I hear people use words like "cripple", "lame", "crazy", it feels like yet another micro-aggression. I deal with these things several times a day, every day. Each instance isn't particularly bad. Just like one paper cut isn't particularly bad. But if you add them up at the end of the day, the week, the month, the year, it takes its toll.

Check your privilege.

Jurigag commented 4 years ago

How is blacklist hurting anyone? The origins of this word are not related to skin color or race; we as people only recently added these racist implications to it.

Then why not ban the word "black" altogether? We can come up with many combinations with other nouns which could make black people feel offended.

maccath commented 4 years ago

At this moment i am offended by this issue and description of it

Nobody said you were racist/sexist/ableist; we said the language is.

You've been informed and you have a choice.

Make of that what you will.

Jurigag commented 4 years ago

At this moment i am offended by this issue and description of it

Nobody said you were racist/sexist/ableist; we said the language is.

You've been informed and you have a choice.

Make of that what you will.

Yea you said that language is, but also my work environment can be hostile due to those words and that's why i feel offended.

maccath commented 4 years ago

my work environment can be hostile due to those words and that's why i feel offended.

So don't use them?

tomjn commented 4 years ago

This issue is concerned with the implementation and details, it isn’t the venue to air personal political opinions, let’s keep the issue focused, constructive, and move forward.

Jurigag commented 4 years ago

my work environment can be hostile due to those words and that's why i feel offended.

So don't use them?

Why? Those words are not racist for me in the context i use and i will keep using them. Words are not racist, people using them and their behavior to offend other people is.

This issue is concerned with the implementation and details, it isn’t the venue to air personal political opinions, let’s keep the issue focused, constructive, and move forward.

I agree, so then change the description regarding things like a hostile work environment, or the claim that those words are racist without any context. The first post here already airs personal political opinions, and that's why I am concerned about this. I agree with the idea, but I feel this is yet again some kind of attack on other people, like "hey, you are racist, or you may have a hostile work environment, if you currently use those words".

vavroom commented 4 years ago

Why? Those words are not racist for me in the context i use and i will keep using them. Words are not racist, people using them and their behavior to offend other people is.

I can only repeat what I said earlier. It's not about you. It's not about your intentions. It's about how people can react to these words.

Using racist or ableist words don't necessarily make you racist or ableist. You may not intend to create a hostile work environment. Nobody is saying you are racist. But. It's not about you.

Check your privilege.

Jurigag commented 4 years ago

Why? Those words are not racist for me in the context i use and i will keep using them. Words are not racist, people using them and their behavior to offend other people is.

I can only repeat what I said earlier. It's not about you. It's not about your intentions. It's about how people can react to these words.

Using racist or ableist words don't necessarily make you racist or ableist. You may not intend to create a hostile work environment. Nobody is saying you are racist. But. It's not about you.

Check your privilege.

And I just disagree with this, because this way we will end up banning the whole word "black", simple as that. And that's what I also propose if we want to eliminate any racist implications in our code.

This issue and feature would be great, but without the racist implications and without suggesting that someone has a hostile work environment because they use these words. There are many other ways to explain why we should not use blacklist/whitelist in programming, for example that they are not self-explanatory; allowlist/denylist are much more so.

vavroom commented 4 years ago

Check. Your. Privilege.

'nuff said.

Jurigag commented 4 years ago

And I checked: currently there is freedom of speech, and I can use any words I want. And everyone has that privilege. Ideas like this try to remove some words from use/vocabulary and to reduce freedom of speech, because someone suggests that they have racist implications. You still didn't answer why not ban the word "black".

You are currently saying that no matter what words I use and what I mean, if someone with a different skin color feels offended by it, I am racist. This logic is just wrong.

jrfnl commented 4 years ago

@Jurigag I'm going to ask you kindly to remove yourself from this discussion.

  1. Like @tomjn said, your comments are not adding anything relevant to the issue at hand and can be interpreted as hostile and destructive to the discussion.
  2. Words like blacklist and whitelist are coming from a racist history. They are metaphors where "white" was associated with "good" and "black" with "bad". The fact that you don't intend them to be perceived as racist, doesn't mean they are not. Please do a simple internet search and educate yourself before commenting on these kind of issues again. P.S.: and that is something completely different from using the word "black" purely as a colour, which is the literal meaning and if used as such, not a problem.
  3. Even if you don't see it, because frankly that's irrelevant, non-inclusive language is part of the problem and causes micro-aggression on a daily basis, as @vavroom explained far more eloquently.
  4. As has been said before: Check Your Privilege. You say "there is freedom of speech", well that may be the case in your country. You can disagree with this issue, again, it is a privilege that you have the freedom to do so. Please do a simple internet search on privilege and educate yourself.
  5. Nobody is forcing you to use this sniff once it is created.

Please regard this as a formal warning.

benlk commented 4 years ago

Is this just for American English, or will there be region-specific sniffs for other countries' dialects of English?

Is there a significant non-English-speaking PHP community that would justify creating a set of sniffs for non-English languages?

The reason I ask these questions is because separating sniffs by region or language may be easiest to implement from the beginning, rather than adding afterwards once people have integrated the first-contributed sniff into their workflow.

benlk commented 4 years ago

A downside of region-based or language-based sniffs is that it would result in code duplication across sniffs where different cultures share some noninclusive words or phrasings. To reduce code duplication, would it instead make sense to have separate sniffs for each separate sort of noninclusive language, allowing sniff-runners to choose which noninclusive language sniffs apply to their situation?

As an example, having a sniff for disabled might cause problems for codebases that deal with <input> elements, whereas a codebase that doesn't deal with <input>s might prefer to include that sniff.

jrfnl commented 4 years ago

@benlk Thanks, that's useful input and actually something I have been thinking about, though I haven't taken a decision yet.

My current thoughts are along the following lines:

Code duplication won't be much of an issue as that can be prevented by using an abstract sniff and/or traits for the shared code. It's one of the reasons this sniff library is built on top of PHPCSUtils which offers a lot of that kind of tooling to make my life easier ;-)
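
To illustrate the abstract-sniff idea in general terms, here is a minimal Python sketch of shared matching logic in a base class with one small subclass per kind of language. It is not PHP_CodeSniffer or PHPCSUtils code; the class names, the grouping of terms into classes and the message format are all hypothetical.

```python
import re
from abc import ABC, abstractmethod


class AbstractLanguageSniff(ABC):
    """Shared matching logic; subclasses only supply their word list."""

    @abstractmethod
    def terms(self):
        """Return a dict of flagged term -> list of suggested alternatives."""

    def check(self, text):
        """Return one message per whole-word match found in text."""
        terms = self.terms()
        pattern = re.compile(
            r"\b(" + "|".join(map(re.escape, terms)) + r")\b", re.IGNORECASE
        )
        return [
            f'"{m.group(1)}" found; consider: '
            + ", ".join(terms[m.group(1).lower()])
            for m in pattern.finditer(text)
        ]


class ListTermsSniff(AbstractLanguageSniff):
    """Hypothetical sniff covering the whitelist/blacklist terms."""

    def terms(self):
        return {
            "whitelist": ["allowlist", "safelist"],
            "blacklist": ["denylist", "blocklist"],
        }


class AbleistTermsSniff(AbstractLanguageSniff):
    """Hypothetical sniff covering ableist terms."""

    def terms(self):
        return {"crazy": ["peculiar", "baffling"]}
```

With this shape, a project could enable only the subclasses that suit it, while the matching code lives in one place.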

benlk commented 4 years ago

You've reminded me of the ignore comments, and while I agree that those work in some situations, I'm not sure that they're the right option for inclusiveness sniffs. Having one monolithic NonInclusiveLanguage sniff implies that there is One True Way™ to do Inclusiveness™.

Including by default a sniff for gendered language would be anti-useful to an organization whose practice of inclusiveness involves gendering people by their desired gender. (Anecdote: As they/them is increasingly used for a third gender role in English-language discourse, the more people I see who reject its indiscriminate application to everyone as a form of mass misgendering, each instead desiring for themself he/him or she/her.) If such an explicitly-gendered organization wanted to add inclusiveness sniffs, requiring them to add comments around all their gender-someone-correctly code would be an obstacle to incorporating all of the other inclusiveness sniffs.

I've already said my piece on cases where a sniff for disabled might or might not be wanted, but the barrier for adopting inclusiveness sniffs is higher for an organization that would need to sprinkle their codebase with comments in order to adopt a monolithic NonInclusiveLanguage sniff.

Separating the inclusiveness sniffs into separate sniffs cuts the Gordian Knot of competing access needs by allowing each sniffer to use the sniffs that suit their community's needs, without requiring defensive commenting against the sniffs that satisfy other communities' needs.

There's precedent for splitting out sniffs: this repo has contradicting sniffs for the PHP short list syntax; it doesn't package a monolithic ListSyntax sniff.

tomjn commented 4 years ago

I can foresee that adding a choice of pronouns on a user profile would be caught by the sniff as originally proposed. Is there a mitigation that can be applied? Such as restricting to code comments, or testing for conditional structures or pronoun selections in surrounding code?

jrfnl commented 4 years ago

I can foresee that adding a choice of pronouns on a user profile would be caught by the sniff

AFAICS, the sniff as proposed would not be triggered on that as those would be contained in text strings and the proposal does not cover those.

Ayesh commented 4 years ago
theresenabl commented 4 years ago

@jrfnl:

Words like blacklist and whitelist are coming from a racist history. They are metaphors where "white" was associated with "good" and "black" with "bad". The fact that you don't intend them to be perceived as racist, doesn't mean they are not. Please do a simple internet search and educate yourself before commenting on these kind of issues again.

That is simply a lie. Black-and-white dualism has a loooong history of being used as a metaphor. It's traceable to 4th century BC and is connected with the day cycle when after the day comes the dark night. It goes back to "Table of Opposites" by Pythagoras (yes, the same Pythagoras). Like you said, please, next time, educate yourself and don't create your own theories.

As for blacklist/whitelist, it is overused in the tech world, and sometimes there are better alternative names that tell other programmers more. Is it the role of CodeSniffer to check for that? I am not sure. A lot is lost without context, and I think that the sniffer won't ever work properly.

vavroom commented 4 years ago

That is simply a lie

I have a problem with such accusations. A lie generally implies intent to deceive.

Perhaps the information @jrfnl found isn't accurate - but it is "common knowledge" that is floating around a LOT and I wouldn't blame anyone for accepting it as making sense.

benlk commented 4 years ago

It's probably beyond the scope of this sniff to suggest specific alternatives to blacklist/whitelist and master/slave, because what the functions labeled with those words actually do varies by project.

However, a sniff for those words could link to lists of suggested alternatives, such as https://tools.ietf.org/html/draft-knodel-terminology-01#section-1.1.1 for master-slave and https://tools.ietf.org/html/draft-knodel-terminology-01#section-1.2.1 for blacklist-whitelist. Those two links are from a draft RFC that never made it past the draft stage, but I'm sure there are similar yet authoritative lists from standards bodies that can be referenced in the sniffs' messages.

tomjn commented 4 years ago

Choosing a specific perfect replacement for master/slave or whitelist/blacklist shouldn't be necessary; there are lots of alternatives that fit various contexts while remaining inclusive.

Foxar commented 4 years ago

We should just straight up use 1984's Newspeak to be 100% sure no-one is offended by anything.

Foxar commented 4 years ago

I thought I had lost all faith in humanity, but if the above poster missed my sarcasm, then I have lost the one bit of faith in humanity I didn't know I still had.

You people are so paranoid about racism, x-phobias and 'oppression' that even if we were all identical Doctor Who Cybermen you'd still complain. You stop disenfranchisement of minorities by stopping seeing them as fragile minorities in need of help but as your equal. You stop racism by ignoring race, not making everything about race. And censoring half of the damn language to try and make offensive language impossible is, ironically, impossible itself.

TL;DR Chill the *** out everyone.

tomjn commented 4 years ago

This is a professional setting. Juliette's warning was not an invitation to express personal political beliefs. The question of whether this has any impact on people is not the purpose of this issue (and has been thoroughly answered elsewhere, and with firsthand examples here).

Let’s move forward constructively and respectfully.

Foxar commented 4 years ago

Is disagreeing with the change altogether constructive and respectful?

jrfnl commented 4 years ago

Is disagreeing with the change altogether constructive and respectful?

@Foxar No, that's not helpful and outside of the scope of this issue.

Nobody is forcing you or anyone else to use this sniff once it is created, so if you disagree with the principle of it, just don't use it. It's as simple as that. You don't need to tell us, you don't need to add a comment to this discussion. Just don't use it.

This issue is open to allow people who are interested in using this sniff to voice their opinion about the proposed implementation, nothing else.

jdevalk commented 4 years ago

Saw passlist / stoplist as suggestions here:

https://twitter.com/dan_abramov/status/1272242325029257223?s=21

vavroom commented 4 years ago

Alternative for "crazy": "weird"?

@JapanYoshi Yes, weird could be a good alternative. Depending on context, wild could also be used. There are several possibilities.

zlodes commented 4 years ago

I want to believe that this is a joke... ☹️

Big-Shark commented 4 years ago

@Jurigag I'm going to ask you kindly to remove yourself from this discussion.

Democracy and freedom of speech are not welcome here.

lsmith77 commented 4 years ago

for word lists have a look at https://alexjs.com/

lsmith77 commented 4 years ago

Also, for just removing belittling words: https://github.com/OskarStark/doctor-rst/blob/master/src/Rule/BeKindToNewcomers.php