inclusivenaming / website

Website for the Inclusive Naming Initiative
https://inclusivenaming.org/
Creative Commons Attribution 4.0 International

chore: better suggestions for blacklist / whitelist #45

Open Nytelife26 opened 3 years ago

Nytelife26 commented 3 years ago

See Google's v8 PR and the KCS open statement for more information.

edwarnicke commented 3 years ago

@Nytelife26 Thanks for raising this. It's good to get the perspective of folks beyond just English speakers.

One question though: It wasn't entirely clear to me from the linked issue, are you advocating for:

list of blocked ${something} and/or blocked ${something} list

or were those intended to be illustrative?

edwarnicke commented 3 years ago

@Nytelife26 Apologies... I parsed this first time through as an Issue and not a PR, and thus didn't look at the actual change you proposed, which is clearer :)

Nytelife26 commented 3 years ago

or were those intended to be illustrative?

Mostly illustrative. Solutions for this are not yet concrete, but it is clear to see that the ones decided upon currently are not suitable, and so we need new ones. Those solutions sacrifice brevity for clarity, which shouldn't be an issue, but I believe people may encounter issues with the verbosity. Any better ideas are always welcome and appreciated, so long as they are not more harmful than good.

Apologies... I parsed this first time through as an Issue and not a PR, and thus didn't look at the actual change you proposed, which is clearer :)

No worries! Always glad to see people asking questions - it means they're paying attention and willing to see from the other side.

markcmiller86 commented 3 years ago

Sorry, didn't mean to comment inline with the specific change being discussed...

@markcmiller86 Perhaps, can you think of any examples where that is the case?

Well, any example would ultimately have something to do with cases where the author of the material (or the interface/mechanism being described) for some reason has a sort of double-negative situation. So, if the default behavior is to prevent access except in special cases...inclusion might wind up meaning to be included in the list of things that are prevented access and exclusion might be to be in the list of things where access prevention does not apply. It's probably a silly example. But inclusion and exclusion leave the question of "in what?" slightly vague, whereas permit and deny seem to me to be less so.

One other dimension to this discussion is the distance between the two (opposing) words. Ever get tired of telephone menu systems that say hit 1 for yes and 2 for no? Those two buttons are right next to each other on a telephone keypad. It's far too easy for someone wanting to hit 1 to actually hit 2 and vice versa. It should be 1 for yes and 9 for no to reduce this error rate.

Am wondering the same thing about these word pairings. ex and in are the only letter diffs in these 9-letter words, which otherwise look (think dyslexia) and sound (think hearing impairment or speech impairment or simply strong accents) quite similar. Is this a problem?

Nytelife26 commented 3 years ago

inclusion might wind up meaning to be included in the list of things that are prevented access and exclusion might be to be in the list of things where access prevention does not apply.

That depends largely on context. Ultimately, that's the point of calling them "inclusion list" and "exclusion list" - they are inclusive or exclusive of an entity from any given situation at hand, which is exactly what "whitelist" and "blacklist" meant, only clearer.

look (think dyslexia) and sound (think hearing impairment or speech impairment or simply strong accents) quite similar. Is this a problem?

I am not entirely sure, actually. The ex sound as opposed to in is a strong enough difference to make speech impairments, hearing impairments, and strong accents able to work with them.

I cannot weigh in on dyslexia though, as I do not have it, and I fear that speaking to people I know that do have it will result in the "{x} people do not speak for everyone" argument.

"Authorization list" and "prohibition list" would work fine even if not, but be less accessible to those of lower English proficiency - although, at least translatable.

markcmiller86 commented 3 years ago

That depends largely on context. Ultimately, that's the point of calling them "inclusion list" and "exclusion list" - they are inclusive or exclusive of an entity from any given situation at hand, which is exactly what "whitelist" and "blacklist" meant, only clearer.

OK, that makes sense.

I am not entirely sure, actually. The ex sound as opposed to in is a strong enough difference to make speech impairments, hearing impairments, and strong accents able to work with them.

While I am inclined to believe that, I am not a linguistics/hearing subject matter expert. So, I wonder if we have better metrics than just our own (potentially biased) sense of things. But, this does raise a key point, for me anyways...that I've only been thinking in terms of written word and the truth is, people will also need to speak and hear these words and we should be operating with some sensitivity to that as well. Up until this dialog, I had not.

I cannot weigh in on dyslexia though, as I do not have it, and I fear that speaking to people I know that do have it will result in the "{x} people do not speak for everyone" argument.

I have number dyslexia :smile: (which I think is now called dyscalculia) and word dyslexia when I am very fatigued.

Nytelife26 commented 3 years ago

better metrics

Unfortunately, I'm not entirely certain how we could do that without conducting, to some extent, a study on the matter via experiment and statistical evaluation. As far as my knowledge of linguistics goes, it should be fine, but maybe you're right, it might be better to measure these things up.

people will also need to speak and hear these words and we should be operating with some sensitivity to that as well

Interesting point. Maybe I should add that into the accessibility metric of my evaluation framework.

I have number dyslexia [...] and word dyslexia when I am very fatigued

In your experience, would you find the two words hard to distinguish? Once again, I know you don't speak for everybody with the condition, but I'm just curious. All the best to you with that though - my father has them too and I help him proofread his writing.

markcmiller86 commented 3 years ago

In your experience, would you find the two words hard to distinguish? Once again, I know you don't speak for everybody with the condition, but I'm just curious. All the best to you with that though - my father has them too and I help him proofread his writing.

For written word, I think my experience of confusing inclusion with exclusion is low. For clear (e.g. no noise or signal degradation due to any of a number of factors) spoken word, I think the risk of confusion is low but not as low as for written. Outside of that, possibly narrow, situation, I think the risk of spoken-word confusion goes up to at least moderate and maybe even high.

Nytelife26 commented 3 years ago

Outside of that, possibly narrow, situation, I think the risk of spoken-word confusion goes up to at least moderate and maybe even high.

That's an interesting roadblock. I may have to trial this.

Although, under these other circumstances, is it not also likely that the sentences will be misheard altogether? Perhaps this is my limited view, but I am not sure how ex can be misheard as in or vice versa unless the sound is omitted in hearing altogether, at which point of course one would need to request that the speaker repeats themselves.

If this is an issue we may need to incorporate spoken considerations into our suggestions.

As aforementioned, though, these are not concrete guidelines. If necessary when vocalizing such phrases people are welcome to use one of the other alternatives, such as the more concise "list of banned X" or "list of allowed X".

If anyone else has any thoughts on this, I would be greatly interested in hearing them.

edwarnicke commented 3 years ago

@Nytelife26 One other thought that occurred to me over the weekend. 'whitelist' and 'blacklist' are also used as verbs meaning 'to add something to the list of things allowed/permitted/included (or denied/excluded)'. It seems like the conversation so far has been around those terms as nouns... do you have thoughts on them as verbs?

Nytelife26 commented 3 years ago

do you have thoughts on them as verbs?

It occurs to me that the verb usage is actually the easiest part of the problem - you can easily substitute with "exclude" and "include", or "allow" and "prohibit", just as some examples. You do not ever need to use the phrases themselves as verbs.

Does anyone say "mailing-listed", or are you likely to say "You've been added to the mailing list"?

The same should go here. You should either use an appropriate substitute verb, or use phrasing similar to "You have been added to the [inclusion / exclusion] list".

markcmiller86 commented 3 years ago

That's an interesting roadblock. I may have to trial this.

So, I don't wanna create roadblocks and sorry if my excursion into these other aspects did. For this PR, it's likely best to remove the questions I raised regarding the potential for confusion in various settings and due to various impairments that may be present. That may be somewhat unique to the case of word-pairings intended to have opposite meanings anyways.

That said, I do think consideration of those questions as part of the process of rating replacements may lend even more credibility to our final recommendations.

Nytelife26 commented 3 years ago

So, I don't wanna create roadblocks and sorry if my excursion into these other aspects did.

Oh, no, it's fine, I didn't mean it like that. I'm just saying it's a very interesting perspective, and one that might add some necessary complexity to the problem at hand.

Which is fine - if anything, that furthers the need for this change.

may lend even more credibility to our final recommendations

Definitely. We need to consider as much as possible before arriving at something concrete. For now, though, we have a good set of suggestions.

What's the next move?

markcmiller86 commented 3 years ago

What's the next move?

One thing I am inclined to start doing is gathering together some existing refs on metrics for language. @quaid already mentioned plain language. I think there are things related to measuring distance between words (written and spoken) and I would love to see something that indicates relative (or absolute) probabilities of misinterpretation (due to impairments of one form or another) of terms. If such things exist, it would be good to gather them for review to understand if/how they could be used profitably for our goals.
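As a rough illustration of what "distance between words" could mean for the written form, here is a minimal Python sketch of Levenshtein edit distance. This is only one possible metric and not one the thread has settled on; the function name `edit_distance` and the word pairs are hypothetical choices for illustration, and the measure says nothing about how the words sound when spoken.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions needed to turn a into b."""
    # previous[j] holds the distance between the first i-1 characters of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, ch_b in enumerate(b, start=1):
            substitution_cost = 0 if ch_a == ch_b else 1
            current.append(
                min(
                    previous[j] + 1,                      # delete ch_a
                    current[j - 1] + 1,                   # insert ch_b
                    previous[j - 1] + substitution_cost,  # substitute or match
                )
            )
        previous = current
    return previous[-1]


if __name__ == "__main__":
    # Hypothetical pairs taken from terms mentioned in this discussion.
    pairs = [
        ("inclusion", "exclusion"),
        ("allowlist", "blocklist"),
        ("permit", "deny"),
    ]
    for first, second in pairs:
        print(f"{first} / {second}: {edit_distance(first, second)}")
```

For "inclusion" and "exclusion" the distance is 2, which matches the earlier observation that only the leading two letters differ. A smaller distance suggests a pair is easier to confuse in writing, though any real recommendation would need the kind of study discussed above.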

Nytelife26 commented 3 years ago

Interesting. I'll look into plain language when I get the time.

Ultimately, aside from the criteria I laid out in my statement, we need to consider interpretability, clarity, and proliferated meanings.

carlaquinn commented 3 years ago

Considering translation is really important. I've worked with our translation centers as we've updated terms at IBM, providing new terms with definitions and other explanatory information. We now have a full set of approved equivalents (translated terms) for "allowlist" and "blocklist" in all of the languages we translate to. It's always important to try to choose terms that make sense and can be translated effectively, but providing terms up front with supporting material is also a good way to ensure good translations.

Are we going to eventually share translations of our terms?

Nytelife26 commented 3 years ago

Considering translation is really important. I've worked with our translation centers as we've updated terms at IBM, providing new terms with definitions and other explanatory information. We now have a full set of approved equivalents (translated terms) for "allowlist" and "blocklist" in all of the languages we translate to. It's always important to try to choose terms that make sense and can be translated effectively, but providing terms up front with supporting material is also a good way to ensure good translations.

Are we going to eventually share translations of our terms?

The point here was to highlight the issues with the currently used alternatives. Having to form new translations makes things more difficult for both the people working on these projects and the people using the languages themselves.

I would, however, encourage IBM to release this information so others can make use of it. Otherwise there's no point having it at all.

Either way, the alternatives we have highlighted are designed with translation in mind. There has been and will be no need to co-ordinate new terms in any languages because these terms are directly translatable.

Part of the issue is that "allowlist" and "blocklist" are not valid English themselves. If something isn't valid in its host language how can we translate it to any targets?

Thank you for sharing, and weighing in as an employee under a high profile enterprise. If IBM has any intention to release its findings to the public, that would be great.

markcmiller86 commented 3 years ago

Part of the issue is that "allowlist" and "blocklist" are not valid English themselves. If something isn't valid in its host language how can we translate it to any targets?

I am not convinced the valid-english-ness of these examples is either very much true or a very strong argument as a general principle. I worry that so much of tech-industry language is of this ilk that seeking only valid English replacements may too significantly limit our choices.

Nytelife26 commented 3 years ago

either very much true or a very strong argument as a general principle.

You will notice something about compound nouns in the verb + noun form - they are all separated by a space (two words, rather than joined together) and almost all use the -ing form of the verb. English does not have many set rules, so it's more about what makes the most sense.

However, neither "allowlist" nor "blocklist" conform to these standard principles. In general, they do more harm than good, as stated earlier in this thread. Even saying "a blocked list" would mean to block the list rather than its contained items. A blocking list, however, might make sense, but still sounds somewhat unusual.

This is a problem that persists from the original terms themselves, too. We have the chance to fix it now that we are shifting away from the originals, so we should.

only valid English replacements may too significantly limit our choices.

The issue here is that English is not the only language on Earth. By treating it as such, we create linguistic proficiency barriers for those of foreign backgrounds and also of lower proficiency in general.

The terms you create in a language should conform to the language they are intended to be used in. Not following that both creates unusual and invalid constructs in the host language and makes them impossible to translate without arranging replacements in target languages, as kindly reinforced by Carla.

carlaquinn commented 3 years ago

Not impossible to translate, but we like to establish and record equivalents for terms so that we can have consistency. I'm not sure what the best format to share translations is -- we'll need to set something up in the spreadsheet. I can post the translations here if there's immediate interest.

Nytelife26 commented 3 years ago

Not impossible to translate, but we like to establish and record equivalents for terms so that we can have consistency.

If you need to find a synonym instead of being able to translate terms directly, the terms are, by definition, not directly translatable. That's fine of course, but unnecessary and inaccessible compared to terms that are directly translatable. Not to mention there isn't a reverse map for these other languages. But I see your point.

I can post the translations here if there's immediate interest

That would be grand, thank you.

carlaquinn commented 3 years ago

@Nytelife26 I think we misunderstand each other. I provided "blocklist" and "allowlist" to our translators and these are the terms they translated. These terms are directly translatable. Here are the translations for "blocklist". I'll post more once we figure out the best way to do it.

Brazilian: lista de bloqueios; Part of speech: noun
Bulgarian: списък с блокирани; Part of speech: noun
Catalan: llista de bloquejos; Part of speech: noun
Croatian: lista nedozvoljenog; Part of speech: noun
Czech: seznam blokování; Part of speech: noun; Related terms: seznam povolení
Czech: seznam blokovaných; Part of speech: noun
Danish: blokeringsliste; Part of speech: noun
Dutch: lijst van geblokkeerde afzenders; Part of speech: noun; Related terms: lijst van toegestane afzenders
Finnish: estolista; Part of speech: noun
French: liste rouge; Part of speech: noun; Related terms: liste autorisée
German: Blockierliste; Part of speech: noun; Related terms: Zulassungsliste
Greek: λίστα αποκλεισμένων αποστολέων; Part of speech: noun; Related terms: λίστα επιτρεπόμενων αποστολέων
Hungarian: tiltólista; Part of speech: noun
Indonesian: daftar blokir; Part of speech: noun
Italian: elenco bloccati; Part of speech: noun; Related terms: elenco consentiti
Japanese: 不許可リスト; Part of speech: noun
Japanese: 警戒対象リスト; Part of speech: noun
Japanese: ブロック・リスト; Part of speech: noun
Korean: 차단 목록; Part of speech: noun; Related terms: 허용 목록
Malay: senarai sekat; Part of speech: noun
Norwegian: blokkeringsliste; Part of speech: noun; Related terms: godkjennelsesliste
Polish: lista zablokowanych; Part of speech: noun; Related terms: lista zaakceptowanych; Context: MAIN0
Portuguese: lista de bloqueios; Part of speech: noun; Related terms: lista de permissões
Romanian: listă de blocare; Part of speech: noun
Russian: черный список; Part of speech: noun; Related terms: белый список
Simplified Chinese: 阻止列表; Part of speech: noun; Related terms: 允许列表
Slovakian: zoznam blokovaní; Part of speech: noun; Related terms: zoznam dôveryhodných pripojení
Slovenian: seznam blokiranih; Part of speech: noun; Related terms: seznam dovoljenih
Spanish: lista de elementos bloqueados; Part of speech: noun; Related terms: lista de elementos permitidos
Swedish: blockeringslista; Part of speech: noun
Thai: รายการที่บล็อก; Part of speech: noun; Related terms: รายการที่อนุญาต
Traditional Chinese: 封鎖清單; Part of speech: noun
Turkish: engelleme listesi; Part of speech: noun; Related terms: izin listesi

Nytelife26 commented 3 years ago

Quite a few of those are not direct translations, which is what I was pointing out. For instance, the Spanish "lista de elementos bloqueados" is "list of blocked items" - a similar phrase, and one that would be better suited, but not a direct translation of blocklist. It is also admittedly an unusual construct, since Spanish is a very contextual language, but I suppose understandable.

I obviously do not know every language you have provided, but many are not direct - and it makes more sense to use already existing terms with more clarity and the same meaning rather than proliferating the language and making new ones.

Nytelife26 commented 3 years ago

Status on this, anyone?

Nytelife26 commented 3 years ago

For a broader scope of discussion, I'll be making a similar pull request into Chromium's guidelines later, so we'll see how that goes. Thank you to everyone that has participated so far.

markcmiller86 commented 3 years ago

Quite a few of those are not direct translations

Just curious but given that organizations like Intel, IBM and Google (I mention those as examples of organizations with international scope) have already indicated that differing contexts/communities likely require different replacements anyways, how big a priority do we think direct translation should be?

Nytelife26 commented 3 years ago

Just curious but given that organizations like Intel, IBM and Google (I mention those as examples of organizations with international scope) have already indicated that differing contexts/communities likely require different replacements anyways, how big a priority do we think direct translation should be?

It may not be direct, but the fact is, those aren't translations at all. Blocklist and the like are not translatable without first knowing what they mean in their host language and then using the closest synonymous representation that has a translation. It is vaguely equivalent to me creating a term, for instance, banentry, that does not work in its host language, and then claiming it is fine because you can infer what it means and it can be translated through "prohibition list entry" instead.

It doesn't work like that. Not to mention that, in instances where a human translator is not available, machine translations (like Google Translate itself) MUST be able to bridge the gap.

edwarnicke commented 3 years ago

@Nytelife26 I think I tend to think of the distinction you are making in terms of translation vs transliteration. You can't transliterate an idiom (and I think your underlying point is that too many terms are idiomatic). If you translate the component parts of an idiom you lose the meaning. So you are forced to increase the vocabulary of translations to include the idiomatic phrases themselves... which can be done, but is both much more work (especially across multiple languages and idioms) and much more error prone.

Nytelife26 commented 3 years ago

@Nytelife26 I think I tend to think of the distinction you are making in terms of translation vs transliteration. You can't transliterate an idiom (and I think your underlying point is that too many terms are idiomatic). If you translate the component parts of an idiom you lose the meaning. So you are forced to increase the vocabulary of translations to include the idiomatic phrases themselves... which can be done, but is both much more work (especially across multiple languages and idioms) and much more error prone.

Precisely. It is more work, less accessible, more error prone, and creates an unnecessary blockade for people - which is something we should strive to avoid when there are better alternatives.

Nytelife26 commented 3 years ago

Are there any further discussions to be had about this? The verdict seems pretty clear so far, but I am open to further critique or questioning.

edwarnicke commented 3 years ago

@celestehorgan @justaugustus Has this been raised in the language workstream?

carlaquinn commented 3 years ago

We are just beginning our discussion of terms in the language workstream, so this discussion seems a little premature.

Nytelife26 commented 3 years ago

We are just beginning our discussion of terms in the language workstream, so this discussion seems a little premature.

Where does this take place? I would be interested in participating if it's an open thing.

edwarnicke commented 3 years ago

@Nytelife26 the language workstream is indeed open :) Their meetings and comms channels are listed here: https://inclusivenaming.org/workstreams/ :)

Nytelife26 commented 3 years ago

I will be discussing this at the next INI meeting, as I am now a member of the language workstream.

Nytelife26 commented 3 years ago

Now that we've started our process for writing recommendations, this should almost certainly come up in the list of terms to discuss. This should be resolved soon :)

Thank you all, and there are some amazing people to work with here. I do not intend to make this my first and last contribution.