Closed ghost closed 8 years ago
Good point, @dpyro, this is a technical issues forum, not Tumblr, take your "social issue in technical clothing" to a personal rant elsewhere, preferably a personal journal or a psychologist, please.
I know, it's hard for some people lower on the bell curve to distinguish the difference between two completely different web sites.
I mean, both GitHub and Tumblr have an editing box and a submit button! So confusing.
There seem to be two issues here:
1 - The "we shouldn't guess gender at all" nonsense - I'd suggest that the project maintainer completely ignore such a downright stupid request and that anyone offended by a piece of software having this feature go and fork the code to remove it.
2 - The algorithm used is not accurate - the solution here is to mark it as experimental and do the work to bring it up to a higher accuracy.
The actual subject line of this issue is #1, and so I would suggest closing this one as "won't fix" and not wasting time trying to satisfy people who will never be satisfied but instead focusing on actually improving the code.
On the technical side of things, may I suggest a probabilistic approach to gender guessing? Not even humans are able to perfectly guess gender from arbitrary natural language strings, but it should be possible to get close to human levels of performance. Personally I'd just throw it all into a Bayesian classifier, but there's probably a better way to do it.
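To make the "Bayesian classifier" idea concrete, here is a minimal naive Bayes sketch. Everything in it is an assumption for illustration: the toy training list, the suffix features, and the `guess` threshold are invented here and are not what this library does. The point is only that the output is a probability, and that the code can decline to answer when it isn't confident.

```python
import math
from collections import Counter, defaultdict

# Toy training data, invented for illustration only. A real system would
# need a large, representative corpus and would still be wrong sometimes.
TRAINING = [
    ("alice", "female"), ("maria", "female"), ("sophia", "female"),
    ("emma", "female"), ("bob", "male"), ("john", "male"),
    ("david", "male"), ("oscar", "male"),
]


def features(name):
    """Hypothetical feature set: the last letter and the last two letters."""
    n = name.lower()
    return ["last1=" + n[-1:], "last2=" + n[-2:]]


def train(examples):
    """Count labels and per-label feature occurrences."""
    label_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocab = set()
    for name, label in examples:
        label_counts[label] += 1
        for f in features(name):
            feature_counts[label][f] += 1
            vocab.add(f)
    return label_counts, feature_counts, vocab


def predict_proba(name, model):
    """Return {label: probability} using naive Bayes with add-one smoothing."""
    label_counts, feature_counts, vocab = model
    total = sum(label_counts.values())
    log_scores = {}
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior
        # +1 reserves probability mass for features never seen in training.
        denom = sum(feature_counts[label].values()) + len(vocab) + 1
        for f in features(name):
            score += math.log((feature_counts[label][f] + 1) / denom)
        log_scores[label] = score
    # Normalise the log scores into probabilities.
    top = max(log_scores.values())
    exp = {label: math.exp(s - top) for label, s in log_scores.items()}
    z = sum(exp.values())
    return {label: v / z for label, v in exp.items()}


def guess(name, model, threshold=0.9):
    """Return (label or None, probabilities); None means 'not confident'."""
    probs = predict_proba(name, model)
    label, p = max(probs.items(), key=lambda kv: kv[1])
    return (label if p >= threshold else None), probs
```

Calling `guess("xyz", train(TRAINING))` gives no label, because neither suffix was seen in training and both classes come out equally likely. Returning `None` below the threshold is a design choice: it lets callers treat "unknown" as a first-class result instead of forcing a binary answer.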
I think identifying gender is good. There are several uses that align with social justice in a feature and I think that in identifying it, we can find a common ground where this repository will not need to be changed and activists can be happy. For example:
Finally, I want to add: "they" used as a neutral singular sounds cacophonous to non-native English speakers like me. It doesn't feel inclusive, since you're not thinking about how immigrants will feel when you deliberately decide to change a language that takes years to learn. 'They' already has a use as a plural; you should choose either a new word or one that fits but is not widely used.
if your consensus seems to be "don't use issue trackers", why enable it for the project?
I really hope that's a joke nobledemon.
Guessing gender, just like guessing subject, is a perfectly legit feature for a natural language library to have. A small minority find it offensive to guess gender, I say ignore them and do not make your code less useful just to appease them.
@nobledemon i hope that's a joke… you are aware singular "they" has been around since pre-shakespearean times? also since ages before as the pluralis maiestatis.
@nonchip Issue trackers for open source projects are not the correct place to argue semantics of how people communicate. If you think people shouldn't speak English because it's gendered, fork the language. If you think this library shouldn't have gender, then fork it. The issue tracker is for people who encounter issues using the project.
@herpdederp: well, i did encounter an issue: this project is trying to implement the impossible, very poorly.
@GarethNelson It's a joke directed at the bug openers, to show them how useful such a feature would be for their regular campaigning. But perhaps if they see legitimate utility in this they will quiet down.
@nonchip Not joking. Perhaps for a philologist it is obvious, but as a non-native English speaker what you will learn about English is that 'they' is plural. And this has been like this for a long time for foreigners and will continue to be like this. You have to find another solution since you will not manage to establish your change in foreign primary schools teaching English.
If you are so eager in people changing stuff when they might be offensive, I recommend you to not dismiss me automatically. Also majestic plural doesn't seem like a strong case, nor any other use of "they" that has much less use than the actual use which is neutral plural. I don't see why this focus on 'they' really, it's just problematic for foreigners and you don't need specifically to be 'they'.
@scanlime Only an idiot will feel this to be offensive, so you people lost the fight against gamergate, now coming here to mess with this community, get a life! how is this "issue" preventing you from using the library? if you cant use it, fork it and change it
@nonchip I'm going to give the benefit of the doubt and assume you're actually working on a project that uses this library. There is no possibility this would impede you in any way, shape, or form. If you've somehow run into a problem, I'd be more than happy to help.
It is unclear how this "issue" impedes your use of the library. Can you clarify?
I am frankly shocked to find that neither @scanlime or @nonchip have any activity on their GitHub accounts involving NLP software: https://github.com/nonchip?tab=activity https://github.com/nonchip?tab=repositories https://github.com/scanlime?tab=activity https://github.com/scanlime?tab=repositories
I might even go as far to say that their issues with this are purely social and not technical in nature.
To be honest, I think @scanlime 's projects are pretty cool, especially since I find hardware stuff intriguing. However, it's not cool when you come to a non-related project and make someone else do work for you solely on the basis that you feel a certain way.
It also shows a level of ignorance when you cannot consider that not everyone shares your views and that your idea of what "reality" is still in hot debate and is a charged issue. When you open a bug report with a question like "Would you write code to guess someone's race using heuristics?" you are bringing an additional charged issue into the mix. I cannot imagine that someone would do this without meaning some level of disrespect.
Also, considering that you made a Twitter post about this issue: https://twitter.com/scanlime/status/731538789726748673
I can't help but feel that you're getting something more than technical from this.
This really isn't cool because now you've purposefully thrown someone's project into the court of public opinion. You've put the maintainers in a very uncomfortable position in which they have to make decisions about their project while tiptoeing around issues that people feel very emotional about. You've weighed your personal beliefs over their hard work and have imposed them gratuitously.
If you honestly want to help this project then make a PR; do something that contributes! But making bug reports for things you [probably] don't even use isn't really a bug report. All you've done is brought conflict while serving yourself, and that's incredibly messed up.
@scanlime Let's get something out of the way, heuristics are usually unreliable given the context, however heuristics being used for "guessing race" doesn't mean I'm going to wake up with leprosy when it thinks I'm white.
Just because you find this feature offensive and "broken" doesn't mean others do, they could see a guess as reliable enough to add into their stack. To remove this feature only because it managed to trigger you, would be tailoring to the needs of children masquerading as adults.
By dragging these social justice issues into open source you not only create a toxic environment where nothing gets done (because threads like this occur) you risk putting maintainers under fire because they disagree with you on non-issues.
If you can write a better implementation of the feature that meets everyone's needs, submit a pull request. If you're calling for its removal I ask that you read this because you're contributing nothing.
The next time you try to use race as a platform to get your really crappy point across, ask yourself why you feel the need to be offended on my behalf. Software is going to guess a lot of things about you and it being software can't always be right. The sooner you accept that, the sooner you can contribute something useful.
A reminder though: you don't speak for other people. The people you claim this may offend probably aren't offended, and if you're so upset over this, why haven't you gone after Facebook, Microsoft and Google for their face recognition that guesses if it's a boy or girl?
@scanlime Just dropping by to comment, and yes, I do support GamerGate, but that is immaterial to my objection, which is that this project was designed to use language used by the majority of humanity in a JavaScript environment, and from a code and common sense standpoint, the majority of humanity uses gendered language.
To cast this aside in favor of what you admit is your own personal taste is going against the intended use for many of its projected end users. Forking an alternative that supports the minority this is not intended for is a perfectly acceptable alternative.
Further, I'd like to address some of your tweets on this matter:
https://twitter.com/scanlime/status/731761464491409408
In this one, you disparage a commenter because they do not have many stars on their projects. Frankly, that's immaterial and assumes the popularity of one's projects gives their opinions greater or lesser weight. Stars merely indicate someone's interest in a project; they do not indicate whether the opinions of the submitter are more valuable.
Also, I object to this language:
https://twitter.com/scanlime/status/731780378541719552
For someone who complains about the use of certain language you consider potential slurs in code, you seem to have no reservations about using slurs against those who oppose you, and while gender may or may not be an insult to some, attacking someone's opinion based on whatever non-coding ideology they hold is an insult in every community of which I am aware.
In essence, while I encourage forking this project for the desired use of yourself and those who share your position, I oppose the imposition of your feelings on this project because the majority of humanity still uses the language you find troublesome, and I respectfully suggest making your arguments based more on logic and less on denigration of your detractors might be more conducive to winning over people to your position.
@softwareshouldmaybeguessgender I am frankly shocked to find that neither @scanlime or @nonchip have any activity on their GitHub accounts involving NLP software…
…and?
You've weighed your personal beliefs over their hard work and have imposed them gratuitously.
if "software shouldn't try to do magic" is a personal belief, then yes.
@rubenwardy That's bs, you can easily get the gender from these sentences accurately.
yeah, that's also bs, because then you can just replace the whole library by some function matching 2 easy example sentences returning hardcoded values.
@arcane21 , and from a code and common sense standpoint, the majority of humanity uses gendered language.
whoops, wrong. quoting what I linked above again: "1/4 of the world's languages". also, while humanity more or less manages to get it done, this feature using 2 hardcoded lists certainly doesn't.
This is insane, why this "issue" is even still open is beyond me.
I agree with @Inirit. Gender politics doesn't belong here.
I think outside the scope of this issue, someone somewhere might be able to find a use... Don't use the feature personally if you don't like it.
Voting to close, this is not a software issue, totally off topic 👍
It's not a bug, it's a feature.
@speakeasy No it isn't. This is clearly about removing a feature that might be and apparently is useful to people.
In response to the main thread @scanlime:
People shouldn't guess gender and gendered pronouns. Neither should software. Would you write code to guess someone's race using heuristics?
I don't think it's about whether a library should or should not, it's about that it can and will at the user's discretion. It boils back down to whether people should or should not, because you are the user of the library. Make your own choice with the library... I personally haven't used it enough to know if it lets you make the choice (I just thought this was interesting discussion), and if it doesn't then it should.
My last statement. Thank you.
While this has been clearly stated I would like to offer some insight where all this comes from. This issue is not one of software design but one of ideology. The "gender-neutral language movement"[1] is a prescriptivist group aiming to change languages with natural gender.
An NLP software library is not to be ideologically inspired. It should offer tools for extracting structured and useful information from natural language. As long as English has natural gender and gendered terms, it is useful to extract this information. Prescriptivism is not well regarded in linguistics as being unscientific, normative and very often ideologically biased.
@nonchip I would say the same about MongoDB, but then again you wouldn't see any bug report taken seriously that says "Document-oriented databases shouldn't try to hold data when most data is relational".
I honestly do not think that highly about the algorithm. It obviously will not work in all cases and it is also "experimental". It's blunt enough to deserve some criticism and probably does not solve the problem that it's made to solve.
But that's not the issue at hand.
Let me reiterate the original post in this bug report: "People shouldn't guess gender and gendered pronouns. Neither should software. Would you write code to guess someone's race using heuristics?"
This is an assertion. This does not describe any sort of behavior after the code has been run. Did @scanlime run the code on a list of their friends after seeing this and provide output showing that it was wrong? Was Micah working on a project and ran into a barrier because of this particular part of the library? Bugs imply testing. Was there any testing done with this or did you already make up your mind before you even touch an interpreter?
Also, notice the language of "should", "shouldn't", and "would" instead of "did", "didn't", "has", "hasn't", etc. The words that they used are rooted in opinion. In addition, dragging race into this makes it an especially flammable one.
Contrary to what others are attempting to drag into this conversation: This is not an issue about social justice. This is not an issue about Gamergate. This is not an issue about advertising. This isn't even an issue about gender pronouns. I'm not speaking from a perspective of "Us vs. Them."
Someone opened an issue on a project because they felt that their opinion justified them telling someone else to delete their code. The only perceivable action in which that opener would feel satisfied is if the code was removed, even though that feature was optional to begin with.
What bothers me is that instead of letting code that you believe to be ridiculous fall on its own, you step in from the outside and assert that it should not exist. On Github there is a mass of defunct and non-useful code; it doesn't take much effort to browse through random repos to see the sea of deprecated and unmaintained pet projects and legacy stuff from a different time.
This is the nature of code. It has to justify its existence else it will fall out of use and become forgotten. And, my goodness, out of all people it should be a reverse engineer that should understand that.
So what are you trying to do? What's the end goal? Because it feels an awful lot like you're trying to censor someone else's act of creation based on your opinion. This is Github, a place for social coding, not social issues. No one needs your justification to host their own repo and write their own code.
And I just wanted to close with saying that I honestly think we should be considerate to other peoples' feelings. If someone I associate with wants me to call them something that I do not usually call people (I have a handful of trans friends but haven't met a non-binary one yet) then I'm going to call them what they want because honestly it's not that big a deal to me but I know it means a lot more to them. And I like making this decision because it makes me happy that I'm making a friend happy.
However, I do not like the attitude of shoving ideas down peoples' throats. And as you can see in this thread, it makes people feel resentful especially when many try to make people feel horrible for struggling to adjust to new things. The fact that people are adjusting to new ideas should not be a shock to you. Things might go a bit smoother when you try to be as open minded as you accuse others of not being.
Anyways, everyone please be civil to each other. This is an open issue, not the end of the world. You'll find that we can reach a better understanding when we aren't calling each other names.
When looking on Twitter it seems like the OP's main concern is trying to stir the pot, and her interest is not so much technical, maybe not even social, but mainly focussed on artificially creating outrage. Who knows what her motivation is - maybe it really is making the world a better place, or maybe it has something to do with her linked Patreon page - this still is a topic worth talking about, although not in the form of a GitHub issue.
The meta-conversations regarding this Github "issue" are ranging from borderline hateful to straight up hateful, and if I were the developer of this software, I would feel personally threatened. Even being mostly unaffiliated, that anonymous mob willing to personally attack people over linguistics troubles and scares me.
Full disclosure, being autistic it's not always easy for me to understand the motives of people. Some of you empathize with hypothetical persons that could somehow possibly be offended or slightly inconvenienced by another hypothetical use-case of this language parsing software, not even the software itself. And then those same users are active on Twitter at the very same time they feign their empathy here, personally attacking, insulting and demeaning Github users that did nothing wrong and just created a cool piece of code they wanted to share with everyone. This scares me, it really does. It does not seem like you want to make the world a better place. It seems like you want to attack people and start fights.
Regarding the topic at hand: Gender politics are a complicated field, and nobody here is trying to erase nonbinary genders. This project is merely trying to represent the current mainstream use of the English language, not define it.
This repository is a tool to work with the current state of the English language. Slightly relevant is the Wikipedia Meta Article about "Tendentious editing", especially the Righting Great Wrongs part:
You might think that it is a great place to set the record straight and Right Great Wrongs, but that’s not the case. We can record the righting of great wrongs, but we can’t ride the crest of the wave because we can only report that which is verifiable from reliable and secondary sources, giving appropriate weight to the balance of informed opinion: even if you're sure something is true, it must be verifiable before you can add it.
Wikipedia, as a tool for information, has to abide to the current "mainstream" standard of knowledge, just like a tool for parsing the English language needs to abide to the language's current state to be effective, no matter how "wrong" it may be.
So the first thing to do would be to change the language, then change the code. Not the other way around.
-- some incoherent thoughts from a coder with a unisex name who has used singular they in production code since 2007
@softwareshouldmaybeguessgender there are many people who feel like the SJW:s are starting to have too much influence here, a good example is https://github.com/django/django/pull/2692
Some general observations and pairs of cents.
null instead of a string without offending anyone?
@zahlman
It is not anyone's place to argue that a software feature "doesn't have valid use cases" for another user. People can decide that for themselves
Spot on. Especially since nlp-compromise is not (necessarily) used to identify the gender of a (real) person, but to parse text. A possible use case would for example be to automatically alter a literary work as to replace a male character with a female ("Bob was hungry. He ate an apple." => "Alice was hungry. She ate an apple."), or to understand which character is being referred to ("Bob and Alice had an argument. He wanted to go south, she wanted to go west." Which direction does Bob want to go?).
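The second use case above (working out which character a pronoun refers to) can be sketched in a few lines. This is only an illustration under loud assumptions: `CHARACTER_PRONOUNS` is a hypothetical hand-written table for the characters in one text, not nlp-compromise's data or API, and the "most recent matching mention" rule is the simplest heuristic I could think of, not a real coreference algorithm.

```python
import re

# Hypothetical hand-written lexicon for the characters in this one text.
# It is not the library's data and says nothing about real people.
CHARACTER_PRONOUNS = {"Bob": "he", "Alice": "she"}


def resolve_pronouns(text, lexicon=CHARACTER_PRONOUNS):
    """Map each 'he'/'she' to the most recently mentioned character whose
    lexicon pronoun matches. Pronouns with no candidate map to None."""
    last_seen = {}  # pronoun -> most recently mentioned character name
    resolved = []   # (pronoun, name or None), in order of appearance
    for token in re.findall(r"[A-Za-z']+", text):
        if token in lexicon:
            last_seen[lexicon[token]] = token
        elif token.lower() in ("he", "she"):
            pronoun = token.lower()
            resolved.append((pronoun, last_seen.get(pronoun)))
    return resolved


text = ("Bob and Alice had an argument. "
        "He wanted to go south, she wanted to go west.")
print(resolve_pronouns(text))
```

Note that the lookup is about characters in a text, which is the distinction being made above. A pronoun with no earlier matching mention resolves to `None` rather than to a guess.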
As soon as we have a common, shared way to incorporate more genders into our speech, this repo will be updated to work with those genders too. As it is now, nlp does in no way diminish the existence of nonbinary genders. It is in no way or form used to profile people, merely to detect if written text uses male or female pronouns. And for alternative pronouns like xe or ze there unfortunately isn't anything that remotely resembles a standard, yet.
We will get there, and software will catch up. It would be interesting to get input from actual nonbinary identifying developers/users on this.
This whole issue is just the most inane thing I've ever seen in my life. I can't believe this is not satire.
I'm laughing and crying at the same time holy shit.
@SharkMachine the Django Slave/Master is a completely different issue in a different context and has to be looked at on its own grounds. While the common factor may be people related to the social justice movement, do not make the same mistake they do in attempting to throw race issues and gender issues in the same arena.
I do understand the resentment, and have felt it myself. Do I think this may be related to social justice? Probably. But the criteria of the issue itself is not social justice and it should not be a "winning" point in your argument. You cannot play by unfair rules while accusing someone of doing the same.
You're viewing this as an "Us" vs "Them" thing which does not contribute to the chances that we might have understanding and peace. Remember, the person on the other side of the debate feels wronged as well. Do not make the mistake of demonizing the multidimensional human being that disagrees with you by throwing them in categories.
If you read back, the words "social" and "justice" were thrown by people in opposition to the opened issue first. This is an attempt to shovel a debate / person in a broad category of ideas that have a general "wrongness" or "rightness" in an attempt to simplify it for an easy dismissal. This puts someone in a position in which they have to then defend against multiple axe-grinded issues at once instead of focusing on the topic at hand, or forces the discussion of whether or not they are associated with a certain group.
Hopefully you realize the irony that this is a type of "kafkatrap" as it forces someone to fight against a floating guilt associated with [ insert group you don't like here ]. This street goes both ways and making the same mistake undoes so many opportunities to point out where wrong may be wrong.
If you really want to make someone take on the defense of a nebulous movement or group, let them mention their "membership" first.
Also do keep in mind that there are productive people associated with social justice that do make useful contributions and do not always stir up conflict like this. Ironically, @scanlime is one of them. They make a lot of cool stuff. Have you seen zenphoton? Micah may or may not be an "SJW", but they are also a person with things to offer this community.
Regardless of what you believe in, keep your politics in your pants when it is not appropriate. It is not appropriate at Github. Should this issue never have been opened? I believe so. But do not bring out your own can of worms just because someone brought their own. Discuss only the issue at hand effectively, directly, fairly, and civilly and you very well may make things better for both parties.
The Cycle Of SJW Inclusivity
Rinse & Repeat.
If you can't find a use for gender in a gendered language then I don't think you are thinking hard enough.
For instance, if you are using Google Now, try these queries in voice:
Who is Ringo Starr?
Then ask..
What band was he in?
Google Now will pick up that by using "he" you are referring to a person in your search history. It will search for the first male gendered person in your search history (Ringo Starr) and give you back information on who he is.
Using "they" in this context does not work as it has a plural meaning. If I asked a query before that about multiple people, it adds a layer of ambiguity into the algorithm and makes it less user friendly.
I think the field of HCI needs to move towards being more user friendly, and to do that it has to understand how our language works.
As a consequence to service a majority of people it must identify gender. The only way you can really fix this is with a time machine.
@RRorg no one opposes LGBT representation, spreading that false narrative does nothing but cause further drama into an already drama saturated topic. This is about people calling for the removal of a feature without offering any real alternatives because "it upsets them".
The majority of LGBT people don't care about this (myself included), because they don't use their sexual preference or identity as a symbol to force people to change things they find uncomfortable; nor should anyone. Everyone within my circle who is gay, trans or non-binary cannot fathom why there are so many paper-thin people promoting non-issues. While a serious discussion should be had, I think the main issue here is the fact people want code removed and aren't offering to contribute anything to replace it. At this point though the issue should be locked.
@Codeusa please read the rest of my posts before judging my stance. Still, you cannot deny that there are certain groups on the internet currently exploiting this topic, I have already seen this very thread linked with descriptions like - I quote -
I'm never going to stop making fun of men with girls names and women with boys names. Stfu sjw assholes.
to give just one of the tamer examples. Prejudice is very much alive, unfortunately.
But of course I too feel like a non-issue is being blown out of proportions, this is why I would love to get input from actually affected users that do not fall into the traditional male/female gender definition.
@RRorg I'm not going to deny people in the far right can be unreasonably toxic, nor can you deny people on the far left can be just the same with comments such as
cis fucks are upset their boys club is falling apart
But that is the issue really, radical anything is toxic and those are the types of people who tend to fan the flame on these discussions. When you try to inject personal politics and agendas into software/open source development these are the results you get. They have no place here. It surely doesn't help this thread was brigaded by concern trolls from both sides - but at the end of the day this is a non-issue being painted as one.
The problem is feelings, feelings have no real place in a discussion like this. This is a matter of science, where the facts matter. Not some vague feeling of injustice because something doesn't fit in your world view. I mean, I'm part of the LGBT spectrum as well, so it's not just privileged viewpoint I'm talking from.
I am just gonna quote a 'cis white male' named Bertrand Russell on this because this is just the perfect use case for this quote:
When you are studying any matter, or considering any philosophy, ask yourself only: What are the facts, and what is the truth that the facts bear out. Never let yourself be diverted, either by what you wish to believe, or by what you think would have beneficent social effects if it were believed; but look only and solely at what are the facts. -- Bertrand Russell
Issue is filled with drama and no progress. Issues should be to improve, not to detract. Let's work towards a way of determining more genders and better determining genders based on factors other than names, instead of removing gender guessing altogether.
This issue should be closed in favor of an issue or pull-request that does that.
There are people being taken as sex slaves by ISIS, and you're on Github complaining about a natural language processor using gendered pronouns.
You make software. You could be making something that makes a difference. I'm not talking about the latest way to order a sandwich with your cell phone, I'm talking about things that could really help people, like tools to help people with diabetes keep track of their diets, or help deaf people by automatically converting speech into subtitles. Instead you're complaining about pronouns on GitHub. Get some perspective.
Gamergate: a guy complains on the internet because his girlfriend cheated on him. People like you accuse him of "slut shaming". 4chan finds out and posts ridiculous stuff on twitter to bait you. It works. Now we have a big issue that CNN is reporting on that's based on absolutely nothing in reality, and people like Anita Sarkeesian figure out how to exploit it and make hundreds of thousands of dollars giving talks about it.
Let me tell you exactly what's going to happen:
Someone is going to post this thread on 4chan. Then you're gonna get a pull request every hour for code that does things like tries to guess your gender, and if it's too ambiguous it tells you to kill yourself or starts saying "Hitler did nothing wrong". This issue post is going to be flooded with people saying gender-fluid people should be gassed while every other issue is going to be flooded with complaints that features that have nothing to do with gender are discriminatory.
This is what happens when you have these kinds of discussions on the internet.
I think it's important to realize the intersections between social justice, utility, and marketing.
On a site like Facebook, it would be pointless and even disrespectful in some cases to assume one's gender based on their name. Many times it would be wrong, and it could even be hurtful to the individual if they were trans or didn't identify with the gender binary.
However, I think it's important to realize that marketing DOES care what gender you are, and that marketing to a specific gender is effective. Marketing isn't about promoting social justice - in fact, much of the time it exploits biases and arbitrary social constructs.
Also consider the following - what if I ran a website where 90% of my users were male? I could never figure out how to be more inclusive if not for useful analytics that determined a good estimate of gender.
It's silly to demand this tool be removed. It is useful, just make sure you're using it in the right context.
Consider this: You're complaining about how an open source project (that someone graciously offered for download at a zero price) makes you feel. Furthermore, you feel entitled to make demands about its functionality without (to my knowledge) opening a pull request or forking it yourself. That's a pretty good symptom of how the modern left has collapsed as a moral (or rational) system.
Something like 99.7% of the population identifies as male or female (http://williamsinstitute.law.ucla.edu/research/census-lgbt-demographics-studies/how-many-people-are-lesbian-gay-bisexual-and-transgender/). Even transgender men and women still prefer to be referred to as man or woman. There clearly is a use case for trying to determine gender from language context. Gender is a significant predictor of many qualities about individuals and is resultantly used in many applications. There are many easily observable differences, both behavioral and physical, between the vast majority of men and women. What is the point of marketing tampons to men? There will always be use cases for differentiating by gender because it will always, at a bare minimum, be a significant predictor of physical differences. Of course the word "predict" implies less than 100% accuracy, however, this does not make predictions useless.
This issue is a complete joke. I'd recommend closing it as there is no useful discussion to be had and the obvious trolls are showing up from both sides.
Yes, this thread was a mess from the very start and there's no way it's going to get better. Strongly suggest the project maintainer close and lock this unproductive issue and put it out of mind. It might be a good idea to block the people who proposed and advocated it, as well -- they don't appear to be users of or contributors to this project, and if you leave them be you're just going to get more drama.
OH NO! Gender misinterpretation! As a Third World citizen, I can't help but chuckle at the things some people consider so problematic.
As the first commenter, I just want to say: stop brigading this issue. Pretty much everything has been discussed and I'm honestly mad that people are still liking/disliking my comment because that's not contributing to this. Obviously this is a choice up to the maintainer and all of these new accounts are not helping.
Unsubscribing from issue, I vote to lock and close. I agree with @PaulBGD
hi Micah and others, thanks for your interest and criticism, as well as your patience. The feature was brought in to enable pronoun resolution - which is really handy for interpreting when a subject is mentioned more than once in a text. Luckily, it's the only place in english (i think) where these old-fashioned gender concepts appear.
I agree it would be obtuse to use it, or similar methods, for profiling users, or anything like this.
@vielmetti is also correct in noting that this is a pretty-frequent classification-problem in computer science - and that my implementation is completely half-baked ;) I appreciate that it's a poor subject to handle so coarsely. I didn't anticipate this attention, but understand it.
I'm going to keep the functionality for now, and also close this ticket. I welcome help in giving it a more modern sensibility. I really want to see this project as something universally useful. Thanks again,
Can I recommend changing the documentation to make references to this feature less prominent?
I feel that there are legitimate academic uses for this feature, but that it would be harmful to use in a real-world application.
In this case, I feel like the documentation for this portion of the API belongs under a separate "Experimental" header explaining some of the pitfalls of automatic gender detection, as well as the inherent problems related to assuming binary gender, and other concerns about building gender-specific behavior into applications.
Software developers generally consider it a good practice to warn API consumers about behaviors that might endanger users' privacy. I don't think that it's any different to put a similar disclaimer about the social pitfalls of a feature like programmatic gender detection.
This is a human-factors issue that many well-intentioned developers have not considered. There is absolutely no cost to being a "good citizen," and providing a small disclaimer about the pitfalls of this feature.
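Besides a documentation header, the same disclaimer can be surfaced at call time. The sketch below is one hedged way to do that in Python, using the standard `warnings` module. The `experimental` decorator, the `ExperimentalWarning` class, and the `guess_pronoun` stub are all hypothetical names invented for this example; nothing here is taken from this project's code.

```python
import functools
import warnings


class ExperimentalWarning(UserWarning):
    """Hypothetical warning category for features with known pitfalls."""


def experimental(reason):
    """Decorator that emits an ExperimentalWarning each time the wrapped
    function is called, pointing at the caller's line."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is experimental: {reason}",
                ExperimentalWarning,
                stacklevel=2,
            )
            return func(*args, **kwargs)
        return wrapper
    return decorator


@experimental("heuristic guess that assumes binary gender and is often wrong")
def guess_pronoun(name):
    # Placeholder body: a real implementation is out of scope for this sketch.
    return None
```

Callers who have read the caveats can silence the message with `warnings.simplefilter("ignore", ExperimentalWarning)`, so the cost to informed users stays close to zero while everyone else sees the pitfalls once they call the function.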
People shouldn't guess gender and gendered pronouns. Neither should software. Would you write code to guess someone's race using heuristics?