JohnSmithDev closed this issue 5 years ago
The right author id and name come back from get_definitive_authors(), so it's something in analyse_authors_by_gender()
Getting closer: author_gender.get_author_gender_from_ids_and_then_name_cached(). Have now created a test case that fails.
This calls get_author_aliases(), which returns the 2 names, but with Judith Lessing first. get_author_aliases() does have logic to return the names in resemblance/relevance order, but that only applies if a textual name is provided, and we are passing in a numeric author_id.
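To illustrate the ordering problem, here is a minimal sketch, not the real get_author_aliases(). It assumes an in-memory alias table (the real function presumably queries ISFDB data), uses difflib for the resemblance score, and guesses that ID 3161 maps to this pair of names. The names get_author_aliases_sketch and ALIASES are made up for the example.

```python
from difflib import SequenceMatcher

# Hypothetical in-memory stand-in for the alias data. The ID appears in the
# log output in this issue, but its mapping to these names is an assumption.
ALIASES = {3161: ["Judith Lessing", "Paul Witcover"]}


def get_author_aliases_sketch(author, aliases=ALIASES):
    """Return alias names; sort by resemblance only for textual lookups."""
    if isinstance(author, int):
        # Numeric ID: there is no text to compare against, so the stored
        # order is returned unchanged (pseudonym first in this toy data).
        return list(aliases.get(author, []))
    # Textual name: find the alias group containing it, closest match first.
    for names in aliases.values():
        if author in names:
            return sorted(
                names,
                key=lambda n: SequenceMatcher(None, author, n).ratio(),
                reverse=True,
            )
    return []
```

With this toy data, a lookup by the ID 3161 returns Judith Lessing first, while a lookup by the text "Paul Witcover" returns Paul Witcover first, which matches the behaviour described above.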
Perhaps get_author_gender_from_id_and_then_name should use the passed name first, and only if that fails, use the aliases? That function used to support multiple IDs being passed (which was something I probably never did in practice), but now it only accepts a single ID, and we can assume that the ID being passed in is probably the best one?
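A rough sketch of that idea, under stated assumptions: the function name, the lookup callable and the toy first-name table below are all hypothetical and stand in for whatever the real code uses to derive a gender from a name.

```python
def gender_from_name_then_aliases(name, aliases, lookup):
    """Try the passed-in name first; fall back to aliases only if that fails.

    `lookup` is a hypothetical callable returning "M", "F" or None.
    Returns a (gender, name_used) tuple, or (None, None) if nothing matched.
    """
    candidates = ([name] if name else []) + [a for a in aliases if a != name]
    for candidate in candidates:
        gender = lookup(candidate)
        if gender is not None:
            return gender, candidate
    return None, None


# Toy lookup standing in for a first-name gender source (assumed data).
TOY_GENDERS = {"Paul": "M", "Judith": "F"}


def toy_lookup(full_name):
    """Look up the gender of the first token of a full name."""
    return TOY_GENDERS.get(full_name.split()[0])


result = gender_from_name_then_aliases(
    "Paul Witcover", ["Judith Lessing", "Paul Witcover"], toy_lookup
)
```

Here the alias list still has the pseudonym first, but the passed name wins, so result is ("M", "Paul Witcover"). The aliases are only consulted when the passed name gives no answer.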
Amazing - now that I provide the correct issue number in the commit message, I seem to have used a format that hasn't been picked up.
One minor niggle - I'm curious why this was doing the right thing a week ago for the initial launch of the gender project, and what exactly changed in the meantime to cause this regression.
Noticed by chance when comparing an older chart for Tiptree award with my new code.
In 1997 Paul Witcover was listed as a Tiptree finalist/nomination:
http://www.isfdb.org/cgi-bin/ay.cgi?43+1997
His personal page implies that Paul Witcover is his real name, and that he used Judith Lessing as a pseudonym.
However, it seems the latest code is picking up the pseudonym:
(book_scraping) isfdb_tools $ ./award_gender_report.py -W "James Tiptree, Jr. Award" -C "Gender-bending SF" -y 1997
...
WARNING:root:No Twitter link(s) for Paul Witcover
1997 : F : Paul Witcover : human-names:Judith Lessing
Curiously the author_gender.py script gets the right answer:
(book_scraping) isfdb_tools $ ./author_gender.py -A "Paul Witcover"
WARNING:root:No Twitter link(s) for ['Paul Witcover']
WARNING:root:Not able to get gender using author_ids [3161, 108589] (ref=['Paul Witcover']) - will try to get gender from name instead
M (source: human-names)
The awarded work seems to have only ever been credited to Paul Witcover:
http://www.isfdb.org/cgi-bin/title.cgi?8616