internetarchive / openlibrary

One webpage for every book ever published!
https://openlibrary.org
GNU Affero General Public License v3.0

How should pseudonyms be handled? #1594

Closed seabelis closed 5 years ago

seabelis commented 5 years ago

I'm wondering how pseudonyms should be handled. I know there is a way to add an alternate name to an author's profile, but that seems to be for misspellings, foreign spellings, format variants, etc. Those do not seem to be searchable from the author field. So I'm wondering whether a legitimate pseudonym (e.g. Richard Bachman for Stephen King or Ellis Bell for Emily Brontë) should be listed as a second author on relevant works or only as an author alias.

LeadSongDog commented 5 years ago

One traditional option is for the by_statement to show "Stephen King writing as Richard Bachman". Another is for the work to be attributed to the pseudonym, but redirect the pseud to the main.

https://www.wikidata.org/wiki/Q3495759 (Richard Bachman) has property P31 (instance of) Q127843 (pen name) qualified by property P642 (of) with value Q39829 (Stephen King). https://www.wikidata.org/wiki/Q39829 (Stephen King) has properties P1477 birth name (value Stephen Edwin King) and P742 pseudonym (value Richard Bachman).
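To make that relationship concrete, here is a minimal sketch that follows the pen-name item back to the person. The `ITEMS` dict is a hand-built simplification of the statements quoted above, not the real Wikidata JSON format, and `person_behind` is a hypothetical helper name.

```python
PEN_NAME = "Q127843"  # Wikidata item "pen name", as cited above

# Hand-built simplification of the statements quoted above (an assumption,
# not the actual Wikidata data model or API response).
ITEMS = {
    "Q3495759": {
        "label": "Richard Bachman",
        "P31": [{"value": "Q127843", "qualifiers": {"P642": ["Q39829"]}}],
    },
    "Q39829": {
        "label": "Stephen King",
        "P1477": ["Stephen Edwin King"],
        "P742": ["Richard Bachman"],
    },
}


def person_behind(qid, items=ITEMS):
    """Return the QID of the person a pen-name item points to, else None."""
    for claim in items.get(qid, {}).get("P31", []):
        if claim.get("value") == PEN_NAME:
            targets = claim.get("qualifiers", {}).get("P642", [])
            if targets:
                return targets[0]
    return None


print(person_behind("Q3495759"))
```

The point is only that the link runs both ways: the pen-name item points at the person, and the person item lists the pseudonym as a plain string.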

http://www.isni.org/isni/0000000120213263 shows the identity "Richard Bachman" (rather than the author using that pseud), and it lists under related names "King, Stephen (1947-; see also from)". The complementary http://www.isni.org/isni/0000000121446296 shows "Bachman, Richard (Pseudonym)" along with "Stephen King", "Στίβεν Κινγκ" and numerous variations.

In any event, the various entries under "alternate name" on an OLxxxA record are supposed to be searchable, though there have at times been problems doing so.
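As a rough illustration of what "searchable" would mean here, the sketch below matches a query against both the primary name and the alternate names. It assumes an author record shaped like a dict with `name` and `alternate_names` keys; `matches_author` is a hypothetical helper and this is not how Open Library's search is actually implemented.

```python
def matches_author(record, query):
    """True if the query occurs in the record's name or any alternate name."""
    needle = query.casefold()
    names = [record.get("name", ""), *record.get("alternate_names", [])]
    return any(needle in name.casefold() for name in names)


# Illustrative records only; not copied from the live site.
without_alias = {"name": "Stephen King", "alternate_names": []}
with_alias = {"name": "Stephen King", "alternate_names": ["Richard Bachman"]}

print(matches_author(without_alias, "bachman"))  # False
print(matches_author(with_alias, "bachman"))     # True
```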

seabelis commented 5 years ago

https://openlibrary.org/search?q=title%3Athinner+author%3Abachman&mode=everything
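For anyone wanting to repeat the test with other names, the URL above can be rebuilt from its parts. This is a small sketch using only the standard library; the `title:` and `author:` prefixes and the `mode` parameter are taken from the URL itself, and `build_search_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode


def build_search_url(title, author, base="https://openlibrary.org/search"):
    """Build a fielded search URL of the same shape as the one above."""
    query = f"title:{title} author:{author}"
    return f"{base}?{urlencode({'q': query, 'mode': 'everything'})}"


print(build_search_url("thinner", "bachman"))
```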

tfmorris commented 5 years ago

Historically, OpenLibrary has broken with traditional library cataloging practice by merging pseudonyms as alternate names for the actual author (much as it uses normal name order instead of inverted).

You could probably find documentation of this in one of the (many) email archives. ol-tech might be a good starting point.

tfmorris commented 5 years ago

Here's one comment that I wrote in 2010: https://www.mail-archive.com/ol-tech@archive.org/msg00082.html. It's not 100% on pseudonyms, but pretty close.

It's pretty easy to be for or against Richard Bachman, but the discussion quickly becomes more complex when one considers Nicolas Bourbaki, house authors (e.g. romance novel industry), etc.

seabelis commented 5 years ago

I suppose I'm looking at it from the perspective of the end user's ability to find the book by either name. It seems it should be possible. It makes perfect sense to use an alias if it's searchable. Right now it is not.

LeadSongDog commented 5 years ago

The reason that query didn't find it was that https://openlibrary.org/authors/OL413806A/Richard_Bachman does not list Alternate names.

seabelis commented 5 years ago

I see. So it's the pseudonym profile that must have the author's real name as an alias. I was expecting otherwise.

tfmorris commented 5 years ago

Actually, under our current model, the two records should just be merged, which would take care of listing the "losing" name as an alternate.
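A minimal sketch of that merge behaviour, assuming author records are plain dicts with `name` and `alternate_names` keys. `merge_authors` is a hypothetical helper, not the site's merge code, and it leaves out everything else a real merge has to do (redirecting the losing record, reassigning its works).

```python
def merge_authors(winner, loser):
    """Return a new record: winner's name kept, loser's names become alternates."""
    alternates = list(winner.get("alternate_names", []))
    for name in [loser["name"], *loser.get("alternate_names", [])]:
        # Skip the winning name itself and anything already listed.
        if name != winner["name"] and name not in alternates:
            alternates.append(name)
    return {**winner, "alternate_names": alternates}


# Illustrative records only.
king = {"name": "Stephen King", "alternate_names": ["Stephen Edwin King"]}
bachman = {"name": "Richard Bachman"}

merged = merge_authors(king, bachman)
print(merged["alternate_names"])  # ['Stephen Edwin King', 'Richard Bachman']
```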

LeadSongDog commented 5 years ago

The problem with the current model is that after that merger, the OL record leaves no hint which name was used on the physical book, and the one shown may not agree with the MARC originally imported. We need to capture the "written as" pseud at least in the edition notes or, better, in the by_statement. Of course it would help if the by_statement were being displayed, but that's a different issue (#777).
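One way to picture the "written as" capture: build the by_statement from the merged author's name plus the name printed on the book. This is only a sketch under assumptions; `by_statement_for` is a hypothetical helper, and the wording follows the "writing as" form suggested earlier in this thread.

```python
def by_statement_for(author_name, name_on_book):
    """Build a by_statement that keeps the name printed on the edition."""
    if name_on_book and name_on_book != author_name:
        return f"{author_name} writing as {name_on_book}"
    return author_name


print(by_statement_for("Stephen King", "Richard Bachman"))
# Stephen King writing as Richard Bachman
```

Stored on the edition, a string like this would survive an author merge, since the merge only touches the author record.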

seabelis commented 5 years ago

Is "Stephen King And Peter Straub" really a valid "alternate name" within the context of OL or should that be removed? These are two different writers; it's not really an alternate to "Stephen King."

LeadSongDog commented 5 years ago

No, it is not valid. We might learn something from examining the record that was originally imported. In one case: https://openlibrary.org/show-records/amazon:0007100426 shows one conflation of the two names, but Amazon had lots of them in 2008.

seabelis commented 5 years ago

Okay, so deletion of such things is okay. I wasn't quite sure if this field is meant to be used for any and all possible ways the name might be stated or for actual valid alternates such as foreign spellings.

LeadSongDog commented 5 years ago

Just as long as there isn’t any record linked to it (book or work). Try a general search on the OLxxxA field first to check.

tfmorris commented 5 years ago

The author merges which created v16 & v17 merged in conflated author records. Those records should have been deleted and the works edited to replace the authors with the correct ones.

Deleting the bad names is perfectly acceptable and probably the easiest/best solution at this point.

BrittanyBunk commented 2 years ago

I wish there was a specific box on an author's page for their pseudonym. On the author field of a book, there should be a form field for aliases.

BrittanyBunk commented 2 years ago

@LeadSongDog by statement wouldn't work - that's something else

LeadSongDog commented 2 years ago

"By Samuel Clemens, writing as Mark Twain" would seem quite an appropriate use. What am I missing?

BrittanyBunk commented 2 years ago

@LeadSongDog a by statement and a pseudonym are two different things. Unfortunately, when I went to google what a by-statement is, ironically it gave me the webpage for issue #777 in the Open Library GitHub.

I don't have a definition from the internet, but my understanding is a by statement is usually a note or dedication, like 'for my sister' or something. A by-statement has too many variations, so it doesn't work.

Another thing that doesn't work is how you wrote it: "By Samuel Clemens, writing as Mark Twain". This is very ambiguous. From this statement, at face value, without knowing the answer, I wouldn't know which is the pseudonym and which is the name of the writer.

Also, judging by issue 777, it seems like there were issues with the by statement form field. It's also not where the authors' names are located, so if there are multiple authors it's hard to find, and it's kind of something seen only on the librarian end.

That's why you'd need a pseudonym form field next to the author's name itself - so it's clear. People shouldn't have to write it in like you have it - that's too much variation and ambiguity for anyone to understand - in my opinion of course. I just know I wouldn't understand it or know where to find it, if I can at all.

You see what I'm saying?

I guess one more that's the 'nail in the coffin' is that if no search engine has a clear definition of what a 'by statement' is, should we really be using it? I found https://quizlet.com/question/which-statement-is-true-regarding-a-by-statement-in-a-reporting-procedure-such-as-proc-print-3614346001332026943 , but I don't think that's the definition used here.

Sorry if I put too many reasons and overloaded the conversation here - you just asked and I answered what I know.

LeadSongDog commented 2 years ago

A by statement is OLspeak for a "Statement of Responsibility". Anywhere else, it reflects at least the work’s author(s) as shown in MARC 245$c, and possibly contributors to a particular manifestation (at the cataloguer’s discretion).

As Tom noted, OL takes a nonstandard approach to pseudonyms, combining all identities of a person into one, irrespective of the one used on a particular work.

BrittanyBunk commented 2 years ago

@LeadSongDog As I see it, it seems that the by statement is the location of pseudonyms according to @seabelis in #2654. To quote: "The byline is useful as the author's name on the edition is not always the same as on the work." This is a really inappropriate location, simply because it's not labeled as such.

I don't mind that all the identities of a person be lumped as one. That's fine - that's what the author page is for. The issue is the differentiation on the edit pages of editions associated with the by statement field.

What's really sad is all the places in github where I already mentioned these issues were closed - so they weren't worked on. Now people are bringing them up again, because they're still causing problems. The issue of pseudonyms was mentioned by me, for instance in #2786.

My solution:

LeadSongDog commented 2 years ago

Can’t agree. The source record's statement of responsibility captures the attributions on t.p. or verso. We should preserve that as a key characteristic of the edition, if for no more reason than to determine which records are redundant. That’s why I raised #777 four years ago after the unfortunate introduction of the so-called canonical page form.

The pseuds sometimes used are a side issue.

BrittanyBunk commented 2 years ago

@LeadSongDog we can agree to disagree on the by statement's status on the website, but at the very least we can continue to improve the handling of pseudonyms, as we both agree - the by statement is a tangential issue to that.