plk / biblatex

biblatex is a sophisticated bibliography system for LaTeX users. It has considerably more features than traditional bibtex and supports UTF-8

Feature request: Alternative author names? #1094

Open alexreg opened 3 years ago

alexreg commented 3 years ago

I was wondering what your thoughts would be on adding support for alternative versions of an author's name. This is admittedly not a super common use case, but it might come in handy when, e.g.,

  • An author publishes some works before a change of name and some after (whether a change of first name or maiden name vs. married name).
  • One publication uses middle initials for an author's name while another doesn't.
  • Pseudonyms and pen names are involved.

In some cases it may need to be indicated which names are alternatives/aliases for other names (referring to the same person, that is) whereas in other cases this would be unnecessary (e.g. just middle initials missing).

The main practical relevance of this feature, I imagine, would be to control unique citations ("citation counter"), grouping of bibliographic entries, and in some cases showing extra text elaborating on the real name corresponding to a pseudonym / pen name.

pauloney commented 3 years ago

And how would you deal with two names for the same author where the author does NOT want to be recognized as the same person, as for example after a change of sex?

Paulo Ney

alexreg commented 3 years ago

That's neither here nor there. The importance of this suggestion is to allow the possibility of identifying authors as the same person despite different names (implicitly through citation style, or explicitly through bibliographic notes).

moewew commented 3 years ago

Agreed. It doesn't really matter if there are situations where you explicitly don't want this: You could choose not to use the additional features in that case. There is also the philosophical issue of how you know that two slightly different names refer to the same person and the bibliographic question of whether or not you should try to stick to the exact name format as given in the publication you cite or should try to strive for maximum consistency in your document (grouping pen names together with the real name may confuse readers who don't know about the relation). But we can probably ignore that for now and just assume that some people would find such a feature useful and would want to use it.

The much more important question is what the interface should look like. I'm guessing you could actually get quite far with sortname, the extended name format or name/field annotations already, but I'm guessing some people would find that too clunky. I'm at a loss to know what a "good" interface would look like.

Just to show what is possible at the moment, here is one way to normalise away a middle name initial for sorting and citations (the different forms will still show in the bibliography):

\documentclass[british]{article}
\usepackage[T1]{fontenc}
\usepackage{babel}
\usepackage{csquotes}

\usepackage[backend=biber, style=authoryear]{biblatex}

\DeclareSortingTemplate{nyt}{
  \sort{
    \field{presort}
  }
  \sort[final]{
    \field{sortkey}
  }
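  \sort{
    \field{sortname}
    % shortauthor is consulted before author, so belk sorts as "Anne Elk"
    \field{shortauthor}
    \field{author}
    % likewise, shorteditor is consulted before editor
    \field{shorteditor}
    \field{editor}
    \field{translator}
    \field{sorttitle}
    \field{title}
  }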
  \sort{
    \field{sortyear}
    \field{year}
  }
  \sort{
    \field{sorttitle}
    \field{title}
  }
  \sort{
    \field{volume}
    \literal{0}
  }
}

\begin{filecontents}{\jobname.bib}
@book{belk,
  author    = {Anne B. Elk},
  shortauthor = {Anne Elk},
  title     = {A Theory on Brontosauruses},
  year      = {1972},
  publisher = {Monthy \& Co.},
  location  = {London},
}
@book{elk,
  author    = {Anne Elk},
  title     = {A Theory on Brontosauruses},
  year      = {1972},
  publisher = {Monthy \& Co.},
  location  = {London},
}
\end{filecontents}
\addbibresource{\jobname.bib}
\addbibresource{biblatex-examples.bib}

\begin{document}
Lorem \autocite{sigfridsson,elk,belk}

\printbibliography
\end{document}

pauloney commented 3 years ago

And how do we deal with two different people that have exactly the same name? There are plenty of examples...

I think the only way to deal with this problem will be to assign an ID to each author in the source bib.

That would even allow us to treat the case of an author that publishes under two different names and one needs to preserve the name used in the citation. Like for example a "Marco" who is now a "Teresa" and wants it known.

Paulo Ney

moewew commented 3 years ago

All good points. But, @pauloney, forgive me if I misrepresent your intentions: I initially interpreted your first comment to mean that you think this request is, in general, not a good idea. From what I read now, though, I'm not so sure any more. If one can come up with a good interface to address this (and that is a big if, no doubt), ideally the solution would 'cut both ways': you would be able to group together different names for one person and 'split' one name shared by several people (how useful or usable this is for your readers is a question you'd have to address at some point). A badly designed system that can only deal with a fraction of real-world cases is probably not going to be very helpful, but if we can come up with a good system that is fully backwards compatible, works for most of the use cases that may come up in the wild (including the cases you mentioned) and has an at least somewhat usable interface, it may be helpful.

pauloney commented 3 years ago

I definitely think that it is an excellent idea. We dance around this issue all the time in our production line and make choices which are not always smart, like for example changing the name of the author as it appears in the original publication.

To give you an idea of the vastness of the problem, today we are dealing with the citation to a paper by

  M. M. Peixoto & M.M. Peixoto

in which one of them is Marília Matos and the other is Maurício Matos, and NO one knows which is the first and which is the second.

Knowing exactly which one is which is important for things like searching for other papers by the same author, a feature that will be present on the web page of the paper and whose data is derived from the files produced by Biber from the BibTeX source.

I think we should have had this implemented years ago ... but it needs to be done carefully and, if possible, encompass all the cases, unlike the language/transliteration case, where we still do not have a correct way to cite certain types of work.

We should list all the problems first and make sure we come up with a framework that can address all of them and possibly be extended in the future.

Paulo Ney

alexreg commented 3 years ago

Sorry for the very slow reply!

@moewew

> Agreed. It doesn't really matter if there are situations where you explicitly don't want this: You could choose not to use the additional features in that case. There is also the philosophical issue of how you know that two slightly different names refer to the same person and the bibliographic question of whether or not you should try to stick to the exact name format as given in the publication you cite or should try to strive for maximum consistency in your document (grouping pen names together with the real name may confuse readers who don't know about the relation). But we can probably ignore that for now and just assume that some people would find such a feature useful and would want to use it.

Yes, this sounds reasonable. We could at least start by offering the option of real_name / display_name for an author, which would make biblatex group both bibliographic entries and citations according to that name and not the published name (which would be input using the current syntax, I suppose). This would override the usual name as far as grouping, sorting and the principal way in which the name is displayed are concerned, though I would imagine we would at least want the option of displaying the published (original) name in brackets in the bibliographic entry. Further customisation of these things could always come later.

> The much more important question is what the interface should look like. I'm guessing you could actually get quite far with sortname, the extended name format or name/field annotations already, but I'm guessing some people would find that too clunky. I'm at a loss to know what a "good" interface would look like.

I believe I'm thinking of the same or a very similar interface (although I probably prefer real_name / display_name over sort_name... small matter). The existing extended name format is probably a good way to do this. One could potentially extend it like this:

author = {given = Hans, family = Harman, real_given = Johannes}
author = {given = Hans, family = Harman, real_full = Johannes Schmidt}
author = {full = Hans Harman, real_given = Johannes}
author = {full = Hans Harman, real_full = Johannes Schmidt}

@pauloney

This is a good point about individuals with the same surname and identical initials (or perhaps even identical names if you're really unlucky, though I think there are other solutions in that rare case, like suffixes). I'm not sure the suggestion here is actually the best way to solve this problem, but it could be used that way. In an ideal world, biblatex would detect ambiguous names and differentiate by automatically expanding a first or middle name where necessary (in all appearances throughout the document, I would think).

plk commented 1 year ago

I think all the mechanisms to do this are already present. Declare some new nameparts with:

\DeclareDatamodelConstant[type=list]{nameparts}{prefix,family,suffix,given,realgiven,realfamily}

then declare sorting schemes using these new nameparts and adapt a style for printing the new nameparts in whatever format you need. This is not really a very common requirement for most users and so it could just be an extension you add in to your own workflow.
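As a rough, untested sketch of that workflow: as far as I know, the data model declaration has to live in a .dbx file that is loaded with the datamodel option, and the sorting side can then be handled with \DeclareSortingNamekeyTemplate. The file name realname.dbx, the entry key and the entry's title and year are hypothetical placeholders, and the ordering of the key parts is just one possible choice, not a recommendation.

```latex
% Hypothetical data model file; the name "realname" is an assumption.
\begin{filecontents}[overwrite]{realname.dbx}
\DeclareDatamodelConstant[type=list]{nameparts}
  {prefix,family,suffix,given,realgiven,realfamily}
\end{filecontents}

\documentclass{article}
\usepackage[backend=biber, style=authoryear, datamodel=realname]{biblatex}

% Sort on the real name parts first where they are given;
% the published name parts follow in the standard order.
\DeclareSortingNamekeyTemplate{
  \keypart{\namepart{realfamily}}
  \keypart{\namepart{realgiven}}
  \keypart{\namepart[use=true]{prefix}}
  \keypart{\namepart{family}}
  \keypart{\namepart{given}}
  \keypart{\namepart{suffix}}
  \keypart{\namepart[use=false]{prefix}}
}

\begin{filecontents}[overwrite]{\jobname.bib}
@book{harman,
  author = {given=Hans, family=Harman, realgiven=Johannes, realfamily=Schmidt},
  title  = {A Placeholder Title},
  year   = {2000},
}
\end{filecontents}
\addbibresource{\jobname.bib}

\begin{document}
Lorem \autocite{harman}

\printbibliography
\end{document}
```

With this, only sorting is affected. Printing the real name (for example in brackets after the published name) would still need an adapted name format, where the new parts should be available in the same way as the standard ones.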

alexreg commented 1 year ago

@plk Thanks for your reply. That's good to know. I am trying to refresh my memory as to my exact use case (since it was so long ago), but I suspect you are right in that this should solve at least many of the cases I proposed.

plk commented 1 year ago

A large component that was missing for solving cases like this in general was name hash customisation. This is now implemented in biblatex 3.20 DEV (requiring biber 2.20 DEV), both on SourceForge. See section 4.11.5 of the documentation, on name identity. There are now two ways to address this, one of them aimed more at the discussion here, where the names may change radically. The solutions allow biblatex to treat any names as "the same" for hashing purposes, which determines a lot of things like citation compression, extra* generation etc. See also the examples and discussion in this issue: #1274

plk commented 1 year ago

For example, for the M. M. Peixoto & M.M. Peixoto case there is now (currently in DEV) a facility in the extended name format to provide IDs, which are used to override name-derived data for hash generation:

AUTHOR = {id=peixoto1, given={M. M.}, family=Peixoto and id=peixoto2, given={M. M.}, family=Peixoto}

Now, each author is treated as a different person for things like citation compression, extra label in citation, dash elisions in the bibliography etc. Do try it out and let us know if there is anything that doesn't work as expected.
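A small test document for trying this out might look like the following sketch. It assumes biblatex 3.20 / biber 2.20 or later; the entry keys, titles, journal and years are hypothetical placeholders and not real publications.

```latex
\documentclass{article}
% Assumes biblatex 3.20 / biber 2.20 or later, where id= is available
% in the extended name format.
\usepackage[backend=biber, style=authoryear]{biblatex}

\begin{filecontents}[overwrite]{\jobname.bib}
@article{peixoto:joint,
  author       = {id=peixoto1, given={M. M.}, family=Peixoto and
                  id=peixoto2, given={M. M.}, family=Peixoto},
  title        = {Placeholder Joint Paper},
  journaltitle = {Placeholder Journal},
  year         = {2000},
}
@article{peixoto:a,
  author       = {id=peixoto1, given={M. M.}, family=Peixoto},
  title        = {Placeholder Paper A},
  journaltitle = {Placeholder Journal},
  year         = {2001},
}
@article{peixoto:b,
  author       = {id=peixoto2, given={M. M.}, family=Peixoto},
  title        = {Placeholder Paper B},
  journaltitle = {Placeholder Journal},
  year         = {2001},
}
\end{filecontents}
\addbibresource{\jobname.bib}

\begin{document}
Lorem \autocite{peixoto:joint,peixoto:a,peixoto:b}

\printbibliography
\end{document}
```

Comparing the output with a copy of the file in which the id= parts are removed should show which citation labels and bibliography dashes depend on the IDs.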