JabRef / jabref

Graphical Java application for managing BibTeX and biblatex (.bib) databases
https://devdocs.jabref.org
MIT License

Author initials not followed by dashes are trimmed by the Authors formatter #9384

Open pedrogit opened 1 year ago

pedrogit commented 1 year ago

JabRef version

Latest development branch build (please note build date below)

Operating system

Windows

Details on version and operating system

JabRef 5.8--2022-11-17--e6fc296 Windows 10 10.0 amd64 Java 18.0.2.1 JavaFX 19+11

Checked with the latest development build

Steps to reproduce the behaviour

1 - Go to "Options->Entry preview", click on "Customized preview style". Edit it and replace all the preview code with:

\begin{author}{\format[Authors(LastFirst,InitialsNospace,FullPunc,And)]{\author}}\end{author}

Make sure "Show preview as a tab in entry editor" is not checked.

2 - Create a new library and a new article. Assign "First-Firstname Secondname" to the author field. Hit F9 one or more times to display the reference in "Customized preview style". It should display "{Secondname, F.-F.}". The "InitialsNospace" option works fine and the first name (First-Firstname) is abbreviated properly.

3 - Now replace "First-Firstname Secondname" in the "Author" field with "Secondname, F.-F.". The first name, which contains a dash (-), is already abbreviated, and the "InitialsNospace" option correctly displays "{Secondname, F.-F.}", which is also fine.

4 - Now replace "Secondname, F.-F." in the "Author" field with "Secondname, F.F.". The first name is already abbreviated from "First Firstname" (without a dash, which is very common in Latin languages) to "F.F.", so there is no need to re-abbreviate it. The second initial should be kept ("Secondname, F.F.") since it is part of the already abbreviated first name. Instead, the formatter trims it.

Appendix

Documentation about how to create/modify the Customized preview style is here:

https://docs.jabref.org/collaborative-work/export/customexports

ThiloteE commented 1 year ago

As I understand it, it is expected that firstnames should be trimmed to F., because of InitialsNospace. InitialsNospace specifies how the author names are abbreviated: as Initials, with any spaces between initials removed. Naively I then would expect that "First-Firstname" would be abbreviated as "F.".

Of course languages are sometimes very weird and have many exceptions. In German we have some double names like "Klaus-Dieter", so trimming those names to K.-D. would probably be correct; therefore the example you give, F.-F., might be correct as well, albeit somewhat out of the ordinary. Naively, I would also have expected F.-F. to be trimmed to F.

Since this does not happen, what we can learn from this is that separators like - are taken into account by JabRef during abbreviation, and if there is a separator, JabRef treats what comes after it as a new first name.
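To make the behaviour described above concrete, here is a minimal sketch. It is not JabRef's code: `abbreviate_first_names` is a hypothetical function that only models what this thread reports, assuming that whitespace and hyphens start a new name part while a dot does not.

```python
import re


def abbreviate_first_names(first: str) -> str:
    """Abbreviate a first-name string to initials, without spaces.

    Hypothetical model of the reported behaviour, not JabRef's code:
    whitespace and hyphens start a new name part, a dot does not.
    """
    # Split on whitespace; the capturing group keeps hyphens as tokens.
    # Whitespace matches leave None entries, which the filter drops.
    tokens = [t for t in re.split(r"(-)|\s+", first.strip()) if t]
    out = []
    for tok in tokens:
        if tok == "-":
            out.append("-")           # keep the double-name separator
        else:
            out.append(tok[0] + ".")  # keep only the first character
    return "".join(out)


for name in ("First-Firstname", "F.-F.", "F.F.", "Klaus Dieter"):
    print(name, "->", abbreviate_first_names(name))
```

Under these assumptions "F.F." is a single token, so only its first letter survives, which matches the trimming reported in this issue.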

If we stick with the example of Klaus-Dieter, just imagine what K.D. would really mean unabbreviated: that would be Klaus.Dieter., which, if it were a real name, would go against most naming conventions. There IS a double-name separator, namely the ., but this is a very unusual double-name separator. I see no point in bending over backwards changing JabRef code just to introduce an exception to InitialsNoSpace, whose sole purpose is to abbreviate names... The way to fix this would be to create a rule for JabRef to detect dots . as double-name separators, which might lead to unintended consequences when people enter their titles, e.g. Dr. or Prof. (there is also et al.), into the name field...

Having done a short search for names on the web, when I enter . into https://www.momjunction.com/baby-names/search/ it does not find a single name with a dot, whereas I can find ~3500 with -. Looking at https://www.ssa.gov/oact/babynames/limits.html, there are no names with - or . either.
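A rough sketch of the "dots as separators" rule and its side effect, for illustration only. `initials_with_dot_separator` is a hypothetical function, not part of JabRef, and it assumes that whitespace and dots both end a name part while hyphens are kept.

```python
import re


def initials_with_dot_separator(first: str) -> str:
    """Hypothetical rule: whitespace and dots both end a name part."""
    # The capturing group keeps hyphens; runs of dots or whitespace
    # are dropped. Empty strings and None entries are filtered out.
    parts = re.split(r"(-)|[\s.]+", first.strip())
    return "".join(p if p == "-" else p[0] + "." for p in parts if p)


# Already abbreviated names now survive ...
print(initials_with_dot_separator("F.F."))          # F.F.
print(initials_with_dot_separator("Klaus-Dieter"))  # K.-D.
# ... but a title in the name field turns into a bogus initial.
print(initials_with_dot_separator("Dr. Klaus"))     # D.K.
```

The last call shows the kind of unintended consequence mentioned above: with this rule, "Dr." cannot be told apart from an abbreviated first name.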

Anyway, my personal opinion is that this issue is a severe edge case that JabRef maintainers should put no effort into fixing. There are way more important issues around. Sorry :/

pedrogit commented 1 year ago

I edited my first description as there was, I think, a misunderstanding.

The "InitialsNoSpace" option abbreviates first names like "Klaus Dieter" to K.D., which is correct and conforms to the author lists of most journals. The problem is that it re-abbreviates already abbreviated first names.

But in my example F.F. is already abbreviated. K.D. would be the abbreviation of Klaus Dieter without a dash. Such first names are very common in some Latin languages (and in English as well). The "InitialsNospace" option should not re-abbreviate K.D. since it is already abbreviated.

I guess it must be a simple omission in some regular expression. If you point me to the code responsible for the "InitialsNoSpace" option I could propose a fix (I'm very good with regular expressions!).
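As a sketch of what such a regular-expression check could look like (an assumption for illustration, not the actual JabRef implementation): the hypothetical helper `keep_if_abbreviated` recognises first names that consist only of single-letter initials, each followed by a dot, and returns them with spaces removed. Anything else returns `None`, so a caller could fall back to the normal abbreviation.

```python
import re
from typing import Optional

# One initial: a single letter (any script) followed by a dot.
_INITIAL = r"[^\W\d_]\."
# A run of initials, optionally joined by spaces and/or one hyphen.
ALREADY_ABBREVIATED = re.compile(rf"{_INITIAL}(?:\s*-?\s*{_INITIAL})*")


def keep_if_abbreviated(first: str) -> Optional[str]:
    """Return the initials without spaces if `first` is already
    abbreviated, otherwise None (hypothetical helper)."""
    first = first.strip()
    if ALREADY_ABBREVIATED.fullmatch(first):
        return re.sub(r"\s+", "", first)
    return None


for name in ("F.F.", "F. F.", "F.-F.", "First Firstname", "Dr."):
    print(repr(name), "->", keep_if_abbreviated(name))
```

Because an initial is limited to one letter, "Dr." and "Prof." are not treated as abbreviated first names in this sketch.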

Siedlerchr commented 1 year ago

@pedrogit Thanks for your interest. I think the relevant classes are probably here: https://github.com/JabRef/jabref/blob/main/src/main/java/org/jabref/logic/layout/format/Authors.java The options are handled there.

There is also a Unit test: https://github.com/JabRef/jabref/blob/aeddfd085136d515864d46c5e54313ad3f55d527/src/test/java/org/jabref/logic/layout/format/AuthorsTest.java#L155-L168

(If you want to test it locally, we recommend setting up the workspace as described here https://devdocs.jabref.org/getting-into-the-code/guidelines-for-setting-up-a-local-workspace)

svenjaeger commented 1 year ago

In BibTeX it is not intended to separate multiple names only with a dot. The BibTeX and biblatex styles that print K.D. as K.D. do so only because they are designed not to abbreviate names. All abbreviating styles I know would print K.D. only as K., while K. D. would be printed as K.D. or K. D., depending on the style. I am against being more generous in JabRef here because this would create the expectation that a similar result would be achieved by BibTeX. I don't think it's unreasonable to expect users to enter a space here.
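The difference between the two inputs can be illustrated with a simplified model. This is not BibTeX's actual name parser; `bibtex_like_initials` and its `joiner` parameter are hypothetical, and the only assumption is the one stated above: whitespace separates name parts, a dot does not.

```python
def bibtex_like_initials(first: str, joiner: str = "") -> str:
    """Simplified model: only whitespace separates name parts.

    `joiner` stands in for the style's choice of spacing between
    initials ("" or " "). Hypothetical, not BibTeX's real parser.
    """
    return joiner.join(part[0] + "." for part in first.split())


print(bibtex_like_initials("K.D."))        # K.
print(bibtex_like_initials("K. D."))       # K.D.
print(bibtex_like_initials("K. D.", " "))  # K. D.
```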

koppor commented 7 months ago

The article "Names in BibTeX and MlBibTeX" nicely shows how names are formatted in BibTeX.

koppor commented 7 months ago

This refs https://github.com/JabRef/jabref/issues/4558 (because in BibLaTeX, the names can be expressed more precisely)