Closed retorquere closed 8 years ago
I already wrap `and` in braces (don't I?). So if I understand you correctly, names are treated differently from titles, in that they are not downcased by the bib processor then (otherwise caps protection would still be required to disclose user intent). But then why should one not just always do `{Lastname}, {Firstname}`? Seems much easier than the current caps preservation. In fact that then would maybe be the broader solution to preserve caps = all; just wrap the entire field in braces, no?
And single-field names are also already enclosed in braces, right? So does this question then boil down to "don't caps-protect name fields"?
But then why should one not just always do `{Lastname}, {Firstname}`?
To reduce clutter …
I already wrap `and` in braces (don't I?).
Not in `publisher`, `location`, etc. But if that’s easier, just wrap the entire field in braces.
So does this question then boil down to "don't caps-protect name fields"?
It’s both about “don’t caps-protect anything except titles” and “and-protect any literal ‘and’ biblatex might otherwise parse as a literal-list separator”. Wrapping entire literal-list fields in braces would caps-protect them, too – which doesn’t hurt, but the aim is only to “and-protect” them.
Ah, I indeed don't protect `publisher`, `location`, etc. for `and`s. I didn't know they required that. But in any case, I'd only wrap the entire field if the user has selected the "all" version of preserve caps, not for inner.
Can you elaborate on the "etc." when it comes to fields that require `and` protection?
WRT the preserveCaps, I currently have it on:
Which should go?
Can you elaborate the etc when it comes to fields that require `and` protection?
All biblatex literal list fields: institution, organization, publisher, location, origlocation, origpublisher, address and school (see biblatex manual, “2.3.4 Literal Lists”).
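A minimal sketch of this and-protection, assuming the field value is plain text with no existing `{and}` groups. The names `LITERAL_LIST_FIELDS` and `andProtect` are made up for illustration and are not Better BibTeX's actual code:

```javascript
// Literal list fields as listed in the thread (biblatex manual, "Literal Lists").
const LITERAL_LIST_FIELDS = new Set([
  'institution', 'organization', 'publisher', 'location',
  'origlocation', 'origpublisher', 'address', 'school',
]);

// Hypothetical helper: brace every standalone "and" so that biblatex does not
// read it as a list separator. Fields outside the list are returned unchanged.
function andProtect(field, value) {
  if (!LITERAL_LIST_FIELDS.has(field)) return value;
  // The g flag matters: without it only the first "and" would be replaced.
  return value.replace(/\band\b/g, '{and}');
}
```

With this, `andProtect('publisher', 'MIT Press ZKM/Center for Art and Media in Karlsruhe')` yields `MIT Press ZKM/Center for Art {and} Media in Karlsruhe`, while a `title` value passes through untouched.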
WRT the preserveCaps, I currently have it on: … Which should go?
That list is a bit of a mix of CSL and biblatex?! – I’d keep only those fields whose name contains “title”, (except journaltitle and journalsubtitle) plus “series”.
Sorry, I have Zotero-internal names in the list (place and conferenceName). I've updated the list.
So journal(sub)title doesn't get protection?
So to be clear, `author = {von Hicks, {III}, Michael},` should always be `author = {von Hicks, III, Michael},`?
Instead of me just copy-pasting the lot here, could you look through https://travis-ci.org/ZotPlus/zotero-better-bibtex/jobs/85899312 to see if these are indeed all desired consequences?
https://travis-ci.org/ZotPlus/zotero-better-bibtex/builds/85905734 has only the diffs with caps protection removed. Could you go through that? It's a pretty big behaviour change.
https://travis-ci.org/ZotPlus/zotero-better-bibtex/builds/85905734: Most of this looks good.
https://travis-ci.org/ZotPlus/zotero-better-bibtex/jobs/85905735
Actual:
+ institution = {Royal Veterinary and Agricultural University, Department of Animal Health and Animal Science, Division of Ethology and Health},
Expected:
+ institution = {Royal Veterinary {and} Agricultural University, Department of Animal Health {and} Animal Science, Division of Ethology {and} Health},
https://travis-ci.org/ZotPlus/zotero-better-bibtex/jobs/85905736
Actual:
title = {Problèmes d’organisation de l’{Administration} {[}1966-1967]},
Expected:
title = {Problèmes d’organisation de l’Administration {[}1966-1967]},
(non-English titles never need protection – I trust there’s some reason for using `{[}`?!)
https://travis-ci.org/ZotPlus/zotero-better-bibtex/jobs/85905737
ok
https://travis-ci.org/ZotPlus/zotero-better-bibtex/jobs/85905738
Actual:
+ publisher = {MIT Press ZKM/Center for Art and Media in Karlsruhe},
Expected:
+ publisher = {MIT Press ZKM/Center for Art {and} Media in Karlsruhe},
So I'm getting back to this now that #385 is done. Looking over all these changes, if `{X and Y}, {Firstname}` is effectively the same as `X {and} Y, Firstname`, that format would make name formatting dramatically easier. Is it the same? Because if so, that would have my preference at this point.
I’d say these two are equivalent [EDIT: no, they are not, see below] – but note that `Firstname` never needs curly braces, it’s just literal `and`s in ‘name lists’ and ‘literal lists’ that need protection – using `X {and} Y`.
Corporate authors and editors – i.e., Zotero’s single-field names – on the other hand must always be wrapped in an extra pair of curly braces to prevent data parsing from treating them as personal names which are to be dissected into their components.
But is it safe to wrap firstnames in braces? Looking to simplify the output algorithm, and bracing it would automatically handle edge cases.
Actually, the two forms are not equivalent if two or more first names are wrapped in curly braces: in styles that abbreviate first names, `author = {Doe, John Paul}` is rendered as “Doe, J. P.” whereas `author = {Doe, {John Paul}}` is rendered (incorrectly!) as “Doe, J.”. So first names must not be wrapped in curly braces.
What are the edge cases you are worried about?
I'm trying to find out whether it is always safe to use `author = {{Lastname}, Firstname, suffix and {Other}, Firstname}`. I don't know of any sensible samples, but if suffixes or firstnames can plausibly include `and` or `,` this wouldn't work.
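Still, this algorithm might work:
1. Split name into Firstname, Lastname, and suffix (optional)
2. Convert each to LaTeX
3. Lastname = {Lastname}
4. Firstname = Firstname.replace(/\band\b/g, '{and}').replace(/,/g, '{,}')
5. suffix = suffix.replace(/\band\b/g, '{and}').replace(/,/g, '{,}')
6. output `{Lastname, Firstname, suffix}`
Does that sound reasonable?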
I'm looking at your comment again; if I understand correctly, `institution` and `publisher` are name-ish fields, and the comment about `title` is really for #383.
Would it be possible for you to assemble new test cases specifically for this issue? It would be cleaner than discussing the impact on existing cases. I'll deal with the existing cases when the cases specifically for this issue (and thus only relating to name-ish fields) pass.
“… name-ish …”
Sort of: From the biblatex manual: “The Biblatex package implements three distinct data types to handle bibliographic data: name lists, literal lists, and fields.” – `institution`, `organization`, `publisher`, `location`, `origlocation`, `origpublisher`, `address` and `school` are literal lists, so literal “and”s must be protected as `{and}`, but biblatex is not trying to parse literal list elements into first, last, etc.
“Would it be possible for you to assemble new test cases specifically for this issue?”
Yes, but I won’t be able to do much before the weekend.
Still, this algorithm might work:
My comments below refer to Zotero two-part name fields; single-part name fields should just be wrapped in extra curly braces, no other parsing required.
1. Split name into Firstname, Lastname, and suffix (optional)
For biblatex: If any of the primary creators’ firstname fields in Zotero contains `!,`, add `juniorcomma=true` to the biblatex entry’s `options` field.
2. Convert each to LaTeX
ok
3. Lastname = {Lastname}
Why not like 4. firstname and 5. suffix?
+ 3a. Only if in Zotero the lastname is wrapped in double quotes, wrap bib(la)tex lastname in curly braces.
4. Firstname = Firstname.replace(/\band\b/g, '{and}').replace(/,/g, '{,}')
5. suffix = suffix.replace(/\band\b/g, '{and}').replace(/,/g, '{,}')
ok, but do the same for non-dropping particle and dropping particle
6. output `{Lastname, Firstname, suffix}`
output `{dropping-particle non-dropping-particle Lastname, suffix, Firstname}`
For biblatex: If any of the primary creators’ non-dropping particles is non-empty, add `useprefix=true` to the biblatex entry’s `options` field.
For bibtex it might make sense to output `{dropping-particle {non-dropping-particle Lastname}, suffix, Firstname}`.
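The two biblatex-only rules about the `options` field could be sketched like this. It is only an illustration: the creator shape (`firstName`, `nonDroppingParticle`), the function name `biblatexOptions` and the constant are hypothetical, and the marker string is copied from the comment above rather than checked against Zotero:

```javascript
// Marker as written in the comment above; an assumption about the Zotero data.
const JUNIOR_COMMA_MARKER = '!,';

// Hypothetical creator shape: { firstName, nonDroppingParticle }.
// Returns the value for the entry's "options" field, or '' if none applies.
function biblatexOptions(creators) {
  const options = [];
  if (creators.some((c) => (c.firstName || '').includes(JUNIOR_COMMA_MARKER))) {
    options.push('juniorcomma=true');
  }
  if (creators.some((c) => (c.nonDroppingParticle || '') !== '')) {
    options.push('useprefix=true');
  }
  return options.join(',');
}
```

For a creator with first name `Vincent` and non-dropping particle `van` this returns `useprefix=true`; for a creator without particle or marker it returns an empty string, so no `options` field would be needed.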
If treating firstname like 4/5, that's perfectly fine by me.
So the algorithm for two-part names then becomes:
- […] `and` and `,`
- […] `{dropping-particle non-dropping-particle Lastname, suffix, Firstname}`
That is doable. I'll get to work on that.
For institutions etc, I'd prefer to have a separate issue and separate test cases.
What should the algorithm do if the firstname is quoted? ["retorquere + nickbart1980"] [first, von]
converts into
family: "retorquere + nickbart1980"
given: first
particle: von
Where those quotes are to be interpreted as "use literally"
["retorquere + nickbart1980"] [first, von]
should output `{von {retorquere + nickbart1980}, first}`, correct?
Which would make the updated algorithm:
Which would make the updated algorithm: …
- Split name into non-dropping particle, lastname, firstname, dropping particle, and suffix.
- Convert each to LaTeX
- If it was quoted, surround with braces; if not, brace `and` and `,`
- Do not do any caps protection (OK so that's not a step, but let's just be clear on it not being a step)
- output
{dropping-particle non-dropping-particle Lastname, suffix, Firstname}
- Profit
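The quoted steps can be sketched roughly as follows. This is an illustration only, not Better BibTeX's actual code: the input shape, `protectPart`, `formatName` and the `toLatex` stub are all hypothetical, a "quoted" part is assumed to mean one wrapped in double quotes, and the outer braces of the field value are left to the caller:

```javascript
// Stand-in for the real LaTeX conversion, which is out of scope here.
const toLatex = (text) => text;

// A part wrapped in double quotes is taken literally and braced as a whole;
// otherwise only literal "and" and "," are braced. No caps protection.
function protectPart(part) {
  const quoted = part.match(/^"(.*)"$/);
  if (quoted) return `{${toLatex(quoted[1])}}`;
  return toLatex(part).replace(/\band\b/g, '{and}').replace(/,/g, '{,}');
}

// Hypothetical input shape; every key except lastName is optional.
function formatName({ droppingParticle, nonDroppingParticle, lastName, suffix, firstName }) {
  const family = [droppingParticle, nonDroppingParticle, lastName]
    .filter(Boolean)
    .map(protectPart)
    .join(' ');
  const rest = [suffix, firstName].filter(Boolean).map(protectPart);
  // Order from the thread: particles and last name, then suffix, then first name.
  return [family, ...rest].join(', ');
}
```

Under these assumptions, a creator with particle `von`, last name `"retorquere + nickbart1980"` (quoted) and first name `first` comes out as `von {retorquere + nickbart1980}, first`, and `King` / `Jr.` / `Martin Luther` comes out as `King, Jr., Martin Luther`.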
Looks good. We’ll still need a few minor tweaks, e.g., when a particle ends with `’` or `-`, we should probably add a `\relax`, as in `author = {d’\relax Ormesson, Jean}` (see Tame the BeaST, 13.4, “How to remove space between von and Last?”); and for bibtex only, we might want to protect lowercase elements inside last names (see TTB, 13.3, “How to get lowercase letters in the Last?”; not needed for biblatex, see here and here).
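OK, so:
- […] `and` and `,`
- If non-dropping-particle ends in a space, don't change it; if it ends in a punctuation char, add `\relax` and a space, otherwise, add a space
Before I forget, can you open a new issue for publisher, location etc?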
OK, tests are running on the above algorithm.
- If non-dropping-particle ends in a space, don't change it; if it ends in a punctuation char, add \relax and a space, otherwise, add a space
dropping-particle, too!
Already included.
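The particle rule quoted above, as a small sketch. The function name is made up, and what counts as a "punctuation char" is an assumption here (any Unicode punctuation character, which covers `'`, `’`, `-` and `.`):

```javascript
// Keep an existing trailing space; after trailing punctuation add "\relax"
// plus a space; otherwise add a plain space. Applies to dropping and
// non-dropping particles alike.
function particleWithSeparator(particle) {
  if (particle.endsWith(' ')) return particle;
  // Assumption: "punctuation char" means Unicode general category P.
  if (/\p{P}$/u.test(particle)) return particle + '\\relax ';
  return particle + ' ';
}
```

Appending the last name to `particleWithSeparator('d’')` gives `d’\relax Ormesson`, the form shown in the `\relax` example earlier in the thread, while `von` simply becomes `von ` with a trailing space.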
What would be the right course of action for a two-part name where only the last name is supplied?
(I'm going to assume you'd rather see `{Lastname}` than `{Lastname,}`)
A single word will always be parsed as Last, so no comma is needed. – But two or more words would be parsed as `First Last`, so the comma might actually not be a bad idea. Need to investigate …
Cool, easy to change. In the interim, this has a few `\relax` insertions; I'd be interested to know whether they're OK this way.
No, I’ve been testing this, and the comma does not keep bibtex or biblatex from parsing `{Foo Bar,}` as Foo=First and Bar=Last. So it seems multipart last names need to be wrapped in braces, just like corporate names.
What is a "multipart name"? Anything with whitespace? Anything with non-alphabetic characters? And does this affect both last and first names? Do the particles always go outside the braces?
Anything with whitespace. This affects only lastnames. Particles should always go outside the braces, but OTOH, if there are particles, the braces aren’t even needed.
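For the case discussed here (a two-part name where only the last name is supplied), the answer could be sketched as below. The function name and the boolean `hasParticle` parameter are hypothetical:

```javascript
// A lone last name containing whitespace is braced so that it is not parsed
// as "First Last". Particles stay outside the braces, and when a particle is
// present the braces are not needed at all.
function braceLoneLastName(lastName, hasParticle) {
  const multipart = /\s/.test(lastName);
  return multipart && !hasParticle ? `{${lastName}}` : lastName;
}
```

So `braceLoneLastName('Marcus Aurelius', false)` gives `{Marcus Aurelius}`, whereas a single-word last name, or any last name that comes with a particle, is returned unchanged.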
This looks good – but only for bibtex. Unfortunately, the `\relax` trick does not seem to work for biblatex. I’ll have a closer look …
OK, then:
- […] `and` and `,`
- […] `<space><lowercase letter><word boundary>`
- […] `\relax<space>` if we're in BibTeX
- […] `{<dropping-particle><non-dropping-particle><Lastname>, <suffix>, <Firstname>}`
tests are running on https://github.com/ZotPlus/zotero-better-bibtex/issues/384#issuecomment-152109856
Changed to:
- […] `and` and `,`
- […] `<space><lowercase letter><word boundary>`
- […] `\relax<space>` if we're in BibTeX
- […] `<space>` if the punctuation character is a period
- […] `{<dropping-particle><non-dropping-particle><Lastname>, <suffix>, <Firstname>}`
https://github.com/ZotPlus/zotero-better-bibtex/issues/384#issuecomment-152125210 passes all current tests, which means it has the same behavior as 1.6.2, except that empty last names don't generate a trailing comma. This probably means we don't have sufficient coverage; it would be a little surprising that the behavior was essentially OK as-is.
- ii. If not, and it's a last or first name, and it contains a space, brace entire part
Do not brace entire first names, or else abbreviation to initials won’t be correct.
- […] `and` and `,`
- […] `<spaces><lowercase letters><word boundary>`
- […] `\relax<space>` if we're in BibTeX
- […] `<space>` if the punctuation character is a period
- […] `{<dropping-particle><non-dropping-particle><Lastname>, <suffix>, <Firstname>}`
tests are running
Tests have passed without changes to the test cases.
Are you satisfied with the current implementation?
I haven’t lost sight of this, but am much too busy with other stuff. I’ll try to upload test cases as soon as I can. Anyway, with 1.6.3, I still get:
@book{vangogh,
author = {van {Gogh}, {Vincent}},
options = {useprefix}
}
@book{humboldt,
author = {von {Humboldt}, {Alexander}}
}
@book{beauvoir,
author = {de {Beauvoir}, {Simone}}
}
@book{degaulle,
author = {{de Gaulle}, {Charles}}
}
@book{king,
author = {King, {Jr}., {Martin} {Luther}}
}
I’d say none of the braces around any of the elements are necessary, except those around `{de Gaulle}` when Zotero’s lastname is protected by quotes: `"de Gaulle"`.
1.6.3 doesn't have these changes yet. All these changes are on a separate branch which I mean to merge as soon as I have decent confidence that it works as intended -- no rush, but I'm going to hold off until I have the test cases.
A few test cases: 8ZTVU26A
Expected biblatex output:
@book{vangogh,
author = {van Gogh, Vincent},
options = {useprefix=true}
}
@book{humboldt,
author = {von Humboldt, Alexander}
}
@book{beauvoir,
author = {de Beauvoir, Simone}
}
@book{degaulle,
author = {{de Gaulle}, Charles}
}
@book{king,
author = {King, Jr., Martin Luther}
}
@book{stevenson,
author = {Stevenson, III, Adlai E.},
options = {juniorcomma=true}
}
@book{nationalaeronauticsandspaceadministration,
author = {{National Aeronautics and Space Administration}}
}
@book{bovendeert,
author = {boven d' Eert, Christianus},
options = {useprefix=true}
}
@book{s-gravesande,
author = {'s- Gravesande, Goverdus},
options = {useprefix=true}
}
@book{dequincey,
author = {De Quincey, Thomas}
}
@book{ortegaygasset,
author = {Ortega y Gasset, José}
}
@book{damato,
author = {D’Amato, Alfonse}
}
@book{sadat,
author = {el- Sadat, Anwar}
}
@book{lafollette,
author = {La Follette, Sr., Robert M.}
}
@book{delamare,
author = {de la Mare, Walter},
options = {useprefix=true}
}
@book{degette,
author = {DeGette, Diana}
}
@book{saunders,
author = {Saunders, John Bertrand de Cusance Morant}
}
@book{marcusaurelius,
author = {{Marcus Aurelius},}
}
@book{dumas,
author = {Dumas, père, Alexandre}
}
@book{vanrensselaer,
author = {Van Rensselaer, Stephen}
}
@book{lenfant,
author = {L’Enfant, Pierre-Charles}
}
@book{vangulik,
author = {{van Gulik}, Robert}
}
@book{sackville-west,
author = {Sackville-West, Victoria}
}
@book{vaughanwilliams,
author = {Vaughan Williams, Ralph}
}
@book{miesvanderrohe,
author = {Mies van der Rohe, Ludwig}
}
@book{dalembert,
author = {d’ Alembert, Jean le Rond},
options = {useprefix=true}
}
@book{tocqueville,
author = {de Tocqueville, Alexis}
}
@book{lafontaine,
author = {de La Fontaine, Jean}
}
@book{lasalle,
author = {de La Salle, René-Robert Cavelier}
}
@book{dupuydeclinchamps,
author = {du Puy de Clinchamps, Philippe},
options = {useprefix=true}
}
@book{stein,
author = {vom und zum Stein, Heinrich Friedrich Karl}
}
@book{silva,
author = {da Silva, Agostinho}
}
@book{dagama,
author = {da Gama, Vasco},
options = {useprefix=true}
}
@book{dannunzio,
author = {D’Annunzio, Gabriele}
}
@book{daponte,
author = {Da Ponte, Lorenzo}
}
@book{dellarobbia,
author = {Della Robbia, Luca}
}
@book{este,
author = {Este, Beatrice d’}
}
@book{medici,
author = {Medici, Lorenzo de’}
}
@book{al-hakim,
author = {al- Hakim, Tawfiq},
options = {useprefix=true}
}
@book{levayer,
author = {Le Vayer, François de La Mothe}
}
Expected bibtex output:
@book{vangogh,
author = {{van Gogh}, Vincent},
}
@book{humboldt,
author = {von Humboldt, Alexander}
}
@book{beauvoir,
author = {de Beauvoir, Simone}
}
@book{degaulle,
author = {{de Gaulle}, Charles}
}
@book{king,
author = {King, Jr., Martin Luther}
}
@book{stevenson,
author = {Stevenson, III, Adlai E.},
}
@book{nationalaeronauticsandspaceadministration,
author = {{National Aeronautics and Space Administration}}
}
@book{bovendeert,
author = {{boven d'Eert}, Christianus},
}
@book{s-gravesande,
author = {'s-Gravesande, Goverdus},
}
@book{dequincey,
author = {De Quincey, Thomas}
}
@book{ortegaygasset,
author = {Ortega y Gasset, José}
}
@book{damato,
author = {D'Amato, Alfonse}
}
@book{sadat,
author = {el-Sadat, Anwar}
}
@book{lafollette,
author = {La Follette, Sr., Robert M.}
}
@book{delamare,
author = {{de la Mare}, Walter},
}
@book{degette,
author = {DeGette, Diana}
}
@book{saunders,
author = {Saunders, John Bertrand de Cusance Morant}
}
@book{marcusaurelius,
author = {{Marcus Aurelius},}
}
@book{dumas,
author = {Dumas, père, Alexandre}
}
@book{vanrensselaer,
author = {Van Rensselaer, Stephen}
}
@book{lenfant,
author = {L'Enfant, Pierre-Charles}
}
@book{vangulik,
author = {{van Gulik}, Robert}
}
@book{sackville-west,
author = {Sackville-West, Victoria}
}
@book{vaughanwilliams,
author = {Vaughan Williams, Ralph}
}
@book{miesvanderrohe,
author = {Mies van der Rohe, Ludwig}
}
@book{dalembert,
author = {d'Alembert, Jean le Rond},
}
@book{tocqueville,
author = {de Tocqueville, Alexis}
}
@book{lafontaine,
author = {de La Fontaine, Jean}
}
@book{lasalle,
author = {de La Salle, René-Robert Cavelier}
}
@book{dupuydeclinchamps,
author = {{du Puy de Clinchamps}, Philippe},
}
@book{stein,
author = {vom und zum Stein, Heinrich Friedrich Karl}
}
@book{silva,
author = {da Silva, Agostinho}
}
@book{dagama,
author = {{da Gama}, Vasco},
}
@book{dannunzio,
author = {D'Annunzio, Gabriele}
}
@book{daponte,
author = {Da Ponte, Lorenzo}
}
@book{dellarobbia,
author = {Della Robbia, Luca}
}
@book{este,
author = {Este, Beatrice d'}
}
@book{medici,
author = {Medici, Lorenzo de'}
}
@book{al-hakim,
author = {al-Hakim, Tawfiq},
}
@book{levayer,
author = {Le Vayer, François de La Mothe}
}
(I’ve used dumb apostrophes for biblatex; not sure whether you also want to asciify Unicode chars such as `é` and `ç`.)
Super. Most things pass, but some do not:
- […] `'s-`/`d'`/`al-` to be a particle, so that triggers useprefix. I assume that's OK.
- There is something called a "zero width space" after the apostrophe that currently causes this; what does a ZWS mean there? Should I just ignore it?
@nickbart1980 says: