Proper nouns are not escaped in Biblatex export #1667

Open adam-ah opened 6 years ago

adam-ah commented 6 years ago

(Bug moved here from https://github.com/retorquere/zotero-better-bibtex/issues/975) There was a previous PR a while ago to preserve proper nouns in titles (by wrapping them in {}) when exporting to Biblatex. In some cases it still works (for instance, ERG is turned into {ERG}), but in other cases it doesn't ("...of Facebook on..." stays as is). Original PR: https://github.com/zotero/translators/pull/813

Apparently the Bibtex.js export has this fix, and Better Biblatex supports it too, but it is missing from BibLaTeX.js.

Exporting references

Sample working well:

@article{ordy_visual_1968,
    title = {Visual acuity and {ERG}-{CFF} in relation to the morphologic organization of the retina among diurnal and nocturnal primates},
    volume = {8},
    issn = {00426989},
    url = {http://linkinghub.elsevier.com/retrieve/pii/004269896890028X},
    doi = {10.1016/0042-6989(68)90028-X},
    pages = {1205--1225},
    number = {9},
    journaltitle = {Vision Research},
    author = {Ordy, J.M. and Samorajski, T.},
    urldate = {2018-05-19},
    date = {1968-09},
    langid = {english}
}

Broken sample:

@article{fardouly_social_2015,
    title = {Social comparisons on social media: the impact of Facebook on young women's body image concerns and mood},
    volume = {13},
    issn = {17401445},
    url = {http://linkinghub.elsevier.com/retrieve/pii/S174014451400148X},
    doi = {10.1016/j.bodyim.2014.12.002},
    shorttitle = {Social comparisons on social media},
    abstract = {....},
    pages = {38--45},
    journaltitle = {Body Image},
    author = {Fardouly, Jasmine and Diedrichs, Phillippa C. and Vartanian, Lenny R. and Halliwell, Emma},
    urldate = {2018-05-12},
    date = {2015-03},
    langid = {english},
    file = {...}
}

retorquere commented 6 years ago

For clarity's sake: the bug wasn't moved here in the sense that it was a BBT problem triggered by Zotero; it's just that the stock bib(la)tex exporters have a bug. It was first reported on the BBT tracker, but that's the only relation.

adam-ah commented 6 years ago

If anyone wants to create and merge the fix, the following patch brings the BibLatex.js export up to the same case-protection behaviour as the current Bibtex.js:

@@ -60,6 +60,15 @@
    PMID: 'pmid',
    DOI: 'doi'
 };
+//
+// Fields for which upper case letters will be protected on export
+var caseProtectedFields = [
+   "title",
+   "type",
+   "shorttitle",
+   "booktitle",
+   "series"
+];

 // Imported by BibTeX. Exported by BibLaTeX only
 var revEprintIds = {
@@ -277,9 +286,8 @@

        // Case of words with uppercase characters in non-initial positions is preserved with braces.
        // we're looking at all unicode letters
-       var protectCaps = new ZU.XRegExp("\\b\\p{Letter}+\\p{Uppercase_Letter}\\p{Letter}*", 'g')
-       if (field != "pages") {
-           value = ZU.XRegExp.replace(value, protectCaps, "{$0}");
+       if (caseProtectedFields.indexOf(field) != -1) {
+           value = ZU.XRegExp.replace(value, protectCapsRE, "$1{$2$3}"); // only $2 or $3 will have a value, not both
        }

        // Page ranges should use double dash
@@ -443,7 +451,22 @@
    return value.replace(encodeFilePathRE, "\\$&");
 }

-   function doExport() {
+var protectCapsRE;
+function doExport() {
+   if (Zotero.getHiddenPref && Zotero.getHiddenPref('BibTeX.export.dontProtectInitialCase')) {
+       // Case of words with uppercase characters in non-initial positions is
+       // preserved with braces.
+       // Two extra captures because of the other regexp below
+       protectCapsRE = new ZU.XRegExp("()()\\b(\\p{Letter}+\\p{Uppercase_Letter}\\p{Letter}*)", 'g');
+   } else {
+       // Protect all upper case letters, even if the uppercase letter is only in
+       // initial position of the word.
+       // Don't protect first word if only first letter is capitalized
+       protectCapsRE = new ZU.XRegExp(
+           "(.)\\b(\\p{Letter}*\\p{Uppercase_Letter}\\p{Letter}*)" // Non-initial words with capital letter anywhere
+               + "|^(\\p{Letter}+\\p{Uppercase_Letter}\\p{Letter}*)" // Initial word with capital in non-initial position
+           , 'g');
+   }
        //Zotero.write("% biblatex export generated by Zotero "+Zotero.Utilities.getVersion());
        // to make sure the BOM gets ignored
        Zotero.write("\n");
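
For anyone who wants to try the two regular expressions from the patch outside Zotero, here is a minimal standalone sketch. It assumes that the native `\p{L}` and `\p{Lu}` property escapes (with the `u` flag) are close enough stand-ins for `ZU.XRegExp`'s `\p{Letter}` and `\p{Uppercase_Letter}`. The helper name `protectCaps` and its boolean parameter are hypothetical and only mimic the hidden-pref switch; they are not part of the translator.

```javascript
// Approximation of the patch's default expression: protect any non-initial
// word containing an uppercase letter, plus an initial word that has an
// uppercase letter in a non-initial position.
const protectAllCaps = /(.)\b(\p{L}*\p{Lu}\p{L}*)|^(\p{L}+\p{Lu}\p{L}*)/gu;

// Approximation of the expression used when the hidden pref is set:
// protect only words with an uppercase letter in a non-initial position.
const protectInnerCaps = /\b(\p{L}+\p{Lu}\p{L}*)/gu;

// Hypothetical helper (not in BibLaTeX.js) that wraps matching words in braces.
function protectCaps(value, dontProtectInitialCase = false) {
  if (dontProtectInitialCase) {
    return value.replace(protectInnerCaps, "{$1}");
  }
  // Only one of `word` / `firstWord` is defined for any given match.
  return value.replace(protectAllCaps, (match, before, word, firstWord) =>
    (before || "") + "{" + (word || firstWord) + "}");
}

console.log(protectCaps("the impact of Facebook on young women's body image"));
// the impact of {Facebook} on young women's body image
console.log(protectCaps("Visual acuity and ERG-CFF in relation to the retina"));
// Visual acuity and {ERG}-{CFF} in relation to the retina
console.log(protectCaps("the impact of Facebook on ERG", true));
// the impact of Facebook on {ERG}
```

The last call shows why the old BibLaTeX.js behaviour missed "Facebook": an expression that only looks for capitals in non-initial positions never matches a word whose sole capital is its first letter.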

When applied, the patch changes the original broken sample like this:

@@ -1,6 +1,6 @@

 @article{fardouly_social_2015,
-   title = {Social comparisons on social media: the impact of Facebook on young women's body image concerns and mood},
+   title = {Social comparisons on social media: the impact of {Facebook} on young women's body image concerns and mood},
    volume = {13},
    issn = {17401445},
    url = {http://linkinghub.elsevier.com/retrieve/pii/S174014451400148X},