larsgw / citation.js

Citation.js converts formats like BibTeX, Wikidata JSON and ContentMine JSON to CSL-JSON to convert to other formats like APA, Vancouver and back to BibTeX.
https://citation.js.org/
MIT License

BibTeX capitalisation and CSL nocase #155

Closed larsgw closed 6 years ago

larsgw commented 6 years ago

When working with BibTeX input, title fields are sometimes wrapped entirely in an extra pair of braces, e.g.

title = {{foo}}

Should this result in the whole title being wrapped in <span class="nocase">...</span>, as per https://www.zotero.org/support/kb/rich_text_bibliography?

Moreover, should BibTeX output automatically have an extra {...} wrapping title fields, which is what happens now?

@rmzelle: Thoughts?
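
To make the import question concrete, here is a minimal sketch of the "whole title is wrapped" check. It assumes the parser has already stripped the outer field delimiters, so `title = {{foo}}` arrives as the value `{foo}`. The names `isFullyBraced` and `importTitle` are hypothetical and are not part of the Citation.js API.

```javascript
// Hypothetical helper, not part of Citation.js: returns true when the first
// character is "{" and its matching "}" is the very last character.
function isFullyBraced (value) {
  if (!value.startsWith('{') || !value.endsWith('}')) return false
  let depth = 0
  for (let i = 0; i < value.length; i++) {
    const char = value[i]
    if (char === '\\') { i++; continue } // skip escaped characters such as \{ and \}
    if (char === '{') depth++
    else if (char === '}') depth--
    // The opening brace closed before the end, e.g. "{foo} and {bar}"
    if (depth === 0 && i < value.length - 1) return false
  }
  return depth === 0
}

// Hypothetical import step: wrap the whole title in a CSL nocase span only
// when the entire value sits inside one extra pair of braces.
function importTitle (fieldValue) {
  const value = fieldValue.trim()
  if (isFullyBraced(value)) {
    return '<span class="nocase">' + value.slice(1, -1) + '</span>'
  }
  return value
}
```

With this sketch, a value such as `{foo} and {bar}` is left untouched, because its first brace closes before the end of the string.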

rmzelle commented 6 years ago

Sorry, I don't know anything about the BibTeX format. @adam3smith, do you know who our resident expert is? Emiliano or Karnesky?

adam3smith commented 6 years ago

I think @retorquere has looked at the curly bracket issue. Basically this is a debated topic: plain bibtex will protect case for titles with double brackets, but biber/biblatex will just ignore them. See https://tex.stackexchange.com/a/327387. Using double brackets for the title is considered bad style, and I'd be inclined to side with biber and not wrap the title in nocase. It's also the case that bibtex and CSL handle capitalization exactly in reverse (in bibtex you need to protect uppercase letters, in Zotero lowercase ones), so the two don't translate exactly.

retorquere commented 6 years ago

If you're importing {{...}}, that should in principle translate to a nocase wrap. I don't do this in BBT though.

BibTeX output should absolutely not automatically be wrapped in an extra pair of braces. I see this recommendation occasionally but it's simply wrong. There are bibtex styles that legitimately need to do their own case folding, and the second pair of braces prevents this. There is an unfortunate difference in baseline assumptions about how titles should be written in Zotero (sentence case) and bibtex (Title Case) that cannot just be ignored.

retorquere commented 6 years ago

Some of the madness of brace protection and caps meddling is documented here. It's not an easy subject with a few subtle edge cases.

retorquere commented 6 years ago

BTW if the concern is parsing (for later processing), the parser BBT uses has been split off and now lives here.

larsgw commented 6 years ago

Okay, thanks @retorquere (and @rmzelle and @adam3smith too of course). Citation.js both parses BibTeX into CSL-JSON and converts CSL-JSON back into BibTeX, among other things. This is what I did in v0.4.0-9 (released just now):

retorquere commented 6 years ago

It doesn't need to be; a single brace will work most of the time, just not always. {{ always works. A lot of people (me included) prefer {...} for the looks, but deciding algorithmically when {...} will work was more work than I was willing to do.

If you're looking for bib(la)tex test cases BTW, have at it.

retorquere commented 6 years ago

I kept the translation between <span class="nocase">...</span> and {...} for now, both importing and exporting. Now that I read the FAQ perhaps that should be {{...}}?

This argument only goes for exporting BTW. Most of the refs for import you will find will not use {{...}}. You're braver than me if you want to parse that fully as bibtex/biblatex intends it (even ignoring the differences between them), but translating {...} into <span class="nocase">...</span> (which I don't) is in principle the right thing to do.
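
The single-brace mapping discussed here could look roughly like the sketch below. It is an illustration under stated assumptions, not the Citation.js or BBT implementation: the function names are hypothetical, escaped braces (`\{`, `\}`) and LaTeX commands are not treated specially, braces are assumed to be balanced, and `nocase` is assumed to be the only `<span>` in the title. Exporting with `{{...}}` instead, as suggested in the comment above, would only require changing the replacement strings.

```javascript
// Hypothetical helper: turn each top-level {...} group of a BibTeX title
// into a CSL nocase span. Nested braces inside a group are kept as-is.
function bracesToNocase (title) {
  let output = ''
  let depth = 0
  for (const char of title) {
    if (char === '{') {
      // Only the outermost brace of a group opens a span
      output += depth === 0 ? '<span class="nocase">' : char
      depth++
    } else if (char === '}' && depth > 0) {
      depth--
      output += depth === 0 ? '</span>' : char
    } else {
      output += char
    }
  }
  return output
}

// Hypothetical helper: turn nocase spans back into single-brace groups.
// Assumes the title contains no other <span> elements.
function nocaseToBraces (title) {
  return title
    .replace(/<span class="nocase">/g, '{')
    .replace(/<\/span>/g, '}')
}
```

For example, `bracesToNocase('A study of {DNA} repair')` gives `A study of <span class="nocase">DNA</span> repair`, and `nocaseToBraces` maps that string back to the original.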

larsgw commented 6 years ago

Then I'll leave the behaviour like this (for now at least, maybe I'll think differently about it tomorrow). Thanks a lot for the help!

If you're looking for bib(la)tex test cases BTW, have at it.

Thanks, I'll definitely take a look at that.