retorquere / zotero-better-bibtex

Make Zotero effective for us LaTeX holdouts
https://retorque.re/zotero-better-bibtex/
MIT License

Export article title capitalisation: 'P-Type' vs 'n-type' #1913

Open Dr-Sparx opened 3 years ago

Dr-Sparx commented 3 years ago

Zotero version: 5.0.96.3

BBT version: 5.4.29

Support log ID: XVZG6T98-refs-euc

Exporter used: Better BibTeX

Expected behavior: That the Zotero title entry "Design parameters for optimizing the efficiency of thermoelectric generators utilizing p-type and n-type lead telluride" of a journal article gets exported with title case applied as title = {Design Parameters for Optimizing the Efficiency of Thermoelectric Generators Utilizing P-Type and N-Type Lead Telluride}.

Actual behavior: Unlike the 'p' in 'p-type', the 'n' in 'n-type' remains in lower case and the exported title is title = {Design Parameters for Optimizing the Efficiency of Thermoelectric Generators Utilizing P-Type and n-Type Lead Telluride}.

retorquere commented 3 years ago

Zotero recognizes some markup for this: https://www.zotero.org/support/kb/rich_text_bibliography

So what you'd want is to have in Zotero

Design parameters for optimizing the efficiency of thermoelectric generators utilizing p-type and <span class="nocase">n-type</span> lead telluride

This will work in both BBT exports and in the Zotero Word functionality.
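The effect of the nocase markup can be illustrated with a minimal Python sketch. This is not BBT's or Zotero's actual code: the function names and the tiny skip-word list are assumptions made for illustration. It title-cases a title but copies anything inside a nocase span through unchanged.

```python
import re

# Hypothetical, deliberately tiny list of words left lowercase mid-title.
SKIP_WORDS = {"a", "an", "and", "for", "of", "the"}

# Matches the Zotero rich-text markup for case-protected text.
NOCASE = re.compile(r'<span class="nocase">(.*?)</span>')


def capitalize_word(word: str, first: bool) -> str:
    """Capitalise each hyphen-separated part, unless `word` is a non-initial skip word."""
    if not first and word.lower() in SKIP_WORDS:
        return word
    return "-".join(part[:1].upper() + part[1:] for part in word.split("-"))


def title_case_respecting_nocase(title: str) -> str:
    """Title-case `title`, copying nocase spans through verbatim (markup dropped)."""
    pieces = NOCASE.split(title)  # odd indices hold the protected text
    seen_word = False
    result = []
    for index, piece in enumerate(pieces):
        if index % 2 == 1:
            result.append(piece)  # protected: no case change at all
            seen_word = True
            continue

        def fix(match: re.Match) -> str:
            nonlocal seen_word
            word = capitalize_word(match.group(0), first=not seen_word)
            seen_word = True
            return word

        # Replace only the words, so the original whitespace is preserved.
        result.append(re.sub(r"\S+", fix, piece))
    return "".join(result)
```

With the title fragment from this issue, `title_case_respecting_nocase('utilizing p-type and <span class="nocase">n-type</span> lead telluride')` gives `Utilizing P-Type and n-type Lead Telluride`: the protected text keeps exactly the case entered in Zotero.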

Dr-Sparx commented 3 years ago

Thanks for the swift response, but I think you swapped expected and actual behaviour.

Starting from 'p-type' and 'n-type' in Zotero's title field, BBT produces P-Type and n-Type in the exported .bib file. I cannot see any reason why the p is capitalised and the n is not.

The markup with nocase forces the 'n' to lower case in the exported .bib file, but that is not what should happen here. When the title is converted to title case on export with BBT, shouldn't 'n-type' be treated/capitalised the same as all the other 'proper' words, and in particular like 'p-type'?

retorquere commented 3 years ago

Oh, sorry, I did swap them. I'll have a look.

Dr-Sparx commented 3 years ago

The issue is labelled with awaiting-user-feedback, but I don't think there is anything for me to add right now. Or is there?

retorquere commented 3 years ago

Nope, it's on me now, just haven't had time yet.

retorquere commented 3 years ago

It's an error in Zotero more broadly: https://groups.google.com/g/zotero-dev/c/Kx5saARqP3w . I'll see if I can find a better title-caser, but chances are not great, and the CSL processor that Zotero uses moves a little slow sometimes.

Dr-Sparx commented 3 years ago

Okay, understood. Thanks for looking into this. And all your work on BBT is much appreciated — without it, I would not be using Zotero!

retorquere commented 3 years ago

You're very welcome.

@njbart, what is the proper titlecasing for something like

Does Measurement Instrument Moderate the Association Between the Serotonin Transporter Gene and Anxiety-Related Personality Traits?

and

The Integration of Scientific Techniques Into Archaeological Interpretation

Should Between and Into be capitalized?

njbart commented 3 years ago

It depends. APA would capitalize these two words, CMOS would not.

On the rules behind these and other capitalization styles, see https://titlecaseconverter.com/.
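For these two words the difference comes down to a length threshold. As I understand the rules, APA title case capitalises any word of four letters or more, while CMOS (17th edition) lowercases prepositions regardless of length. A rough Python sketch of just that difference follows; the word list is a hypothetical subset, and hyphenated compounds and punctuation are ignored.

```python
# Illustrative subset of articles, conjunctions and prepositions;
# a real title-caser needs a much longer list.
MINOR_WORDS = {
    "a", "an", "the", "and", "but", "or", "nor",
    "at", "by", "for", "from", "in", "of", "on", "to", "with",
    "between", "into",
}


def title_case(title: str, style: str) -> str:
    """Title-case `title` under a simplified 'apa' or 'cmos' rule set."""
    if style not in ("apa", "cmos"):
        raise ValueError(f"unknown style: {style}")
    words = title.split()
    out = []
    for index, word in enumerate(words):
        is_edge = index == 0 or index == len(words) - 1
        minor = word.lower() in MINOR_WORDS
        # APA only keeps minor words lowercase when they have three letters or fewer.
        if style == "apa" and len(word) >= 4:
            minor = False
        if minor and not is_edge:
            out.append(word.lower())
        else:
            out.append(word[:1].upper() + word[1:])
    return " ".join(out)
```

For "the integration of scientific techniques into archaeological interpretation", the "apa" setting yields "Into" and the "cmos" setting yields "into"; everything else comes out the same.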

If bibtex or biblatex are the target formats, I’d probably capitalize wherever the requirements of the various styles differ (remember that bibtex and biblatex have a sentence-caser, but not a title-caser, so anything that is already in sentence case doesn’t even get a chance to become title-cased), and hope that the bibtex or biblatex styles used are smart enough to sentence-case whatever they see fit.
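To make the asymmetry concrete, here is a simplified Python imitation of a BibTeX-style sentence-caser. It is a sketch under stated assumptions, not BibTeX's actual change.case$ implementation: it lowercases everything after the first character except brace-protected text, and it ignores special cases such as the word after a colon.

```python
def sentence_case(title: str) -> str:
    """Lowercase everything after the first character, except text inside braces."""
    out = []
    depth = 0  # current brace nesting level
    for index, char in enumerate(title):
        if char == "{":
            depth += 1
        elif char == "}":
            depth = max(depth - 1, 0)
        # Only unprotected characters after the first one are lowercased.
        if index > 0 and depth == 0:
            char = char.lower()
        out.append(char)
    return "".join(out)
```

A title-cased input such as "Design Parameters for {P-Type} Lead Telluride" comes out as "Design parameters for {P-Type} lead telluride", whereas an input that is already in sentence case passes through unchanged. The information needed to title-case it is simply not there.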

retorquere commented 3 years ago

> If bibtex or biblatex are the target formats, I’d probably capitalize wherever the requirements of the various styles differ

That is indeed what I'm trying to target. I had hoped that there was a set title-case format for bib(la)tex in the same way there is one for CMOS or APA. I'm trying different JavaScript title-casers and they all fail in different ways in my test suite, but if I knew what to aim for, I could pick the best of the lot and try to fix it.

In the end there's no escaping that nocase hints sometimes need to be added, but I'd prefer to do the right thing by default wherever possible.

retorquere commented 3 years ago

> It depends. APA would capitalize these two words, CMOS would not.

So given that, would I have between and into capitalized in the bib file, or not, if I wanted to have biblatex-chicago and biblatex-apa both do the right thing given the same input?

njbart commented 3 years ago

Wait - APA is not a good example: It does have title-casing rules, but title-case is actually used only in chapter/section headings, not in bibliographies (except for journal titles, but these should be entered as-is anyway). So APA sentence-cases titles, and anything title-cased by BBT according to the CMOS rules can be expected to come out correctly as well if biblatex-apa is used.

The problem is that there are other styles that call for title-casing titles in bibliographies, with slightly different rules: I’m aware of at least AMA and MLA (not sure whether there are others). I tend to think that CMOS is somewhat more popular than the others, so I would go with their rules as a default, but of course this would require some kind of BBT postscript or biblatex sourcemap trickery to get things 100% right in other styles.

And I was too optimistic about biblatex styles being smart WRT title-casing. At least biblatex-chicago (the only one I checked) outputs titles capitalized exactly as they are provided in the biblatex file.

retorquere commented 3 years ago

> The problem is that there are other styles that call for title-casing titles in bibliographies, with slightly different rules: I’m aware of at least AMA and MLA (not sure whether there are others). I tend to think that CMOS is somewhat more popular than the others, so I would go with their rules as a default, but of course this would require some kind of BBT postscript or biblatex sourcemap trickery to get things 100% right in other styles.
>
> And I was too optimistic about biblatex styles being smart WRT title-casing. At least biblatex-chicago (the only one I checked) outputs titles capitalized exactly as they are provided in the biblatex file.

I was afraid that would be the case.

I'll start off with CMOS, and I'll likely add preferences for stable alternate title-casings.

retorquere commented 3 years ago

I've tested the 8 libraries from npmjs that had updates in the last year. The one that gets the best (that is, acceptable) results is title, but it has long-standing, deal-breaking bugs that seem to get no attention. I've posted on zotero-dev about the CSL title-caser error, but that also seems to get no traction. For now it's six of one, half a dozen of the other between them.

retorquere commented 2 years ago

What would you make of influenza-like-illness? Influenza-Like-Illness or Influenza-like-Illness?

Dr-Sparx commented 2 years ago

With like being a preposition, I think it should not be capitalised — I'd make it Influenza-like-Illness.
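That rule can be sketched in a few lines of Python. This is only an illustration, not BBT's implementation, and the word list is a hypothetical subset: capitalise the first element of a hyphenated compound, and each later element unless it is an article, preposition or conjunction.

```python
# Hypothetical subset of words kept lowercase when they are not the first element.
LOWERCASE_ELEMENTS = {"a", "an", "and", "as", "for", "in", "like", "of", "on", "the", "to"}


def title_case_compound(word: str) -> str:
    """Title-case a hyphenated compound element by element."""
    cased = []
    for index, part in enumerate(word.split("-")):
        if index > 0 and part.lower() in LOWERCASE_ELEMENTS:
            cased.append(part.lower())  # minor word inside the compound
        else:
            cased.append(part[:1].upper() + part[1:])
    return "-".join(cased)
```

Under this rule `title_case_compound("influenza-like-illness")` returns `Influenza-like-Illness`, and both `p-type` and `n-type` get the same treatment, coming out as `P-Type` and `N-Type`.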