zotero / citeproc-rs

CSL processor in Rust.
https://cormacrelf.github.io/citeproc-wasm-demo/

Don't capitalize words with internal capitalization #145

Open dstillman opened 2 years ago

dstillman commented 2 years ago

The workaround in https://forums.zotero.org/discussion/93820/how-to-preserve-lowercase-i-in-iphone-imessage-etc-in-titles is dumb and shouldn't be necessary.

https://whoo.ps/2013/10/23/how-do-you-capitalize-iphone-or-ipad quotes the Chicago guide on this:

Brand names or names of companies that are spelled with a lowercase initial letter followed by a capital letter (eBay, iPod, iPhone, etc.) need not be capitalized at the beginning of a sentence or heading, though some editors may prefer to reword.

citeproc-js supposedly fixed this in https://github.com/Juris-M/citeproc-js/issues/178 (where I note that it's a more general problem than the title suggests), though we haven't been able to update to that version in Zotero due to a disambiguation-related hang.

@bwiernik suggests there that this requires a CSL spec change. That's presumably because some styles might call for "IPhone" (though in that case only at the start of a sentence), but that's gross and I'd default it to off in Zotero (with a flag, if necessary) until there was a CSL option.

bwiernik commented 2 years ago

I don't recall the previous discussion where I said a spec change was needed. I think I must have been thinking about adjusting the definition of the casing styles to include this, rather than needing to have it customizable. I don't know of any scholarly styles that capitalize this. @adamsmith do you?

I agree that changing to IPhone is dumb.

I agree the default should be to not apply casing rules if the word has internal capitalization.
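
A minimal sketch of that default in Rust, for illustration only. This is not citeproc-rs's actual code: `has_internal_capital` and `capitalize_first` are hypothetical names, and "internal capitalization" is assumed here to mean any uppercase character after the first.

```rust
/// Hypothetical helper: true if any character after the first is uppercase,
/// e.g. "iPhone" or "eBay".
fn has_internal_capital(word: &str) -> bool {
    word.chars().skip(1).any(|c| c.is_uppercase())
}

/// Capitalize the first character of `word`, unless the word has internal
/// capitalization, in which case it is returned unchanged.
fn capitalize_first(word: &str) -> String {
    if has_internal_capital(word) {
        return word.to_string();
    }
    let mut chars = word.chars();
    match chars.next() {
        // `to_uppercase` yields an iterator because one char can map to several.
        Some(first) => first.to_uppercase().chain(chars).collect(),
        None => String::new(),
    }
}
```

Under these assumptions, `capitalize_first("iPhone")` leaves the word alone, while `capitalize_first("phone")` yields "Phone". A style that does want "IPhone" at the start of a sentence would need a switch around the early return.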

A related issue is all-lowercase software package names. R packages, for example, are usually all-lowercase. The cited titles usually look like "package: Description of the package". The name of the package shouldn't be capitalized, because package names are case-sensitive. So I have a ton of items in my library where I have to wrap the first word in span tags.

I think we should also not apply casing rules to the first word of the title if it is all lowercase.

Finally, can we consider a more compact syntax for casing controls than the current span class="nocase" HTML-like tags? Perhaps LaTeX-style {}?

dstillman commented 2 years ago

You said it in the thread I linked to:

https://github.com/Juris-M/citeproc-js/issues/178#issuecomment-779578112

You may have just meant changing the default behavior, but the AP Stylebook example I link to shows that at least one style does expect "IPhone" at the beginning of sentences. I don't particularly care about that, though, and would be happy just to change the default behavior.

> I think we should also not apply casing rules to the first word of the title if it is all lowercase.

You mean if the first word is lowercase but there's other capitalization in the title?

> Finally, can we consider a more compact syntax for casing controls than the current span class="nocase" HTML-like tags? Perhaps LaTeX-style {}?

I don't see much point in moving away from the HTML subset at this point, and mixing in a LaTeX convention seems a little clumsy. The HTML is at least unambiguous for parsing — I don't think we want to start just stripping { and } in titles.

This is of course separate from how it needs to be displayed in an editing interface like Zotero, where we've always planned to support visual rich-text editing. (For case protection, we'd need some sort of other visual indicator, but that's still unrelated to the underlying markup passed to the processor.)

bwiernik commented 2 years ago

Ah, yes, I just meant clarifying.

For the package names, I meant if the first word is all lowercase. I don't think there needs to be other capitalization in the title for this to apply. E.g., "dplyr: a grammar of data manipulation" should be unchanged in Vancouver sentence case, become "dplyr: A grammar of data manipulation" in APA sentence case, and "dplyr: A Grammar of Data Manipulation" in title case.
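
The title-case variant of the dplyr example can be sketched in Rust as follows. This is an illustration under stated assumptions, not how citeproc-rs implements title case: the function names are hypothetical, the stop-word list is a tiny stand-in for real title-casing rules, words are split on single spaces, and words with internal capitalization are not handled.

```rust
// Assumption: a deliberately short stop-word list, for illustration only.
const STOP_WORDS: [&str; 5] = ["a", "an", "of", "the", "and"];

/// Uppercase the first character of a word, leaving the rest as-is.
fn capitalize(word: &str) -> String {
    let mut chars = word.chars();
    match chars.next() {
        Some(first) => first.to_uppercase().chain(chars).collect(),
        None => String::new(),
    }
}

/// Simplified title case that leaves the first word untouched when it
/// contains no uppercase letters (e.g. an R package name).
fn title_case_keep_lowercase_first(title: &str) -> String {
    let mut after_colon = false;
    title
        .split(' ')
        .enumerate()
        .map(|(i, word)| {
            // The word right after a colon starts the subtitle.
            let starts_subtitle = after_colon;
            after_colon = word.ends_with(':');
            if i == 0 && !word.chars().any(|c| c.is_uppercase()) {
                // All-lowercase first word: skip casing rules entirely.
                word.to_string()
            } else if starts_subtitle || !STOP_WORDS.contains(&word) {
                capitalize(word)
            } else {
                word.to_string()
            }
        })
        .collect::<Vec<_>>()
        .join(" ")
}
```

With these assumptions, `title_case_keep_lowercase_first("dplyr: a grammar of data manipulation")` gives "dplyr: A Grammar of Data Manipulation". The sentence-case variants would differ only in which later words get capitalized; the first-word check stays the same.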

cormacrelf commented 2 years ago

Can't you solve that with the Big Title Split of 2022 for software item types? Main title no casing, subtitle with casing.

bwiernik commented 2 years ago

This is something that should apply to all styles.