Closed bwiernik closed 2 years ago
Is this for import, or for the right-click option? I see now that I have different implementations for those, I'm going to unify them, but it'd help to know what your baseline is.
BTW thank you for including a debug log.
I was thinking of the right-click option.
It seems that BBT sentence case is keeping single-character words capitalized following #1780. You might want to keep "vitamin A" unchanged. (Issue #1742 might also be related.)
Example:
insight: A Unified Interface to Access Information from Model Objects in R
I guess "A" is capitalized here following the APA Style title case:
In title case, capitalize the following words in a title or heading:
- the first word of the title or heading, even if it is a minor word such as “The” or “A”
So perhaps BBT sentence case could check for such cases. However, keeping "A" capitalized might also be ok. See the example at the end of this APA Style guide from September 2019.
The APA style guide is irrelevant; Zotero applies APA sentence case as needed. An initial A at the beginning of the subtitle is the most common case, much more common than vitamin A, so making A an exception to the single-letter uppercase rule is best.
:robot: this is your friendly neighborhood build bot announcing test build 6.4.3.2432 ("sometimes simpler is better")
Install in Zotero by downloading test build 6.4.3.2432, opening the Zotero "Tools" menu, selecting "Add-ons", open the gear menu in the top right, and select "Install Add-on From File...".
All all-caps words are now being lowercased, rather than just "A".
For example, "Type B Personality: A Meta-Analysis" becomes "Type b personality: a meta-analysis" rather than "Type B personality: a meta-analysis".
And "Structured Interviewing for OCB: Construct Validity, Faking, and the Effects of Question Type" becomes "Structured interviewing for ocb: construct validity, faking, and the effects of question type" rather than "Structured interviewing for OCB: construct validity, faking, and the effects of question type".
Two sample items: SJ6CUYRJ-refs-euc
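The expected behaviour described above can be sketched as follows. This is a minimal illustration only, not BBT's actual sentence caser: the function name `naiveSentenceCase` and the space-based tokenization are assumptions made for the example. It keeps the first word and all-caps words as they are, lowercases a standalone "A" that is not the first word, and lowercases everything else.

```javascript
// Hypothetical sketch, not BBT's implementation.
function naiveSentenceCase(title) {
  return title
    .split(" ")
    .map((word, i) => {
      // A standalone "A" that is not the first word is treated as the article.
      if (word === "A" && i > 0) return "a";
      // Strip punctuation and digits to inspect only the letters of the word.
      const letters = word.replace(/[^A-Za-z]/g, "");
      // Words whose letters are all uppercase (B, OCB, FFT) are preserved.
      if (letters.length > 0 && letters === letters.toUpperCase()) return word;
      // The first word keeps its capitalization.
      if (i === 0) return word;
      return word.toLowerCase();
    })
    .join(" ");
}

console.log(naiveSentenceCase("Type B Personality: A Meta-Analysis"));
// Type B personality: a meta-analysis
```

Because every standalone "A" is lowercased, this sketch would also turn "Vitamin A Deficiency" into "Vitamin a deficiency", which is the trade-off raised earlier in the thread.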
I guess the change suggested here is to replace ": A" with ": a". All single-character A's in the middle of a sentence could be kept unchanged. So add something like
title = title.replace(/: A /g, `: a `);
No, it really should be any single A. In nearly all cases, an uppercase A is an error
:robot: this is your friendly neighborhood build bot announcing test build 6.4.3.2435 ("new cases for sentence-caser")
Install in Zotero by downloading test build 6.4.3.2435, opening the Zotero "Tools" menu, selecting "Add-ons", open the gear menu in the top right, and select "Install Add-on From File...".
No, it really should be any single A. In nearly all cases, an uppercase A is an error
There's no title case rule listed here that capitalizes an article "a" in the middle of a sentence. If you're dealing with erroneous title casing to start with, you could apply Zotero's sentence case function that is more aggressive.
I'm OK with this sentence casing:
Type B Personality: A Meta-Analysis
Type B personality: a meta-analysis
But I would then also expect:
Type A Personality: A Meta-Analysis
Type A personality: a meta-analysis
These A's should be preserved in my opinion:
Period after a sentence. A new sentence.
Vitamins A, B, and C.
Hepatitis A and hepatitis B vaccines.
Treatment of hepatitis A. Treatment of hepatitis B.
U S A / U. S. A. / U.S.A. / N.A.S.A. / A.M.
Well, "A.M." could be turned into "a.m.", but then you'd also end up with "u.s.a.".
Acronyms that are a repetition of Capital-Period get special treatment in my sentence caser.
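One way such "Capital-Period" acronyms could be recognized is with a single regular expression. The sketch below is an assumption for illustration, not the code BBT uses; the names `DOTTED_ACRONYM` and `isDottedAcronym` are hypothetical.

```javascript
// Hypothetical helper, not BBT's actual code.
// Matches tokens made of two or more "capital letter + period" pairs,
// such as "U.S.A.", "N.A.S.A." or "A.M.".
const DOTTED_ACRONYM = /^(?:[A-Z]\.){2,}$/;

function isDottedAcronym(token) {
  return DOTTED_ACRONYM.test(token);
}

console.log(isDottedAcronym("U.S.A.")); // true
console.log(isDottedAcronym("A."));     // false
```

Requiring at least two pairs means a lone "A." at the end of a sentence is not treated as an acronym. The spaced form "U. S. A." would not match token by token and would need separate handling.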
Acronyms no longer seem to be handled correctly.
As mentioned in https://github.com/retorquere/zotero-better-bibtex/issues/2123#issuecomment-1104871248, the original title is: Is FFT Fast Enough for Beyond 5g Communications?
Expected: Is FFT fast enough for beyond 5g communications?
What BBT provides: Is fft fast enough for beyond 5g communications?
The acronym FFT is not preserved by the sentence caser.
72XTXMTJ-apse
What BBT provides:
Is fft fast enough for beyond 5g communications?
Not on build 2435
Installed build 2435 provides what I expected: Is FFT fast enough for beyond 5g communications?
But build 2447 still goes wrong: Is fft fast enough for beyond 5g communications?
2447 debug id: 5B4THDCR-apse
Build 2447 was not built on this issue. Separate issues have separate builds, so 2447 does not contain the code present in 2435.
Got it. I'll use build 2435 for now. Thanks!
:robot: this is your friendly neighborhood build bot announcing test build 6.5.1.2450 ("Merge branch 'master' into gh-2078")
Install in Zotero by downloading test build 6.5.1.2450, opening the Zotero "Tools" menu, selecting "Add-ons", open the gear menu in the top right, and select "Install Add-on From File...".
cool. build 2450 would also do the job.
As soon as we have a consensus here I will release a new version; these builds will auto-update to the formal release.
looks good to me
I tested 2450. This is a title before and after using BBT's sentence case function:
Q&A: Vitamins A, B, and C. Hepatitis A and hepatitis B vaccines. Treatment of hepatitis A. Treatment of hepatitis B.
Q&a: vitamins a, B, and c. hepatitis a and hepatitis B vaccines. Treatment of hepatitis a. treatment of hepatitis b.
I'd expect the title to remain unchanged.
I don't really have a strong opinion one way or the other, so I'd prefer it if you guys could come to a consensus. I do not want to introduce a new configuration preference for this though, and given the naiveté of the sentence caser, it is always on the user to inspect & correct the results. I'm not doing any kind of NLP in the sentence-casing.
The test qqobb gave is fine. Matching ": A" and "—A" and returning ": a" and "—a" to lowercase initial A in subtitles but leaving others is fine.
Do you have a sample title for the latter case?
insight—A Unified Interface to Access Information from Model Objects in R
So
Q&A: A Vitamin A, B, and C Study. Hepatitis A and Hepatitis B Vaccines. Treatment of Hepatitis A. Treatment of Hepatitis B.
becomes
Q&A: a vitamin A, B, and C study. Hepatitis A and hepatitis B vaccines. Treatment of hepatitis A. Treatment of hepatitis B.
Shouldn't em-dashes have spacing around them? "insight—A" looks like a two-part word.
Oh wait, em-dashes are not hyphens. Got it.
So
insight—A Unified Interface to Access Information from Model Objects in R
becomes
insight—a unified interface to access information from model objects in R
Yes
alright, the updated sentencecaser will be in the next release.
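The rule agreed on above can be sketched as a single replacement. This is an illustration of the idea under the assumption that it runs as a separate step; the function name `lowercaseSubtitleA` is hypothetical and this is not the code that shipped in BBT.

```javascript
// Hypothetical helper illustrating the agreed rule; not BBT's code.
// Lowercases a standalone "A" that opens a subtitle, i.e. one that follows
// ": " or an em-dash (U+2014) and is itself followed by a space.
function lowercaseSubtitleA(title) {
  return title.replace(/(: |\u2014)A /g, "$1a ");
}

console.log(lowercaseSubtitleA("Type A personality: A meta-analysis"));
// Type A personality: a meta-analysis
```

An "A" in any other position, as in "Vitamins A, B, and C." or "Treatment of hepatitis A.", does not match the pattern and is left unchanged.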
Crossroads, Directions and A New Critical Race Theory
now sentence-cases to
Crossroads, directions and A new critical race theory
acceptable damage?
Are there other A's than vitamins and hepatitides?
acceptable damage?
I'd say yes. It's the consequence of an erroneous title case. The title page (inside) of that book shows "Crossroads, Directions and a New Critical Race Theory". You can check this in Google books or Amazon.
Are there other A's than vitamins and hepatitides?
https://pubmed.ncbi.nlm.nih.gov/22628224/ Cholinergic-associated loss of hnRNP-A/B in Alzheimer's disease impairs cortical splicing and cognitive function in mice
https://pubmed.ncbi.nlm.nih.gov/3348967/ Clinical features and course of type A and type B vitiligo
https://pubmed.ncbi.nlm.nih.gov/3779524/ Evaluation of the reversed passive latex agglutination (RPLA) test kits for detection of staphylococcal enterotoxins A, B, C, and D in foods
https://psycnet.apa.org/record/1981-30747-001 Type A behavior, hostility, and coronary atherosclerosis.
https://www.nature.com/articles/ncomms15963 The A-B transition in superfluid helium-3 under confinement in a thin slab geometry
https://dl.acm.org/doi/abs/10.1145/3097983.3097992 Peeking at A/B Tests: Why it matters, and what to do about it
Alright, then the current state of things is going into the release.
Matching ": A" and "—A" and returning ": a" and "—a" to lowercase initial A in subtitles but leaving others is fine
Add a final whitespace, so ": A " becomes ": a " and "—A " becomes "—a ".
That's already in.
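To see why the trailing whitespace matters, compare the two patterns on a subtitle that merely starts with the letter A. The title and the names `looseRule` and `strictRule` are made up for this illustration.

```javascript
// Illustrative patterns only; the title below is a made-up example.
const looseRule = (title) => title.replace(/: A/g, ": a");
const strictRule = (title) => title.replace(/: A /g, ": a ");

const title = "Field Notes: African Elephants";
console.log(looseRule(title));  // Field Notes: african Elephants
console.log(strictRule(title)); // Field Notes: African Elephants
```

Without the trailing space the pattern also hits any subtitle word beginning with "A"; with it, only the standalone article is affected.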
Support log: I9TH4EA5-refs-euc
BBT Sentence Case doesn't change the case of single uppercase letters like "C" or "R". However, "A" is usually the English article rather than a proper noun. So, could "A" also be lowercased when the function is run?
Example:
insight: A Unified Interface to Access Information from Model Objects in R
Expected result:
insight: a unified interface to access information from model objects in R