ppannuto / python-titlecase

Python library to capitalize strings as specified by the New York Times Manual of Style
MIT License

Not preserving all-cap words when hyphenated or single-word titles #77

Closed by dbkinder 3 years ago

dbkinder commented 3 years ago

While "Handling ASCII words" is transformed as expected into "Handling ASCII Words", adding a hyphen breaks this: "Handling non-ASCII words" becomes "Handling Non-Ascii Words" instead of the expected "Handling Non-ASCII Words".

Also, the one-word title "ASCII" is unexpectedly transformed to "Ascii", but a two-word title "ASCII text" is transformed to "ASCII Text" and "text ASCII" is transformed to "Text ASCII" (both as expected).

dbkinder commented 3 years ago

This may be a similar situation to #68

I notice adding a vowel causes this behavior too (and "ASCII" has plenty of vowels :)

>>> titlecase("XXX")
'XXX'
>>> titlecase("XAXX")
'Xaxx'
>>> titlecase("non-XXX")
'Non-XXX'
>>> titlecase("non-XAXX")
'Non-Xaxx'

dbkinder commented 3 years ago

... so I tried your callback function approach, to no avail. The test case from the README didn't work either (UDP didn't get capitalized):

Python 3.6.9 (default, Oct  8 2020, 12:12:24)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from titlecase import titlecase
>>> def abbreviations(word, **kwargs):
...    if word.upper() in ('TCP', 'UDP', 'ASCII', 'CPU'):
...       return word.upper()
...
>>> titlecase("a simple tcp and udp wrapper")
'A Simple TCP and Udp Wrapper'
>>> titlecase("handling ASCII words")
'Handling ASCII Words'
>>> titlecase("handling non-ASCII words")
'Handling Non-Ascii Words'
>>> titlecase("handling ascii words")
'Handling Ascii Words'

dbkinder commented 3 years ago

Looking a bit further, PR #4 has the solution: I needed to call titlecase like so:

titlecase("a simple tcp and udp wrapper", callback=abbreviations)

That worked for me. Sigh, if I had read the README more closely, that was indeed what it said to do, so closing this issue :)
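
For anyone landing here with the same problem, here is a minimal sketch of the working pattern. The callback is the one from the session above; the only change is that it is passed to titlecase through the callback keyword, since defining the function alone has no effect. The guarded import is just an assumption so the snippet still runs where the package is not installed.

```python
# Guarded import: an assumption made so this sketch runs even without the package.
try:
    from titlecase import titlecase
except ImportError:
    titlecase = None


def abbreviations(word, **kwargs):
    """Return the upper-cased word for known abbreviations.

    Returning None (the implicit default) tells titlecase to apply
    its normal rules to this word.
    """
    if word.upper() in ('TCP', 'UDP', 'ASCII', 'CPU'):
        return word.upper()


if titlecase is not None:
    # Defining the callback is not enough; it has to be passed explicitly.
    print(titlecase("a simple tcp and udp wrapper", callback=abbreviations))
    print(titlecase("handling ascii words", callback=abbreviations))
```

The callback can also be checked on its own: abbreviations("udp") returns 'UDP', while a word outside the list returns None. I have not verified how the hyphenated "non-ASCII" case behaves with the callback.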