ppannuto / python-titlecase

Python library to capitalize strings as specified by the New York Times Manual of Style
MIT License

Translates unicode chars poorly #79

Closed: bernd-wechner closed this issue 3 years ago

bernd-wechner commented 3 years ago

I am using Greek letters and I find titlecase is destroying them:

Specifically: 'ß' becomes 'SS'

I have looked at the source but am stuck for now; it's the CAPFIRST regex doing this. If I can work it out, I'll report back. For now I'm parking an issue. I may just work around it ;-)
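For reference, the 'ß' to 'SS' mapping can be reproduced without titlecase at all. Python's built-in `str.upper()` applies Unicode full case mapping, under which the uppercase of 'ß' (U+00DF, Latin small letter sharp s) is the two-character string 'SS'. The sketch below uses only the standard library. It suggests, but does not prove, that an upper-casing call applied to the regex match is involved rather than the pattern itself. The variable names are illustrative and not taken from the titlecase source.

```python
# Standard library only; titlecase itself is not imported here.
eszett = "ß"  # U+00DF LATIN SMALL LETTER SHARP S

# str.upper() uses Unicode full case mapping, so one character becomes two.
upper_eszett = eszett.upper()
upper_word = "straße".upper()

# An actual Greek letter, for comparison: U+03B2 GREEK SMALL LETTER BETA.
upper_beta = "β".upper()

print(repr(upper_eszett), repr(upper_word), repr(upper_beta))
```

Note that the Greek small beta upper-cases to a single Greek capital beta, so the length change is specific to characters such as 'ß' whose uppercase form expands.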

bernd-wechner commented 3 years ago

Working around it with a callback.
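A workaround along these lines might look like the sketch below. It is not the code actually used in this thread, which was not posted. It assumes the callback hook described in the project README, where the function is called with each word (plus keyword arguments) and returns either a replacement string or None to fall back to the default handling. The name `keep_eszett` is hypothetical.

```python
def keep_eszett(word, **kwargs):
    """Hypothetical callback: capitalize words containing 'ß' by hand."""
    if "ß" not in word:
        # Returning None is assumed to hand the word back to titlecase.
        return None
    if word[0] == "ß":
        # Upper-casing 'ß' would yield 'SS', so leave the word untouched.
        return word
    # Upper-case only the first character; the rest, including 'ß', is kept.
    return word[0].upper() + word[1:]


# Assumed usage, with the titlecase package installed:
#   from titlecase import titlecase
#   titlecase("die straße", callback=keep_eszett)
```

One trade-off of this approach is that any word the callback returns bypasses the library's other rules, such as small-word handling, so it only makes sense for text where that is acceptable.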

ppannuto commented 3 years ago

Yeah, unfortunately this is a known issue. It's listed in the limitations in the top-level README. Open to PRs with fixes, but it's not something I have a solution to or plan for at the moment.

bernd-wechner commented 3 years ago

As time permits I will consider a PR. I'd leave the issue open though if I were you, to promote just that. Closed issues are easily forgotten, and this one isn't closed per se, so much as pending ... a PR to fix it ;-). I took a quick look at the code just now and decided it'll cost me some time to look deeper, of course. It does indeed define a string of REs that are best tested with some spare time. I'd first need to diagnose which one, and where, is pulling the 'ß' to 'SS' switch, and then look into improvements.