plone / plone.i18n

Text normalization logic and language, country, cctld data.

Allow the "General punctuation" block of unicode chars in the base norma... #6

Closed tmog closed 10 years ago

tmog commented 11 years ago

...lizer (things like the em dash). Also, do not add the hex value for the chars we do not handle. The en dash is a good example of why this is bad (especially this year): it has a hex value of 2013. ;-)
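To make the problem concrete, here is a minimal sketch. It is not the plone.i18n code: `hex_fallback` and `punctuation_as_hyphen` are hypothetical helpers, and mapping the General Punctuation block (U+2000 to U+206F) to a plain hyphen is an assumption chosen only for illustration.

```python
# Hypothetical helpers for illustration; not plone.i18n's actual API.
GENERAL_PUNCTUATION = range(0x2000, 0x2070)  # U+2000..U+206F


def hex_fallback(text):
    """Replace every non-ASCII character with its hex code point."""
    return "".join(c if ord(c) < 128 else "%x" % ord(c) for c in text)


def punctuation_as_hyphen(text):
    """Map General Punctuation characters to '-' and keep the rest."""
    return "".join("-" if ord(c) in GENERAL_PUNCTUATION else c for c in text)


title = "Report 2012\u20132013"  # year range joined by an en dash (U+2013)

print(hex_fallback(title))           # Report 201220132013
print(punctuation_as_hyphen(title))  # Report 2012-2013
```

With the hex fallback the dash turns into the digits "2013", which are indistinguishable from a real year in the title.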

davisagli commented 11 years ago

Would be nice to have a test for the punctuation handling.

Also, it seems like we should emit something for unrecognized characters. Otherwise, normalizing a title made entirely of unrecognized characters (CJK languages, for example) will always produce an empty result.
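A rough sketch of that concern, under stated assumptions: `normalize_sketch` and its `keep_unknown_as_hex` flag are hypothetical and only stand in for the two behaviors being discussed; the real normalizer handles many more cases.

```python
# Hypothetical stand-in for the two behaviors under discussion.
def normalize_sketch(text, keep_unknown_as_hex):
    out = []
    for c in text:
        if ord(c) < 128 and (c.isalnum() or c in "-_ "):
            out.append(c)  # character we know how to handle
        elif keep_unknown_as_hex:
            out.append("%x" % ord(c))  # fall back to the hex code point
        # otherwise the unrecognized character is silently dropped
    return "".join(out)


title = "\u65e5\u672c\u8a9e"  # a title written only in CJK characters

print(repr(normalize_sketch(title, keep_unknown_as_hex=False)))  # ''
print(repr(normalize_sketch(title, keep_unknown_as_hex=True)))   # '65e5672c8a9e'
```

Dropping unknown characters leaves nothing to build an id from, while the hex fallback at least yields a non-empty, stable string.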

bosim commented 11 years ago

@davisagli You are right. Maybe the existing solution is better.

jensens commented 10 years ago

@bosim As far as I understand, this means we can close this one, so I will. If I'm wrong, please reopen.