wildtreetech / sentinel2-bot

🛰📷🌏 Tweeting pictures taken by the Sentinel 2 satellite.
https://twitter.com/sentinel2bot
MIT License

Counting characters #12

Open betatim opened 7 years ago

betatim commented 7 years ago

We need to do better at counting characters. It is all about Unicode!

For example, 'Laâyoune-Sakia El Hamra ⵍⵄⵢⵓⵏ-ⵙⴰⵇⵢⴰ ⵍⵃⴰⵎⵔⴰ العيون-الساقية الحمراء, Western Sahara' is 81 characters (code points), but if you normalize and encode it: len(unicodedata.normalize("NFC", 'Laâyoune-Sakia El Hamra ⵍⵄⵢⵓⵏ-ⵙⴰⵇⵢⴰ ⵍⵃⴰⵎⵔⴰ العيون-الساقية الحمراء, Western Sahara').encode('utf-8')) it comes to 134 "characters", which is really the UTF-8 byte count.
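To make the two numbers easier to compare, here is a minimal sketch that reports both counts for the same string. The helper name `text_lengths` is made up for illustration and is not part of the bot's code. Which of the two counts Twitter actually enforces is an assumption to check against Twitter's own documentation.

```python
import unicodedata


def text_lengths(text: str) -> dict:
    """Return two different 'lengths' of the NFC-normalized text.

    Hypothetical helper for illustration only.
    """
    # NFC composes sequences like 'a' + combining circumflex into a single 'â',
    # so visually identical strings give the same counts.
    nfc = unicodedata.normalize("NFC", text)
    return {
        "code_points": len(nfc),                 # what len() on a str counts
        "utf8_bytes": len(nfc.encode("utf-8")),  # what len() on the encoded bytes counts
    }


place = (
    "Laâyoune-Sakia El Hamra ⵍⵄⵢⵓⵏ-ⵙⴰⵇⵢⴰ ⵍⵃⴰⵎⵔⴰ "
    "العيون-الساقية الحمراء, Western Sahara"
)
print(text_lengths(place))
```

For the place name above this should reproduce the 81 and 134 figures. The gap comes from the Tifinagh letters taking three bytes each in UTF-8 and the Arabic letters and 'â' taking two.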

betatim commented 7 years ago

The hardest part is working out how to shorten things. Often (I think) the non-Latin part is the local spelling of the English version.
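One possible way to act on that hunch, as a rough sketch only: when a name is too long, drop the words written in non-Latin scripts and keep the rest. The function names and the `limit` parameter are hypothetical. The heuristic assumes the Latin-script words carry the same information as the local spelling, which will not always hold.

```python
import unicodedata


def is_latin_word(word: str) -> bool:
    """True if every letter in `word` is a Latin-script letter (hypothetical helper)."""
    letters = [ch for ch in word if ch.isalpha()]
    # Unicode character names start with the script, e.g. 'LATIN SMALL LETTER A'.
    return all(unicodedata.name(ch, "").startswith("LATIN") for ch in letters)


def shorten_place_name(name: str, limit: int) -> str:
    """Drop non-Latin words from an over-long place name.

    Assumption: the non-Latin words repeat the Latin-script name in the
    local spelling. The result may still be longer than `limit`.
    """
    name = unicodedata.normalize("NFC", name)
    if len(name) <= limit:
        return name  # short enough already, keep the local spelling
    parts = []
    for part in name.split(", "):
        latin_words = [w for w in part.split() if is_latin_word(w)]
        # If nothing Latin is left, keep the part rather than lose it entirely.
        parts.append(" ".join(latin_words) if latin_words else part)
    return ", ".join(parts)


place = (
    "Laâyoune-Sakia El Hamra ⵍⵄⵢⵓⵏ-ⵙⴰⵇⵢⴰ ⵍⵃⴰⵎⵔⴰ "
    "العيون-الساقية الحمراء, Western Sahara"
)
print(shorten_place_name(place, limit=60))
```

With a limit of 60 this keeps 'Laâyoune-Sakia El Hamra, Western Sahara' and drops the Tifinagh and Arabic spellings. Names that have no Latin-script form are left untouched, so a further fallback (truncation, for instance) would still be needed for those.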