Closed furkansimsekli closed 1 year ago
Maybe another library, PyICU, can be used for multilingual sorting. Here is an overview of how the Unicode Collation Algorithm, which PyICU uses, works: http://unicode.org/reports/tr10/
I tested #20 and explained what went wrong there, so the built-in locale module isn't an option anymore.
I also tested PyICU on my local machine, and it was very easy to implement. However, I struggled to build PyICU on Heroku. I've tried many solutions, including this and this.
If anyone has built PyICU on Heroku before, I'd appreciate the help!
What about doing some text normalization as an easy workaround? e.g.
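# Map Turkish-specific letters to their closest ASCII counterparts.
NORMAL_MAP: dict[int, int] = str.maketrans("ÇçĞğıİÖöŞşÜü", "CcGgiIOoSsUu")

def normalize(inp: str) -> str:
    # Return a copy of inp that is safe to use as a plain sort key.
    return inp.translate(NORMAL_MAP)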
You could pass the names as a tuple (normalized, real)
and use the real name afterwards.
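A minimal sketch of that (normalized, real) idea, assuming a hypothetical helper named sort_names and made-up department names. It only approximates Turkish order: "ç" is treated like "c" instead of sorting after it, and ties are broken by the real name.

```python
# Translation table from Turkish-specific letters to ASCII look-alikes.
NORMAL_MAP = str.maketrans("ÇçĞğıİÖöŞşÜü", "CcGgiIOoSsUu")


def normalize(inp: str) -> str:
    return inp.translate(NORMAL_MAP)


def sort_names(names: list[str]) -> list[str]:
    # Decorate each name as (normalized, real), sort the pairs,
    # then keep only the real name for display.
    pairs = sorted((normalize(name), name) for name in names)
    return [real for _, real in pairs]


# Hypothetical sample data, not taken from the bot.
departments = ["Şehir Planlama", "Fizik", "İktisat", "Çevre Mühendisliği", "Kimya"]
print(sort_names(departments))
```

The same effect can be had with sorted(names, key=normalize); the tuple form just makes the tie-breaking explicit.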
If anyone has built PyICU on Heroku before, I'd appreciate the help!
Would you like help with hosting the bot in a proper VPS? I can lend you some space in mine if you want.
@div72 Actually, I was looking for a permanent solution. The title says "Turkish .." but it's a problem for other languages too. If we use that approach, we need to tweak it whenever a new language is introduced. I also saw a nice solution on Stack Overflow.
To be honest, it's silly to expect tens of languages for this app. So, using a temporary solution might not be a bad idea after all :)
Would you like help with hosting the bot in a proper VPS? I can lend you some space in mine if you want.
Thanks for the VPS offer. However, since all my Telegram bots (there are four of them) live on Heroku, I'd want to keep them together. Working with different environments can sometimes be exhausting. If I change my mind, you are the first person I'll bother :)
Another possible solution might be to include the alphabet of each language in its translation file. There we can store the correct letter order for each language, and then a custom sort key can handle the rest of the job.
Wrong issue number in the commit. It should have been #9 instead of #19.
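A rough sketch of how such a per-language alphabet could drive a sort key. The names TURKISH_ALPHABET, turkish_lower and make_sort_key are hypothetical, and the alphabet string is assumed to come from the translation file. Characters outside the alphabet (spaces, digits, foreign letters) are placed before all known letters here, which is only one possible choice.

```python
from typing import Callable

# Assumed to be loaded from the Turkish translation file.
TURKISH_ALPHABET = "abcçdefgğhıijklmnoöprsştuüvyz"


def turkish_lower(text: str) -> str:
    # str.lower() maps "I" to "i" and "İ" to "i" plus a combining dot,
    # so the two dotted/dotless pairs are handled by hand first.
    return text.replace("I", "ı").replace("İ", "i").lower()


def make_sort_key(
    alphabet: str, lower: Callable[[str], str] = str.lower
) -> Callable[[str], list[tuple[int, int]]]:
    rank = {ch: i for i, ch in enumerate(alphabet)}

    def key(text: str) -> list[tuple[int, int]]:
        # Known letters sort by their position in the alphabet;
        # anything else sorts before them, by code point.
        return [
            (1, rank[ch]) if ch in rank else (0, ord(ch))
            for ch in lower(text)
        ]

    return key


turkish_key = make_sort_key(TURKISH_ALPHABET, turkish_lower)
# Hypothetical sample data, not taken from the bot.
names = ["Zooloji", "İktisat", "Isı Bilimi", "Çevre", "Coğrafya", "Şehir", "Sosyoloji", "Fizik"]
print(sorted(names, key=turkish_key))
```

Adding a language would then mean adding its alphabet string (and, where needed, a lowercasing rule) rather than touching the sorting code.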
Turkish characters like "İ, Ç, Ş" break the sort() function. As a result, department names starting with them end up at the end of the department list, where they don't belong. We need to implement a new sorting function. It could be done with sorted(list, key=...).
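For illustration, here is a small reproduction of the behaviour, assuming made-up department names. Python compares strings by Unicode code point, so letters outside ASCII land after "Z".

```python
# Hypothetical sample data, not taken from the bot.
departments = ["Zooloji", "Çevre", "Fizik", "İktisat", "Şehir"]

# Default ordering is by code point: F (70) < Z (90) < Ç (199) < İ (304) < Ş (350).
default_sorted = sorted(departments)
print(default_sorted)
# ['Fizik', 'Zooloji', 'Çevre', 'İktisat', 'Şehir']
```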