coderholic / django-cities

Countries and cities of the world for Django projects
MIT License

Cannot slugify() German umlauts (ÄÖÜ) #162

Open sowinski opened 7 years ago

sowinski commented 7 years ago

Steps to reproduce

Download at least German cities. (Having problems with ÖÜÄ)

City.objects.filter(name="Düsseldorf").first().slugify()
Traceback (most recent call last):
  File "", line 1, in
  File "/usr/local/lib/python2.7/dist-packages/cities/models.py", line 216, in slugify
    return '{}-{}'.format(self.id, unicode_func(self.name))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 1: ordinal not in range(128)

Environment: django-cities==0.5.0.3, Django==1.10.4, Python 2.7.6, MySQL 5.5, Ubuntu trusty (32-bit)

I tried to write my own slugify function, but the code crashes before it gets that far, at line 216 of models.py:

return '{}-{}'.format(self.id, unicode_func(self.name))
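
A likely explanation for the failure at this line under Python 2: the template '{}-{}' is a byte string, so formatting a unicode name into it forces an implicit ASCII encode, which fails on "ü". The sketch below reproduces that encode step explicitly in Python 3, where it no longer happens implicitly. The helper `try_ascii` is hypothetical and not part of django-cities.

```python
# Python 3 sketch of the encode step that Python 2 performs implicitly
# when a unicode value is formatted into a byte-string template.
name = "D\u00fcsseldorf"  # "Düsseldorf", with a precomposed "ü" (U+00FC)


def try_ascii(text):
    """Hypothetical helper: return (bytes, None) on success, (None, error) on failure."""
    try:
        return text.encode("ascii"), None
    except UnicodeEncodeError as exc:
        return None, exc


encoded, error = try_ascii(name)
# error.start is the index of the first character that cannot be encoded.
offending = error.object[error.start] if error else None

# With a text (unicode) template, no encoding takes place and formatting succeeds.
slug = "{}-{}".format(2934246, name)
```

Here the encode fails at index 1 (the "ü"), which matches the position reported in the traceback, while the text-template format succeeds.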

Is it possible to define my own get_absolute_url somehow?

blag commented 7 years ago

I'm just trying to figure out the actual issue here, but...do you get this error with Python 3?

So far I've focused on writing tests for the import script. This would be a good candidate for some unit tests.

blag commented 7 years ago

If you can find the exact line in the German cities text file from Geonames I'll fix this issue and add the line to the test data to make sure we don't break it again.

moorchegue commented 6 years ago

I have this same problem on Python 3.6.2, Django 1.11.2. A lot of cities are affected:

In [27]: City.objects.filter(name="Düsseldorf").first().slugify()
Out[27]: '2934246-Düsseldorf'

In [28]: City.objects.filter(name="Montréal").first().slugify()
Out[28]: '2992118-Montréal'

In [29]: City.objects.filter(name="Toruń").first().slugify()
Out[29]: '3083271-Toruń'
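
If ASCII-only slugs are what is wanted here, one possible approach (a minimal sketch, not what django-cities does) is to decompose each name with Unicode NFKD normalization and drop the combining marks. The function name `ascii_slug` is hypothetical.

```python
import unicodedata


def ascii_slug(pk, name):
    """Hypothetical helper: build '<pk>-<name>' with accents stripped from the name."""
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the combining marks, keeping only the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Anything still outside ASCII (letters with no decomposition) is discarded.
    ascii_name = stripped.encode("ascii", "ignore").decode("ascii")
    return "{}-{}".format(pk, ascii_name)


print(ascii_slug(2934246, "Düsseldorf"))  # 2934246-Dusseldorf
print(ascii_slug(2992118, "Montréal"))    # 2992118-Montreal
print(ascii_slug(3083271, "Toruń"))       # 3083271-Torun
```

This sketch silently drops letters that have no decomposition (for example "ß" or "ł") and does not replace spaces, so real city names would need extra handling.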

moorchegue commented 6 years ago

A (probably) related question: how can I find a city by its "normalized" form? In cities5000.txt, the 3rd column always holds a somewhat normalized form of the name. Is it being imported? City.name, as well as the English entries in alt_names, are different.

Düsseldorf specifically becomes "Duesseldorf", which I believe is not considered a correct spelling (?), but even that is helpful. The rest of the city names I ran into so far would work perfectly with the 3rd column values.

Sorry about the necroposting. Should I open another issue for this?

moorchegue commented 6 years ago

Answering my own stupid question: it's name_std. Please disregard the last message.

blag commented 6 years ago

@moorchegue No worries about necroposting or asking your question! 😄 Do you have the issue where Python throws a unicode encoding error as well?

moorchegue commented 6 years ago

Nope. I believe with Python 3 unicode issues are a thing of the past. Am I right? ☺

blag commented 6 years ago

@moorchegue I certainly hope so, and I think that's the case since you aren't having issues with Python 3. I'm just not sure if @sowinski's issue still exists with Python 3 or if it's just a Python 2 problem. Sounds like it's just a Python 2 problem. :)

sowinski commented 6 years ago

Works fine with Python 3.