yourlabs / django-cities-light

A simple app providing three models: Country, Region and City model. Also provided, a command to insert or update data from geonames database dumps. Status: stable.
http://django-cities-light.rtfd.org/
MIT License
336 stars 126 forks source link

Should the slug for City, Region and Country be unique? #51

Closed unxavi closed 10 years ago

unxavi commented 10 years ago

I am not sure if i am confused, but i understand that the base model define the field "slug" with the intention of being unique. But for example for cities at least it is not, a search for "valencia" as the slug returns 4 results.

When doing a detail view searching a city by the slug field, returns an error, because it finds 4 objects and not one.

django-cities-light

So, shouldnt the "slug" field be define as unique?

Thanks very much by the way, think i will be using this, so i hope to help around

jpic commented 10 years ago

I don't see how we could define the city slug as unique.

unxavi commented 10 years ago

It might be a stupid question but at least i will know the why.

In yours models module, your "Base" Class, the slug field, shouldnt it be define as unique=True?

I mean this line:

slug = autoslug.AutoSlugField(populate_from='name_ascii')

Could it be?

slug = autoslug.AutoSlugField(populate_from='name_ascii', unique=True)
jpic commented 10 years ago

The problem is that many cities have the same name, and sometimes even the same country like springfield (USA).

unxavi commented 10 years ago

If they have the same name, like cities do, isn't that a reason to work with unique slugs?

thats why i ask about define a unique slug for the countries, cities, and regions.... so if a city has the same name as others and same country, etc... you can still reach them by a different slug in the url and not by the pk.

I tested the change i mentioned, so I got unique slugs.

Example:

unique slug cities light

If you want, I can make the change and ask for a pull request. it takes like 1 more minute on the import all because of the time checking unique slugs.

I dont know if i make myself clear about why the unique slugs.

In the springfield case the result is getting different slug for every city, while you can still keep the name and name_ascii as "Springfield"

springfield

let me know what you think about this change, i dont see the point of having 11 cities with the same slug "springfield" for all of them.

Anyways thanks

jpic commented 10 years ago

Well the point is to have urls of /country-slug/region-slug/city-slug/ like:

/united-states/texas/springfield/ /united-states/virginia/springfield/

I think it would break urls if we changed that to append random numbers at the end of city slug names, so I'm not going to pull it.

Also, I recommend that you prefix your city urls with region and country slugs.

unxavi commented 10 years ago

Hey,

Cool, thanks for your answer, I see your point with the URLs

This approach if I am not wrong would not include the edge case where you have two cities with the same name in the region.

I have load the 5000 cities file and do some search with the ORM and I found some of this cases:

This is Country, Region and the count and the slugs that repeat:

Canada Quebec, Canada [{'count': 2, 'slug': u'pont-rouge'}]
Germany Bavaria, Germany [{'count': 2, 'slug': u'pocking'}]
Greece Attica, Greece [{'count': 2, 'slug': u'marousi'}]
India Andhra Pradesh, India [{'count': 2, 'slug': u'ramapuram'}]
India Madhya Pradesh, India [{'count': 2, 'slug': u'katangi'}]
India Maharashtra, India [{'count': 2, 'slug': u'jalgaon'}]
India Uttar Pradesh, India [{'count': 2, 'slug': u'tanda'}]
Mexico Yucatán, Mexico [{'count': 2, 'slug': u'panaba'}]
Pakistan Punjab, Pakistan [{'count': 2, 'slug': u'sahiwal'}]
Philippines Autonomous Region in Muslim Mindanao, Philippines [{'count': 2, 'slug': u'pagalungan'}]

Using this code:

for country in countries:
    for region in country.region_set.all():
        dupas = region.city_set.values('slug').annotate(count=Count('id')).order_by().filter(count__gt=1)
        if len(dupas) > 0:
            print country, region, dupas

The thing is, if you use the URL as you indicate, if someone enters the detail to a city where in the region there are two, your django view will generate an error, because it will try to do a get on a field where there are two results.

A search in the admin show this:

duplicated

¿How do you handle this cases?

jpic commented 10 years ago

I'm going to have a look but the first thing that comes to my mind is "that's weird" because City.Meta.unique_together has ('region', 'name')​.

unxavi commented 10 years ago

Hi,

I have been reviewing the code and looked at that too… I am not django or db expert or anything… but I think that: City.Meta.unique_together has ('region', 'name').

Can hold together, because if I look at the cases found (with the 5000 cities file) I see the actual names are non ascii characters different.

For example:

First case:

Country Canada, Region Quebec, Canada.

2 cities found with the slug “pont-rouge”

One with the actual name "Pont Rouge" and the second one "Pont-Roug".. I don’t know if this is normal, so the have different name, the unique together hold fine, but they both produce same slug, which break the view with the url:

/canada/quebec/pont-rougue

The others cases about the same:

Germany Bavaria

Slug: pocking

One city with the name “Pocking” and the other “Pocking”

Attica, Greece

Slug: marousi One city with the name “Marousi” and the other “Maroúsi”

Andhra Pradesh, India

Slug: ramapuram

One city with the name “Ramāpuram” and the other “Rāmāpuram”

And like this the rest of the duplicates.

Thanks very much for all your comments on the subject.

unxavi commented 10 years ago

Hi again,

Also I am trying to use the http://download.geonames.org/export/dump/cities1000.zip file for the import... but i keep getting this error:

File "manage.py", line 10, in <module>
    execute_from_command_line(sys.argv)
  File "/home/vagrant/project-env/lib/python2.7/site-packages/django/core/management/__init__.py", line 399, in execute_from_command_line
    utility.execute()
  File "/home/vagrant/project-env/lib/python2.7/site-packages/django/core/management/__init__.py", line 392, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/home/vagrant/project-env/lib/python2.7/site-packages/django/core/management/base.py", line 242, in run_from_argv
    self.execute(*args, **options.__dict__)
  File "/home/vagrant/project-env/lib/python2.7/site-packages/django/core/management/base.py", line 285, in execute
    output = self.handle(*args, **options)
  File "/home/vagrant/project-env/lib/python2.7/site-packages/django/db/transaction.py", line 399, in inner
    return func(*args, **kwargs)
  File "/home/vagrant/project-env/lib/python2.7/site-packages/cities_light/management/commands/cities_light.py", line 162, in handle
    self.city_import(items)
  File "/home/vagrant/project-env/lib/python2.7/site-packages/cities_light/management/commands/cities_light.py", line 321, in city_import
    city = City.objects.get(**kwargs)
  File "/home/vagrant/project-env/lib/python2.7/site-packages/django/db/models/manager.py", line 151, in get
    return self.get_queryset().get(*args, **kwargs)
  File "/home/vagrant/project-env/lib/python2.7/site-packages/django/db/models/query.py", line 310, in get
    (self.model._meta.object_name, num))
cities_light.models.MultipleObjectsReturned: get() returned more than one City -- it returned 2!

What do you think? may be a violation of foreing key or something like this?

Thanks again mate!

unxavi commented 10 years ago

James,

Sabes de alguna manera de evitar estos casos aislados?

Gracias

jpic commented 10 years ago

Well you can always skip the data you want using signals.

I'm currently fixing the bug.