Closed bernardko closed 7 years ago
I started by fixing the import in management/commands/cities.py.
from ...conf import (city_types, district_types, import_opts, import_opts_all,
HookException, settings, ALTERNATIVE_NAME_TYPES,
CONTINENT_DATA, CURRENCY_SYMBOLS, IGNORE_EMPTY_REGIONS,
INCLUDE_AIRPORT_CODES, NO_LONGER_EXISTENT_COUNTRY_CODES,
VALIDATE_POSTAL_CODES)
However this leads to another issue:
Importing postal codes: 10%|██████▌ | 120968/1228958 [00:01<15:31:50, 19.82it/s]duplicate key value violates unique constraint "cities_postalcode_slug_0af4e984_uniq"
DETAIL: Key (slug)=(none) already exists.
Traceback (most recent call last):
File "manage.py", line 12, in <module>
execute_from_command_line(sys.argv)
File "/Users/christophe/.virtualenvs/awoopa-app/lib/python2.7/site-packages/django/core/management/__init__.py", line 367, in execute_from_command_line
utility.execute()
File "/Users/christophe/.virtualenvs/awoopa-app/lib/python2.7/site-packages/django/core/management/__init__.py", line 359, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/Users/christophe/.virtualenvs/awoopa-app/lib/python2.7/site-packages/django/core/management/base.py", line 294, in run_from_argv
self.execute(*args, **cmd_options)
File "/Users/christophe/.virtualenvs/awoopa-app/lib/python2.7/site-packages/django/core/management/base.py", line 345, in execute
output = self.handle(*args, **options)
File "/Users/christophe/.virtualenvs/awoopa-app/lib/python2.7/site-packages/django/utils/decorators.py", line 185, in inner
return func(*args, **kwargs)
File "/Users/christophe/.virtualenvs/awoopa-app/lib/python2.7/site-packages/cities/management/commands/cities.py", line 147, in handle
func()
File "/Users/christophe/.virtualenvs/awoopa-app/lib/python2.7/site-packages/cities/management/commands/cities.py", line 806, in import_postal_code
float(item['latitude'])))
File "/Users/christophe/.virtualenvs/awoopa-app/lib/python2.7/site-packages/django/db/models/manager.py", line 85, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "/Users/christophe/.virtualenvs/awoopa-app/lib/python2.7/site-packages/django/db/models/query.py", line 379, in get
num = len(clone)
File "/Users/christophe/.virtualenvs/awoopa-app/lib/python2.7/site-packages/django/db/models/query.py", line 238, in __len__
self._fetch_all()
File "/Users/christophe/.virtualenvs/awoopa-app/lib/python2.7/site-packages/django/db/models/query.py", line 1087, in _fetch_all
self._result_cache = list(self.iterator())
File "/Users/christophe/.virtualenvs/awoopa-app/lib/python2.7/site-packages/django/db/models/query.py", line 54, in __iter__
results = compiler.execute_sql()
File "/Users/christophe/.virtualenvs/awoopa-app/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 835, in execute_sql
cursor.execute(sql, params)
File "/Users/christophe/.virtualenvs/awoopa-app/lib/python2.7/site-packages/django/db/backends/utils.py", line 79, in execute
return super(CursorDebugWrapper, self).execute(sql, params)
File "/Users/christophe/.virtualenvs/awoopa-app/lib/python2.7/site-packages/django/db/backends/utils.py", line 59, in execute
self.db.validate_no_broken_transaction()
File "/Users/christophe/.virtualenvs/awoopa-app/lib/python2.7/site-packages/django/db/backends/base/base.py", line 428, in validate_no_broken_transaction
"An error occurred in the current transaction. You can't "
django.db.transaction.TransactionManagementError: An error occurred in the current transaction. You can't execute queries until the end of the 'atomic' block.
Don't have deep enough understanding of Django to tackle the bug
Fixed VALIDATE_POSTAL_CODES import issue.
Now same issue:
django.db.utils.IntegrityError: duplicate key value violates unique constraint "cities_district_slug_d08982a7_uniq" DETAIL: Key (slug)=(none) already exists.
unique=True on slug field will not work also with cities because same names across differents countries/regions.
@spout did you try removing the unique=True
from SlugModel ?
Or combine it by replacing self.id with self.code https://github.com/coderholic/django-cities/blob/master/cities/models.py#L279 (comment in #146)
@chris-y-meyers removed unique=True from SlugModel, it works.
@spout it worked for you ?
Have the procedure I advised you to follow implemented, still have a Key (slug)=(04114) already exists
error.
Did you change anything else ?
@chris-y-meyers: all was successfully imported. NB: I've installed django-cities directly from master branch.
This issue as well as #146 fixed in https://github.com/chris-y-meyers/django-cities.
Is there a point to slugs if they aren't unique?
@spout Slugs must be unique because their point is to be used as human-readable path part of URLs. If these generated slugs are not unique with the data I've picked then we need to pick different data to slugify for cities.
@blag Yes removing the unique=True
from SlugModel is obviously not a long term fix.
As brought up in #144, same postal codes can exist.
We need to think how to restructure the models. django-cities-light could help for inspiration. What I appreciate in this other derived project is that you can specify for which countries you want to limit data imports too.
To say I will be further looking into this issue.
Restructuring our models is definitely outside the scope of this PR, but I'm happy to review PRs for it. The problem with doing so is the import script needs to support all of the different model structures, and every restructure we do complicates the import script.
In terms of model structure, there's no large differences between this and django-cities-light. They solve some of the same problems in different ways (swapping out models for custom models), and I like their contrib
module. That's actually another thing I'm working on.
I'm working on #147 to properly fix the unicode issue. And while the tests are passing, the import script is not (yet).
Slugs most definitely don't need to be unique. Slugs for locations collide frequently, and even more so for postalcodes. That just means you need to add more information to the URL, such as the country slug too, to differentiate.
For example, I'd much prefer to have something like:
/us/newcastle /gb/newcastle
rather than
/newcastle-1 /newcastle-2
That makes sense - how should we handle AlternativeName
s?
Subregion slugs are not unique within their respective regions:
django.db.utils.IntegrityError: duplicate key value violates unique constraint "cities_subregion_region_id_e7611eb7_uniq"
DETAIL: Key (region_id, slug)=(2239858, kiwaba-nzoji) already exists.
where cities_subregion_region_id_e7611eb7_uniq
is:
ALTER TABLE public.cities_subregion
ADD CONSTRAINT cities_subregion_region_id_e7611eb7_uniq UNIQUE(region_id, slug);
Both of these subregions are in the same country, the same region, and have the same name:
http://www.geonames.org/2240688/kiwaba-nzoji.html http://www.geonames.org/10402309/kiwaba-nzoji.html
This means that both of them will have the same slug.
@coderholic Guidance please.
Yes, you can't add any unique constraints here at all. If you do want a completely unique identifier then you need to use the ID. When you want to use a slug you'll need to use application logic to do what you want to do (eg. either pick one based on largest population in the very rare case of a duplicate, or use the id and the slug in the url - this is pretty common, NNNNN-slug - or whatever makes sense in your application)
All, I have added slugs to all models in PR #147 (as well as fixing this issue, adding more tests, and fixing the unicode encoding issue).
It is possible for users to define their own global slugify function that takes the object and the default slug data:
def custom_slugify(obj, slug_data):
...
return modified_slug_data
that will allow users to specify their own slugify semantics. A default_slugify
function is defined cities/util.py
and is used by default.
Users can specify their own slugify function by setting the CITIES_SLUGIFY_FUNCTION
variable in their project's settings.py
.
(Edit: @all definitely doesn't work, so I'm at-mentioning you all by name to get your attention)
@bernardko, @chris-y-meyers, @spout: Please test the more-tests
branch out (in PR #147) and let me know if I broke anything.
If #147 looks good to all of you I'll merge it into master
and push a release to PyPI.
Thanks for the work @blag
I appreciate that setting the slug is now customizable. Unfortunately re-running import of data, after clean migration does not work:
django.db.utils.IntegrityError: duplicate key value violates unique constraint "cities_subregion_region_id_c3a3cdb7_uniq"
DETAIL: Key (region_id, name)=(2239858, Kiwaba Nzoji) already exists.
@chris-y-meyers Thanks for the feedback. I've had issues with those two slugs in the past; I fixed their slugify logic and added them to the test data so we don't break them in the future.
I've also removed the unique attributes on the slugs in migration 0009
, and added a migration (0010
) to remove that attribute only if it exists, so new and current users should still be able to upgrade cleanly (I'm actually kind of proud of that code, check it out here if you're interested).
If you don't mind testing these new changes again I would greatly appreciate it. You may need to:
git fetch --all
git checkout more-tests
git reset --hard origin/more-tests
since I modified previous commits.
Thank you for all of your help! 😄
Thank you @blag for your work !
So I retried, indeed had to merge migrations 0009 and 0010. However I now get:
../cities/management/commands/cities.py", line 509, in import_city
region=defaults['region'])
KeyError: 'region'
Checked the commands/cities.py
and seems like you don't set defaults['region']
in case of error.
FYI, last city added was Belmont.
And the regions_not_found.json
file looks like:
{
"GE": {
"00": [
"Goris Munitsip\u2019alit\u2019et\u2019i"
]
},
"TJ": {
"00": [
"Oktyabr\u2019skiy Rayon",
"Muminobodskiy Rayon",
"Gozimalikskiy Rayon"
]
},
"CD": {
"03": [
"Oicha",
"Makala District",
"Rutshuru Territory"
], "01": ["Sud Ubangi"],
"09": [
"Boma (city)"
],
"05": [
"Kolwezi City",
"Boma (city)",
"Tshikapa City",
"Tshangu District",
"Tshangu District",
"Tshangu District",
"Tshangu District",
"Tshangu District",
"Makala District",
"Rutshuru Territory",
"Rutshuru Territory"
]
}
}
@chris-y-meyers I think I fixed that issue. Can you test again?
The regions_not_found.json
file isn't relevant for debugging the import command. The data from Geonames is "dirty data", meaning it's input by human volunteers at the Geonames website. Some of the subregions imported from that data don't properly reference their region, so I added code to dump those problematic subregions (and not found regions) to regions_not_found.json
to help people fix the Geonames data.
Hi @blag,
Thanks for fixing the issue. Effectively import process progresses further, but now is an issue with duplicate post code entries:
django.db.utils.IntegrityError: duplicate key value violates unique constraint "cities_postalcode_country_id_6b5f8822_uniq"
DETAIL: Key (country_id, region_name, subregion_name, district_name, name)=(2510769, Andalucia, Almería, , Almeria) already exists.
@chris-y-meyers I fixed that, and did a bunch more work, mostly involving tests. I think everything should be working now, and we now have a healthy amount of tests that all pass. Can you test everything once more for me? I'm almost positive this will be the last time.
@blag can confirm that import process works right till end ! Congratulations and thanks again !
Finally! I'll merge it to master and push it to PyPI. Thank you for all your help! 😍
Closing - pushed to PyPI as version 0.5. Thank you all for your help!
running the following command on command line produces the following error:
Seems to be missing an import on the cities management command in the current master branch.