Closed: Yom86 closed this issue 7 years ago
I have also tried using:
import sys
encoding = 'utf-8'
reload(sys)
sys.setdefaultencoding(encoding)
but that gives another error:
Traceback (most recent call last):
File "manage.py", line 10, in
Could you try replacing '%a, %d %b %Y %H:%M:%S %Z' with b'%a, %d %b %Y %H:%M:%S %Z'?
Might be fixed by #77
Hi jpic, thank you for your app, it is great!
However, I am encountering a UnicodeDecodeError when trying to populate my database. Here is my config:
Ubuntu 14.04 Trusty Tahr
Python 2.7.6
Django 1.7.4
django-cities-light 3.0.4
PostgreSQL 9.4.0
Traceback (most recent call last):one: 10%|### |
File "./manage-prod.py", line 10, in <module>
execute_from_command_line(sys.argv)
File "/webapps/genida/local/lib/python2.7/site-packages/django/core/management/__init__.py", line 385, in execute_from_command_line
utility.execute()
File "/webapps/genida/local/lib/python2.7/site-packages/django/core/management/__init__.py", line 377, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/webapps/genida/local/lib/python2.7/site-packages/django/core/management/base.py", line 288, in run_from_argv
self.execute(*args, **options.__dict__)
File "/webapps/genida/local/lib/python2.7/site-packages/django/core/management/base.py", line 338, in execute
output = self.handle(*args, **options)
File "/webapps/genida/local/lib/python2.7/site-packages/cities_light/management/commands/cities_light.py", line 163, in handle
self.region_import(items)
File "/webapps/genida/local/lib/python2.7/site-packages/cities_light/management/commands/cities_light.py", line 292, in region_import
self.save(region)
File "/webapps/genida/local/lib/python2.7/site-packages/cities_light/management/commands/cities_light.py", line 483, in save
self.logger.warning('Saving %s failed: %s' % (model, e))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 27: ordinal not in range(128)
It always happens at about 10% of the second progress bar, which after a little searching seems to correspond to this city: Malinovka, in Belarus.
I think the problem is that this city does not have a Region attached to it (like other cities in the Minsk region of Belarus).
In my development environment, using the exact same config except for PostgreSQL -> SQLite3, I get a warning message at the same 10%: No handlers could be found for logger "cities_light", but the import continues without any error or traceback. Malinovka and the others are simply imported without a Region key.
I already tried removing site.py, but it changed nothing.
Do you have any idea how I can make this work on my production server? Can I filter out the cities that do not have regions? Can I force the download/import to continue even if errors are encountered?
Regards, TM
Maybe they are attached to the duplicate of Minsk, did you get that too:
Saving Minsk, Belarus failed: duplicate key value violates unique constraint "cities_light_region_country_id_name_key" DETAIL: Key (country_id, name)=(36, Minsk) already exists.
In which case, we might be able to fix it by mapping duplicate ids to existing ids, so that cities which are attached to a duplicate region are attached to the initial region instead.
We can see the problem in admin1CodesASCII.txt:
BY.05 Minsk Minsk 625142
BY.04 Minsk Minsk 625143
What do you think?
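The mapping idea above could be sketched roughly as follows. This is only an illustration, not the project's code: `build_region_aliases` is a hypothetical helper, and it assumes the admin1 file is tab-separated with the columns code, name, ASCII name, geoname id, as in the two Minsk rows quoted above.

```python
# Two rows copied from the thread, with tabs assumed as the separator.
ADMIN1_SAMPLE = (
    "BY.05\tMinsk\tMinsk\t625142\n"
    "BY.04\tMinsk\tMinsk\t625143\n"
)


def build_region_aliases(lines):
    """Map every admin1 code to the first code seen for (country, name).

    Hypothetical helper: a later row that repeats a (country, name) pair
    is treated as a duplicate of the first one.
    """
    first_code = {}
    aliases = {}
    for line in lines:
        if not line.strip():
            continue
        code, name, _ascii_name, _geoname_id = line.rstrip("\n").split("\t")
        country = code.split(".", 1)[0]
        # setdefault keeps the first code and returns it for duplicates.
        aliases[code] = first_code.setdefault((country, name), code)
    return aliases


aliases = build_region_aliases(ADMIN1_SAMPLE.splitlines())
print(aliases)
```

A city row that points at BY.04 could then be looked up through `aliases` and attached to the region that was saved for BY.05.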
Which Python version and CITIES_LIGHT_CITY_SOURCES?
I just got a complete import with Python 2.7.9 on PostgreSQL 9.4.0 and cities5000.zip on Arch Linux:
./manage.py cities_light --settings=test_project.settings_postgres
System check identified some issues:
WARNINGS:
?: (1_6.W001) Some project unittests may not execute as expected.
HINT: Django 1.6 introduced a new default test runner. It looks like this project was generated using Django 1.5 or earlier. You should ensure your tests are all running & behaving as expected. See https://docs.djangoproject.com/en/dev/releases/1.6/#new-test-runner for more information.
Assuming local download is up to date for http://download.geonames.org/export/dump/countryInfo.txt
Forced import of countryInfo.txt because data do not seem to have installed successfuly yet, note that this is equivalent to --force-import-all.
Importing countryInfo.txt
RAM used: 32268 kB Time: 0:00:01 Done: 100%|##########################################################################################################|
Assuming local download is up to date for http://download.geonames.org/export/dump/admin1CodesASCII.txt
Forced import of admin1CodesASCII.txt because data do not seem to have installed successfuly yet, note that this is equivalent to --force-import-all.
Importing admin1CodesASCII.txt
Saving Minsk, Belarus failed: duplicate key value violates unique constraint "cities_light_region_country_id_name_key" DETAIL: Key (country_id, name)=(36, Minsk) already exists.
Saving Vientiane, Laos failed: duplicate key value violates unique constraint "cities_light_region_country_id_name_key" DETAIL: Key (country_id, name)=(127, Vientiane) already exists.
Saving Daugavpils, Latvia failed: duplicate key value violates unique constraint "cities_light_region_country_id_name_key" DETAIL: Key (country_id, name)=(136, Daugavpils) already exists.
Saving Ash Sharqiyah, Oman failed: duplicate key value violates unique constraint "cities_light_region_country_id_slug_key" DETAIL: Key (country_id, slug)=(173, ash-sharqiyah) already exists.
Saving Al Batinah, Oman failed: duplicate key value violates unique constraint "cities_light_region_country_id_slug_key" DETAIL: Key (country_id, slug)=(173, al-batinah) already exists.
Saving Lima, Peru failed: duplicate key value violates unique constraint "cities_light_region_country_id_name_key" DETAIL: Key (country_id, name)=(175, Lima) already exists.
RAM used: 32496 kB Time: 0:00:26 Done: 100%|######################|
Downloading http://download.geonames.org/export/dump/cities5000.zip into /home/jpic/env/src/cities-light/cities_light/data/cities5000.zip
Extracting cities5000.txt from /home/jpic/env/src/cities-light/cities_light/data/cities5000.zip into /home/jpic/env/src/cities-light/cities_light/data/cities5000.txt
Forced import of cities5000.zip because data do not seem to have installed successfuly yet, note that this is equivalent to --force-import-all.
Importing cities5000.zip
Saving Pont-Rouge, Quebec, Canada failed: duplicate key value violates unique constraint "cities_light_city_region_id_slug_key" DETAIL: Key (region_id, slug)=(455, pont-rouge) already exists.
Saving Pöcking, Bavaria, Germany failed: duplicate key value violates unique constraint "cities_light_city_region_id_slug_key" DETAIL: Key (region_id, slug)=(712, pocking) already exists.
Saving Tāndā, Uttar Pradesh, India failed: duplicate key value violates unique constraint "cities_light_city_region_id_slug_key" DETAIL: Key (region_id, slug)=(1284, tanda) already exists.
Saving Ramāpuram, Andhra Pradesh, India failed: duplicate key value violates unique constraint "cities_light_city_region_id_slug_key" DETAIL: Key (region_id, slug)=(1314, ramapuram) already exists.
Saving Katāngi, Madhya Pradesh, India failed: duplicate key value violates unique constraint "cities_light_city_region_id_slug_key" DETAIL: Key (region_id, slug)=(1298, katangi) already exists.
Saving Jalgaon, Maharashtra, India failed: duplicate key value violates unique constraint "cities_light_city_region_id_slug_key" DETAIL: Key (region_id, slug)=(1297, jalgaon) already exists.
Skipping because of invalid region: [u'3946820', u'Barranca', u'Barranca', u'Barranca', u'-10.75', u'-77.76667', u'P', u'PPLA3', u'PE', u'', u'15', u'1502', u'150201', u'', u'46290', u'', u'52', u'America/Lima', u'2012-07-19']
Saving Pagalungan, Autonomous Region in Muslim Mindanao, Philippines failed: duplicate key value violates unique constraint "cities_light_city_region_id_slug_key" DETAIL: Key (region_id, slug)=(2540, pagalungan) already exists.
Saving Sahiwal, Punjab, Pakistan failed: duplicate key value violates unique constraint "cities_light_city_region_id_slug_key" DETAIL: Key (region_id, slug)=(2559, sahiwal) already exists.
/home/jpic/env/lib/python2.7/site-packages/autoslug/utils.py:30: RuntimeWarning: Argument <type 'str'> is not an unicode object. Passing an encoded string will likely have unexpected results.
return django_slugify(unidecode(value))
RAM used: 39024 kB Time: 0:03:24 Done: 100%|##########################################################################################################|
Downloading http://download.geonames.org/export/dump/alternateNames.zip into /home/jpic/env/src/cities-light/cities_light/data/alternateNames.zip
Extracting alternateNames.txt from /home/jpic/env/src/cities-light/cities_light/data/alternateNames.zip into /home/jpic/env/src/cities-light/cities_light/data/alternateNames.txt
Forced import of alternateNames.zip because data do not seem to have installed successfuly yet, note that this is equivalent to --force-import-all.
Importing alternateNames.zip
RAM used: 234228 kB Time: 0:13:15 Done: 100%|#########################################################################################################|
Importing parsed translation in the database
RAM used: 234228 kB Time: 0:00:51 Done: 100%|#########################################################################################################|
Python 2.7.6 and default settings.
Yes I read your Travis build and saw these 'duplicate key' violations. The thing is I do not have a logger handler, because I do not know how to set one, so these messages are not displayed, but I think this is it.
I will try to set a logger handler (with LOGGING setting, am I right?), to see if I actually get the same violation messages. I could also try to update my Python version to 2.7.9 like yours. And the idea to replace duplicated region IDs with initial ones when saving cities might be great, I will try to see if I can help with this :smiley:
Thank you for your very quick answer time :smile:
On Fri, Feb 6, 2015 at 1:03 PM, Timothée Mazzucotelli < notifications@github.com> wrote:
(Python 2.7.6) Yes I read your Travis build and saw these 'duplicate key' violations. The thing is I do not have a logger handler, because I do not know how to set one, so these messages are not displayed, but I think this is it.
That's part of the test_project settings:
LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'filters': {
        'require_debug_false': {
            '()': 'django.utils.log.RequireDebugFalse'
        }
    },
    'handlers': {
        'mail_admins': {
            'level': 'ERROR',
            'filters': ['require_debug_false'],
            'class': 'django.utils.log.AdminEmailHandler'
        },
        'console': {
            'level': 'DEBUG',
            'class': 'logging.StreamHandler',
        },
    },
    'loggers': {
        'django.request': {
            'handlers': ['console'],
            'propagate': True,
            'level': 'DEBUG',
        },
        'cities_light': {
            'handlers': ['console'],
            'propagate': True,
            'level': 'DEBUG',
        },
    }
}
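To see what the `cities_light` entry does outside of Django, here is a minimal standalone sketch. It keeps only the console handler from the settings above and applies the dict with `logging.config.dictConfig`, which is what Django uses for the LOGGING setting by default. The message text is only an example.

```python
import logging
import logging.config

# Trimmed copy of the settings above: console handler only, no Django classes.
LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'handlers': {
        'console': {
            'level': 'DEBUG',
            'class': 'logging.StreamHandler',
        },
    },
    'loggers': {
        'cities_light': {
            'handlers': ['console'],
            'propagate': True,
            'level': 'DEBUG',
        },
    },
}

logging.config.dictConfig(LOGGING)
logger = logging.getLogger('cities_light')

# With a handler attached, warnings reach stderr instead of being dropped
# with "No handlers could be found for logger".
logger.warning('Saving %s failed: %s', 'Minsk, Belarus', 'duplicate key')
```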
I will try to set a logger handler (with LOGGING setting, am I right?), to see if I actually get the same violation messages. I could also try to update my Python version to 2.7.9 like yours.
Worth trying !
And the idea to replace duplicated region IDs with initial ones when saving cities might be great, I will try to see if I can help with this
Ok then, I'll wait for your pull request, but let's try to isolate the exact problem first ;)
Hi again.
I set up my LOGGING setting, and I get the same messages as you in my dev environment, but nothing changed on the production server:
./manage-prod.py cities_light --force-all
Downloading http://download.geonames.org/export/dump/countryInfo.txt into /webapps/genida/lib/python2.7/site-packages/cities_light/data/countryInfo.txt
Forced import of countryInfo.txt because data do not seem to have installed successfuly yet, note that this is equivalent to --force-import-all.
Importing countryInfo.txt
RAM used: 51684 kB Time: 0:00:01 Done: 100%|##########################################################################################################|
Downloading http://download.geonames.org/export/dump/admin1CodesASCII.txt into /webapps/genida/lib/python2.7/site-packages/cities_light/data/admin1CodesASCII.txt
Forced import of admin1CodesASCII.txt because data do not seem to have installed successfuly yet, note that this is equivalent to --force-import-all.
Importing admin1CodesASCII.txt
Traceback (most recent call last):one: 10%|########### |
File "./manage-prod.py", line 10, in <module>
execute_from_command_line(sys.argv)
File "/webapps/genida/local/lib/python2.7/site-packages/django/core/management/__init__.py", line 385, in execute_from_command_line
utility.execute()
File "/webapps/genida/local/lib/python2.7/site-packages/django/core/management/__init__.py", line 377, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/webapps/genida/local/lib/python2.7/site-packages/django/core/management/base.py", line 288, in run_from_argv
self.execute(*args, **options.__dict__)
File "/webapps/genida/local/lib/python2.7/site-packages/django/core/management/base.py", line 338, in execute
output = self.handle(*args, **options)
File "/webapps/genida/local/lib/python2.7/site-packages/cities_light/management/commands/cities_light.py", line 163, in handle
self.region_import(items)
File "/webapps/genida/local/lib/python2.7/site-packages/cities_light/management/commands/cities_light.py", line 292, in region_import
self.save(region)
File "/webapps/genida/local/lib/python2.7/site-packages/cities_light/management/commands/cities_light.py", line 483, in save
self.logger.warning('Saving %s failed: %s' % (model, e))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 27: ordinal not in range(128)
Next step: trying Python 2.7.9
On Fri, Feb 6, 2015 at 3:53 PM, Timothée Mazzucotelli < notifications@github.com> wrote:
self.logger.warning('Saving %s failed: %s' % (model, e))
Try replacing it with
self.logger.warning(u'Saving %s failed: %s' % (model, e))
We might have to do that in a bunch of places though ^^
Hello!
I added u'' for each logger string in my fork. But it does not work :cry:
File "/webapps/genida/local/lib/python2.7/site-packages/cities_light/management/commands/cities_light.py", line 483, in save
self.logger.warning(u'Saving %s failed: %s' % (model, e))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 27: ordinal not in range(128)
It is driving me crazy :hammer:
(I checked that my DB is using UTF-8 encoding, and it is.)
I also noticed that we have both self.logger.warning and self.logger.warn. I replaced warn with warning, but nothing changed.
I just tried to populate an SQLite3 DB on my production server:
Saving Minsk, Belarus failed: UNIQUE constraint failed: cities_light_region.country_id, cities_light_region.slug
It tells us that the problem actually comes from PostgreSQL, right? Using SQLite3 just raises an exception and displays a message, but with PostgreSQL the message is malformed in some way, which provokes a UnicodeDecodeError...
Further investigation is underway :hamster:
I was able to populate correctly by not displaying the exception: self.logger.warning(u'Saving %s failed' % model)
but it is only a temporary solution
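A less drastic workaround than dropping the exception text might be to convert both values to text defensively before formatting. The sketch below is an assumption, not the fix that was eventually merged: `safe_text` and `describe_failure` are hypothetical names, and the code is written for Python 3, so under Python 2.7 (where `str(value)` can itself return bytes) it would need adapting.

```python
def safe_text(value, encoding='utf-8'):
    """Return text for logging without raising a decode error.

    Hypothetical helper: bytes are decoded explicitly, and undecodable
    bytes are replaced instead of raising UnicodeDecodeError.
    """
    if isinstance(value, bytes):
        return value.decode(encoding, 'replace')
    return str(value)


def describe_failure(model, error):
    """Build the warning message from already-decoded pieces."""
    return 'Saving %s failed: %s' % (safe_text(model), safe_text(error))


detail = b'Key (region_id, name)=(61365, Shams\xc4\x81b\xc4\x81d) already exists.'
print(describe_failure('Malinovka, Belarus', detail))
```

The point of the design is that decoding happens once, explicitly, with a known encoding, rather than implicitly with the interpreter's default.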
It tells us that the problem actually comes from PostgreSQL, right?
It could be that SQLite fails before PostgreSQL already.
You could also try decode('utf-8') on the strings.
I suggest inspecting it with ipdb.
First, pip install ipdb, and then add import ipdb; ipdb.set_trace() just before the point where it fails, maybe wrapping the problematic line in a try block so that ipdb is only triggered when it fails.
Then you can inspect what model and e are at that point.
try:
    self.logger.warning('Saving %s failed: %s' % (model, e))
except:
    import ipdb; ipdb.set_trace()
Provisional solution: In venv/lib/python2.7/site.py, change the value of encoding variable: from encoding = "ascii" to encoding = "utf-8"
If you had to do that then there's another problem in your env.
I think I've also got this problem while trying to fix #110. Exception looks similar:
File "/Users/user/play/bf/django-cities-light/cities_light/management/commands/cities_light.py", line 457, in city_import
force_update=force_update
File "/Users/user/play/bf/django-cities-light/cities_light/management/commands/cities_light.py", line 574, in save
self.logger.warning('Saving %s failed: %s' % (model, e.message))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc4 in position 130: ordinal not in range(128)
The reason seems to be the underlying IntegrityError exception, which in turn produces a UnicodeDecodeError when logged:
IntegrityError('duplicate key value violates unique constraint "cities_light_city_region_id_name_key"\nDETAIL: Key (region_id, name)=(61365, Shams\xc4\x81b\xc4\x81d) already exists.\n',)
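The failure can be illustrated with the message bytes quoted above. This is a sketch for Python 3, where the ASCII decode that Python 2 performs implicitly when mixing byte strings and unicode has to be written out explicitly. `decode_as` is a hypothetical helper used only to capture the outcome.

```python
# Byte string copied from the IntegrityError message quoted above.
message = (
    b'duplicate key value violates unique constraint '
    b'"cities_light_city_region_id_name_key"\n'
    b'DETAIL: Key (region_id, name)=(61365, Shams\xc4\x81b\xc4\x81d) '
    b'already exists.\n'
)


def decode_as(raw, encoding):
    """Return the decoded text, or the UnicodeDecodeError if decoding fails."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError as exc:
        return exc


ascii_result = decode_as(message, 'ascii')   # fails on the first 0xc4 byte
utf8_result = decode_as(message, 'utf-8')    # succeeds

print(repr(ascii_result))
print(utf8_result)
```

The two bytes `\xc4\x81` are the UTF-8 encoding of a single non-ASCII character, so any ASCII decode of this message stops at the first of them.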
I don't know how to fix this yet, but I plan to.
@Yom86 @Pawamoy @Canicio Guys, could you please try this branch https://github.com/yourlabs/django-cities-light/tree/drop-force-text and see if it fixes the problem for you?
See this Django issue for more details: https://code.djangoproject.com/ticket/20572
Could anyone test this? Works for me, but I'm a bit hesitant to merge the patch.
Should we perhaps set up a basic test fixture with this case and see if we can reproduce this?
We added the test to reproduce this issue (see https://github.com/yourlabs/django-cities-light/tree/drop-force-text branch), but it is not triggered on travis, although I can run it locally:
$ CI=true DB_ENGINE=postgresql_psycopg2 DB_NAME=cities_light_test DB_USER=postgres .tox/py27-django19-postgresql/bin/python test_project/manage.py test cities_light.tests.test_unicode_decode_error
Creating test database for alias 'default'...
Assuming local download is up to date for file:///Users/user/play/bf/django-cities-light/cities_light/tests/fixtures/kemerovo_country.txt
Importing kemerovo_country.txt
RAM used: 50515968 kB ETA: --:--:-- Done: 0%| |Saving Russia
/Users/user/play/bf/django-cities-light/.tox/py27-django19-postgresql/lib/python2.7/site-packages/unidecode/__init__.py:46: RuntimeWarning: Argument <type 'str'> is not an unicode object. Passing an encoded string will likely have unexpected results.
_warn_if_not_unicode(string)
RAM used: 50536448 kB Time: 0:00:00 Done: 100%|################################################################################################################################################################|
Assuming local download is up to date for file:///Users/user/play/bf/django-cities-light/cities_light/tests/fixtures/kemerovo_region.txt
Importing kemerovo_region.txt
RAM used: 50536448 kB ETA: --:--:-- Done: 0%| |Saving Кемерово
RAM used: 50544640 kB ETA: 0:00:00 Done: 50%|################################################################################ |Saving Кемерово
E
======================================================================
ERROR: test_unicode_decode_error (cities_light.tests.test_unicode_decode_error.TestUnicodeDecodeError)
.
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/user/play/bf/django-cities-light/.tox/py27-django19-postgresql/lib/python2.7/site-packages/mock/mock.py", line 1305, in patched
return func(*args, **keywargs)
File "/Users/user/play/bf/django-cities-light/cities_light/tests/test_unicode_decode_error.py", line 28, in test_unicode_decode_error
call_command('cities_light', force_import_all=True)
File "/Users/user/play/bf/django-cities-light/.tox/py27-django19-postgresql/lib/python2.7/site-packages/django/core/management/__init__.py", line 119, in call_command
return command.execute(*args, **defaults)
File "/Users/user/play/bf/django-cities-light/.tox/py27-django19-postgresql/lib/python2.7/site-packages/django/core/management/base.py", line 399, in execute
output = self.handle(*args, **options)
File "/Users/user/play/bf/django-cities-light/cities_light/management/commands/cities_light.py", line 173, in handle
self.region_import(items)
File "/Users/user/play/bf/django-cities-light/cities_light/management/commands/cities_light.py", line 309, in region_import
self.save(region)
File "/Users/user/play/bf/django-cities-light/cities_light/management/commands/cities_light.py", line 522, in save
self.logger.warning('Saving %s failed: %s' % (model, e))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 130: ordinal not in range(128)
----------------------------------------------------------------------
Ran 1 test in 0.239s
FAILED (errors=1)
Destroying test database for alias 'default'...
Running via tox also does not trigger the issue:
tox -e py27-django19-postgresql -- -s cities_light/tests/test_unicode_decode_error.py
It looks like tox always uses the sqlite backend. You can verify this by adding this statement:
--- a/cities_light/tests/test_unicode_decode_error.py
+++ b/cities_light/tests/test_unicode_decode_error.py
@@ -24,4 +24,5 @@ class TestUnicodeDecodeError(test.TransactionTestCase):
     @mock_source('country', 'kemerovo_country')
     def test_unicode_decode_error(self):
         """."""
+        from django.conf import settings; print settings.DATABASES
         call_command('cities_light', force_import_all=True)
cities_light/tests/test_unicode_decode_error.py {'default': {'CONN_MAX_AGE': 0, 'ENGINE': 'django.db.backends.sqlite3', 'PASSWORD': '', 'TEST': {'COLLATION': None, 'MIRROR': None, 'NAME': None, 'CHARSET': None}, 'USER': '', 'ATOMIC_REQUESTS': False, 'AUTOCOMMIT': True, 'NAME': ':memory:', 'PORT': '', 'TIME_ZONE': None, 'OPTIONS': {}, 'HOST': ''}}
@jpic Could you please look into the tox configuration file? I can't understand why it always uses the sqlite backend despite being told to use the postgresql environment.
@jpic James?
@jpic Did you figure out why tox and travis always use sqlite? I'm curious :)
You are absolutely correct, multiline here is not supported: https://github.com/yourlabs/django-cities-light/blob/stable/3.x.x/tox.ini#L65
After fixing this, we can see the test failing as expected when running them with tox.
Pushed a fix on drop-force-text in commit 154f774
It turns out the patch wasn't working very well.
Running the unicode test individually would show the error, even in tox. Running it as part of the suite would not. This fixes it:
--- a/cities_light/tests/test_unicode_decode_error.py
+++ b/cities_light/tests/test_unicode_decode_error.py
@@ -12,7 +12,7 @@ FIXTURE_DIR = os.path.abspath(os.path.join(BASE_DIR, 'tests', 'fixtures'))
 def mock_source(setting, short_name):  # noqa
     return mock.patch(
-        'cities_light.settings.%s_SOURCES' %
+        'cities_light.management.commands.cities_light.%s_SOURCES' %
         setting.upper(), ['file://%s/%s.txt' % (FIXTURE_DIR, short_name)])
https://docs.python.org/3/library/unittest.mock.html#where-to-patch
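The "where to patch" rule behind that diff can be reproduced in isolation. In the sketch below, `demo_settings` and `demo_command` are hypothetical stand-in modules built at runtime (they are not part of cities_light). The second module copies a name with `from ... import`, so only patching the copy changes what its function sees.

```python
import sys
import types
from unittest import mock

# Hypothetical stand-in for a settings module.
settings = types.ModuleType('demo_settings')
settings.SOURCES = ['http://example.invalid/real.txt']
sys.modules['demo_settings'] = settings

# Hypothetical stand-in for a command module that copies the name on import.
command = types.ModuleType('demo_command')
sys.modules['demo_command'] = command
exec(
    "from demo_settings import SOURCES\n"
    "def get_sources():\n"
    "    return SOURCES\n",
    command.__dict__,
)

# Patching the defining module does not touch the copy held by demo_command.
with mock.patch('demo_settings.SOURCES', ['file:///fixture.txt']):
    seen_when_patching_settings = command.get_sources()

# Patching the name where it is looked up is what takes effect.
with mock.patch('demo_command.SOURCES', ['file:///fixture.txt']):
    seen_when_patching_command = command.get_sources()

print(seen_when_patching_settings)
print(seen_when_patching_command)
```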
Should be fixed by #130
Hello,
I'm trying to run cities-light on Python 2.7 on Windows.
I followed all the steps and everything works fine until I run the command "python manage.py cities_light".
I get:
Traceback (most recent call last):
File "manage.py", line 10, in <module>
execute_from_command_line(sys.argv)
File "C:\Python27\lib\site-packages\django\core\management\__init__.py", line 399, in execute_from_command_line
utility.execute()
File "C:\Python27\lib\site-packages\django\core\management\__init__.py", line 392, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "C:\Python27\lib\site-packages\django\core\management\base.py", line 242, in run_from_argv
self.execute(*args, **options.__dict__)
File "C:\Python27\lib\site-packages\django\core\management\base.py", line 285, in execute
output = self.handle(*args, **options)
File "C:\Python27\lib\site-packages\django\db\transaction.py", line 431, in inner
return func(*args, **kwargs)
File "C:\Python27\lib\site-packages\cities_light\management\commands\cities_light.py", line 128, in handle
geonames = Geonames(url, force=force)
File "C:\Python27\lib\site-packages\cities_light\geonames.py", line 30, in __init__
self.downloaded = self.download(url, self.file_path, force)
File "C:\Python27\lib\site-packages\cities_light\geonames.py", line 49, in download
'%a, %d %b %Y %H:%M:%S %Z')
File "C:\Python27\lib\_strptime.py", line 468, in _strptime_time
return _strptime(data_string, format)[0]
File "C:\Python27\lib\_strptime.py", line 309, in _strptime
format_regex = _TimeRE_cache.compile(format)
File "C:\Python27\lib\_strptime.py", line 266, in compile
return re_compile(self.pattern(format), IGNORECASE)
File "C:\Python27\lib\_strptime.py", line 260, in pattern
self[format[directive_index]])
UnicodeDecodeError: 'ascii' codec can't decode byte 0x92 in position 34: ordinal not in range(128)
I read all I could find on this matter and I understood it's probably a problem with Windows. I deleted the file site.py (I read to do so in one of your answers) but then the error indicated that site.py was missing... I've tried all I could think of and all I read but couldn't solve this one!
Please help!
PS: I was first using SQLite3, now I moved to MySQL with utf8_unicode_ci
Thank you for your support!
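The format string in the traceback above, '%a, %d %b %Y %H:%M:%S %Z', matches the date layout used in HTTP headers such as Last-Modified. Assuming that is what geonames.py parses at that point (the thread does not show the surrounding code), one locale-independent way to parse such a value is the standard library's `email.utils.parsedate_to_datetime`, which does not consult the locale's day and month names the way `time.strptime` does. This is a Python 3 sketch with a made-up header value, not the fix from #77.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime

# Example header value (an assumption, not taken from a real response).
last_modified = 'Fri, 06 Feb 2015 13:03:00 GMT'

# Day and month names are matched against fixed English names,
# independent of the system locale.
parsed = parsedate_to_datetime(last_modified)

print(parsed.isoformat())
```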