Closed ihoru closed 7 years ago
Maybe your database was not created as a UTF-8 one? See https://docs.weblate.org/en/latest/admin/install.html#creating-database-in-mysql
It's UTF-8 as you can see.
MariaDB [weblate]> show create database weblate;
+----------+------------------------------------------------------------------+
| Database | Create Database |
+----------+------------------------------------------------------------------+
| weblate | CREATE DATABASE `weblate` /*!40100 DEFAULT CHARACTER SET utf8 */ |
+----------+------------------------------------------------------------------+
1 row in set (0.00 sec)
MariaDB [weblate]> SELECT TABLE_NAME, TABLE_COLLATION FROM information_schema.tables WHERE table_schema = DATABASE();
+--------------------------------------+-----------------+
| TABLE_NAME | TABLE_COLLATION |
+--------------------------------------+-----------------+
| accounts_autogroup | utf8_general_ci |
| accounts_profile | utf8_general_ci |
| accounts_profile_languages | utf8_unicode_ci |
| accounts_profile_secondary_languages | utf8_unicode_ci |
| accounts_profile_subscriptions | utf8_unicode_ci |
| accounts_verifiedemail | utf8_general_ci |
| auth_group | utf8_general_ci |
| auth_group_permissions | utf8_unicode_ci |
| auth_permission | utf8_general_ci |
| auth_user | utf8_general_ci |
| auth_user_groups | utf8_unicode_ci |
| auth_user_user_permissions | utf8_unicode_ci |
| authtoken_token | utf8_general_ci |
| django_admin_log | utf8_general_ci |
| django_content_type | utf8_general_ci |
| django_migrations | utf8_general_ci |
| django_session | utf8_general_ci |
| django_site | utf8_general_ci |
| lang_language | utf8_general_ci |
| social_auth_association | utf8_general_ci |
| social_auth_code | utf8_general_ci |
| social_auth_nonce | utf8_general_ci |
| social_auth_usersocialauth | utf8_general_ci |
| trans_advertisement | utf8_general_ci |
| trans_change | utf8_general_ci |
| trans_check | utf8_general_ci |
| trans_comment | utf8_general_ci |
| trans_componentlist | utf8_general_ci |
| trans_componentlist_components | utf8_general_ci |
| trans_dictionary | utf8_general_ci |
| trans_groupacl | utf8_general_ci |
| trans_groupacl_groups | utf8_general_ci |
| trans_indexupdate | utf8_unicode_ci |
| trans_project | utf8_general_ci |
| trans_project_owners | utf8_unicode_ci |
| trans_source | utf8_general_ci |
| trans_subproject | utf8_general_ci |
| trans_suggestion | utf8_general_ci |
| trans_translation | utf8_general_ci |
| trans_unit | utf8_general_ci |
| trans_vote | utf8_unicode_ci |
| trans_whiteboardmessage | utf8_general_ci |
+--------------------------------------+-----------------+
42 rows in set (0.00 sec)
@nijel any further questions?
Looking at http://stackoverflow.com/questions/3715865/unicodeencodeerror-ascii-codec-cant-encode-character, it might be caused by the configured locales. Can you try setting UTF-8 ones before starting Weblate?
export LANG='en_US.UTF-8'
export LC_ALL='en_US.UTF-8'
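The exported variables only help if the process that actually runs Weblate inherits them. A minimal sketch for checking what a Python process sees; the function name `encoding_report` is made up for illustration, and it would need to be run in the same environment as the server process to be meaningful:

```python
import locale
import os
import sys


def encoding_report():
    """Collect the encoding-related settings visible to this Python process."""
    return {
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
        # Encoding derived from the locale settings, without changing them.
        "preferred_encoding": locale.getpreferredencoding(False),
        # Codec used for implicit str/unicode conversions.
        "default_encoding": sys.getdefaultencoding(),
        "filesystem_encoding": sys.getfilesystemencoding(),
    }


if __name__ == "__main__":
    for key, value in sorted(encoding_report().items()):
        print("{}: {}".format(key, value))
```

If `LANG` and `LC_ALL` come back as `None` here while they are set in the interactive shell, the server process is probably not inheriting the shell environment.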
I've tried every fix suggested on that page (and others on the Internet), but the script still fails. :(
$ env | egrep 'LANG|LC_'
LC_ALL=en_US.UTF-8
LANG=en_US.UTF-8
LC_LANG=en_US.UTF-8
I added
charset utf-8;
to nginx.conf, and also set:
env = LANG=en_US.UTF-8
env = LC_ALL=en_US.UTF-8
env = LC_LANG=en_US.UTF-8
Then I restarted everything and tested it through the web interface.
Maybe the python-mysqldb version you are using is problematic?
* Weblate weblate-2.6
* Python 2.7.11+
* Django 1.9.5
* six 1.10.0
* python-social-auth 0.2.18
* Translate Toolkit 1.14.0-rc1
* Whoosh 2.7.4
* Git 2.7.4
* Pillow (PIL) 1.1.7
* dateutil 2.5.3
* lxml 3.6.0
* django-crispy-forms 1.6.0
* compressor 1.6
* djangorestframework 3.3.3
* pytz 2016.4
* pyuca N/A
* pyLibravatar N/A
* Mercurial 3.7.3
* Database backends: django.db.backends.mysql
$ dpkg -s python-mysqldb | grep Version
Version: 1.3.7-1build2
This character is the cause of my issue:
https://github.com/nijel/weblate/blob/e525ea0409b943c1cdcd1e23653bc441fdef74a8/weblate/trans/util.py#L57
PLURAL_SEPARATOR = '\x1e\x1e'
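For context, a separator like this is typically used to pack several plural forms into one stored string. The sketch below is only an illustration of that idea: the helper names `join_plural` and `split_plural` are assumptions and are not taken from the linked file, and the code is written for Python 3 even though the thread is about Python 2.7.

```python
PLURAL_SEPARATOR = '\x1e\x1e'


def join_plural(forms):
    """Pack a list of plural forms into a single string."""
    return PLURAL_SEPARATOR.join(forms)


def split_plural(text):
    """Unpack a stored string back into its plural forms."""
    return text.split(PLURAL_SEPARATOR)


if __name__ == "__main__":
    stored = join_plural(["%d file", "%d files"])
    print(repr(stored))
    print(split_plural(stored))
```

Under this scheme, any translation with more than one plural form contains the separator, so it would be among the first strings to expose an encoding problem on save.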
The problem is that the database driver thinks it needs to use ASCII. These might be the first characters where you hit the problem, but it's going to cause problems with any non-ASCII translation.
There are many other UTF-8 characters as well (the ellipsis, for example), and those work fine. How can I make the database driver use UTF-8 instead of ASCII? It looks like this issue appeared when I moved the code to another server (new: Ubuntu 16.04, old: Ubuntu 15) and imported the data into MariaDB instead of MySQL, which I used previously.
Best regards, Igor Polyakov.
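On the question of making the driver use UTF-8: with Django's MySQL backend, the connection character set can be stated explicitly through `OPTIONS`. The fragment below is a sketch, not a confirmed fix for this report. `NAME` and `ENGINE` come from the thread; `USER`, `PASSWORD` and `HOST` are placeholders, and the `charset` value is an assumption chosen to match the database's `DEFAULT CHARACTER SET utf8`.

```python
# settings.py fragment (sketch). USER, PASSWORD and HOST are placeholders.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "weblate",
        "USER": "weblate",
        "PASSWORD": "change-me",
        "HOST": "127.0.0.1",
        "OPTIONS": {
            # Passed through to the MySQL driver when connecting.
            # Assumption: use the character set your tables actually have.
            "charset": "utf8",
        },
    }
}
```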
Maybe there is some issue in the MySQL library you used on the old server, but this is a valid Unicode character which should not cause any problems (see http://unicode.org/cldr/utility/character.jsp?a=001E).
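The claim about U+001E can be checked directly. This small sketch (the function name `separator_facts` is made up for illustration; Python 3 assumed) shows that the character is a control character inside the ASCII range, so it encodes to the same single byte under both the ASCII and UTF-8 codecs:

```python
import unicodedata

SEPARATOR_CHAR = "\x1e"


def separator_facts():
    """Describe the separator character used in PLURAL_SEPARATOR."""
    return {
        "code_point": "U+{:04X}".format(ord(SEPARATOR_CHAR)),
        # "Cc" is the Unicode general category for control characters.
        "category": unicodedata.category(SEPARATOR_CHAR),
        "is_ascii": ord(SEPARATOR_CHAR) < 128,
        "ascii_bytes": SEPARATOR_CHAR.encode("ascii"),
        "utf8_bytes": SEPARATOR_CHAR.encode("utf-8"),
    }


if __name__ == "__main__":
    for key, value in sorted(separator_facts().items()):
        print("{}: {!r}".format(key, value))
```

Since the separator itself is ASCII-encodable, an ASCII codec error on such a string most likely comes from other characters in it or from how the data is handled further down, not from U+001E alone.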
In the end it turned out to be a MySQL Unicode issue; it's now documented at https://docs.weblate.org/en/latest/admin/install.html#unicode-issues-in-mysql
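The linked documentation is not reproduced in this thread. One well-known MySQL limitation that matches the description is that its `utf8` character set stores at most three bytes per character, while `utf8mb4` covers all of Unicode. Assuming that is the relevant limitation, the following sketch (Python 3; `four_byte_chars` is a hypothetical helper) finds characters in a string that would not fit into three-byte `utf8`:

```python
def four_byte_chars(text):
    """Return the characters that need 4 bytes in UTF-8 (above U+FFFF)."""
    return [ch for ch in text if len(ch.encode("utf-8")) == 4]


if __name__ == "__main__":
    samples = ["plain ASCII", "ellipsis \u2026", "emoji \U0001F600"]
    for sample in samples:
        print(repr(sample), four_byte_chars(sample))
```

Note that neither the ellipsis nor U+001E falls into this group, so this check covers only one class of MySQL Unicode problems and does not by itself explain the failure reported here.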
Steps to reproduce
Actual behaviour
Process stops.
Expected behaviour
New language should be added.
Server configuration