WeblateOrg / weblate

Web based localization tool with tight version control integration.
https://weblate.org/
GNU General Public License v3.0

Exception in update_index and rebuild_index: UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 4: ordinal not in range(128) #1166

Closed ihoru closed 8 years ago

ihoru commented 8 years ago

Steps to reproduce

$ ./manage.py update_index
Traceback (most recent call last):
  File "/opt/weblate/weblate/trans/search.py", line 176, in update_index
    update_source_unit_index(writer, unit)
  File "/opt/weblate/weblate/trans/search.py", line 102, in update_source_unit_index
    location=force_text(unit.location),
  File "/usr/local/lib/python3.5/dist-packages/whoosh/writing.py", line 1255, in update_document
    IndexWriter.update_document(self, **fields)
  File "/usr/local/lib/python3.5/dist-packages/whoosh/writing.py", line 490, in update_document
    self.add_document(**fields)
  File "/usr/local/lib/python3.5/dist-packages/whoosh/writing.py", line 1251, in add_document
    self.commit()
  File "/usr/local/lib/python3.5/dist-packages/whoosh/writing.py", line 1229, in commit
    self.writer.commit(**self.commitargs)
  File "/usr/local/lib/python3.5/dist-packages/whoosh/writing.py", line 922, in commit
    finalsegments = self._merge_segments(mergetype, optimize, merge)
  File "/usr/local/lib/python3.5/dist-packages/whoosh/writing.py", line 827, in _merge_segments
    return mergetype(self, self.segments)
  File "/usr/local/lib/python3.5/dist-packages/whoosh/writing.py", line 101, in MERGE_SMALL
    writer.add_reader(reader)
  File "/usr/local/lib/python3.5/dist-packages/whoosh/writing.py", line 710, in add_reader
    self.add_postings_to_pool(reader, basedoc, docmap)
  File "/usr/local/lib/python3.5/dist-packages/whoosh/writing.py", line 647, in add_postings_to_pool
    for item in items:
  File "/usr/local/lib/python3.5/dist-packages/whoosh/writing.py", line 583, in _process_posts
    for fieldname, text, docnum, weight, vbytes in items:
  File "/usr/local/lib/python3.5/dist-packages/whoosh/reading.py", line 427, in iter_postings
    m = self.postings(fieldname, btext)
  File "/usr/local/lib/python3.5/dist-packages/whoosh/reading.py", line 821, in postings
    matcher = FilterMatcher(matcher, deleted, exclude=True)
  File "/usr/local/lib/python3.5/dist-packages/whoosh/matching/wrappers.py", line 277, in __init__
    self._find_next()
  File "/usr/local/lib/python3.5/dist-packages/whoosh/matching/wrappers.py", line 302, in _find_next
    while child.is_active() and child.id() in ids:
  File "/usr/local/lib/python3.5/dist-packages/whoosh/codec/whoosh3.py", line 980, in id
    self._read_ids()
  File "/usr/local/lib/python3.5/dist-packages/whoosh/codec/whoosh3.py", line 1082, in _read_ids
    self._read_data()
  File "/usr/local/lib/python3.5/dist-packages/whoosh/codec/whoosh3.py", line 1077, in _read_data
    self._data = loads(b)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 4: ordinal not in range(128)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "./manage.py", line 31, in <module>
    execute_from_command_line(sys.argv)
  File "/usr/local/lib/python3.5/dist-packages/django/core/management/__init__.py", line 353, in execute_from_command_line
    utility.execute()
  File "/usr/local/lib/python3.5/dist-packages/django/core/management/__init__.py", line 345, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/local/lib/python3.5/dist-packages/django/core/management/base.py", line 348, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/local/lib/python3.5/dist-packages/django/core/management/base.py", line 399, in execute
    output = self.handle(*args, **options)
  File "/opt/weblate/weblate/trans/management/commands/update_index.py", line 43, in handle
    self.do_update(options['limit'])
  File "/opt/weblate/weblate/trans/management/commands/update_index.py", line 93, in do_update
    update_index(units, source_units)
  File "/opt/weblate/weblate/trans/search.py", line 178, in update_index
    writer.close()
  File "/usr/local/lib/python3.5/dist-packages/whoosh/writing.py", line 1217, in close
    self.commit(restart=False)
  File "/usr/local/lib/python3.5/dist-packages/whoosh/writing.py", line 1229, in commit
    self.writer.commit(**self.commitargs)
  File "/usr/local/lib/python3.5/dist-packages/whoosh/writing.py", line 922, in commit
    finalsegments = self._merge_segments(mergetype, optimize, merge)
  File "/usr/local/lib/python3.5/dist-packages/whoosh/writing.py", line 827, in _merge_segments
    return mergetype(self, self.segments)
  File "/usr/local/lib/python3.5/dist-packages/whoosh/writing.py", line 101, in MERGE_SMALL
    writer.add_reader(reader)
  File "/usr/local/lib/python3.5/dist-packages/whoosh/writing.py", line 710, in add_reader
    self.add_postings_to_pool(reader, basedoc, docmap)
  File "/usr/local/lib/python3.5/dist-packages/whoosh/writing.py", line 647, in add_postings_to_pool
    for item in items:
  File "/usr/local/lib/python3.5/dist-packages/whoosh/writing.py", line 583, in _process_posts
    for fieldname, text, docnum, weight, vbytes in items:
  File "/usr/local/lib/python3.5/dist-packages/whoosh/reading.py", line 427, in iter_postings
    m = self.postings(fieldname, btext)
  File "/usr/local/lib/python3.5/dist-packages/whoosh/reading.py", line 821, in postings
    matcher = FilterMatcher(matcher, deleted, exclude=True)
  File "/usr/local/lib/python3.5/dist-packages/whoosh/matching/wrappers.py", line 277, in __init__
    self._find_next()
  File "/usr/local/lib/python3.5/dist-packages/whoosh/matching/wrappers.py", line 302, in _find_next
    while child.is_active() and child.id() in ids:
  File "/usr/local/lib/python3.5/dist-packages/whoosh/codec/whoosh3.py", line 980, in id
    self._read_ids()
  File "/usr/local/lib/python3.5/dist-packages/whoosh/codec/whoosh3.py", line 1082, in _read_ids
    self._read_data()
  File "/usr/local/lib/python3.5/dist-packages/whoosh/codec/whoosh3.py", line 1077, in _read_data
    self._data = loads(b)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 4: ordinal not in range(128)

$ ./manage.py rebuild_index --all
Processing 0.0%
Processing 7.5%
Processing 14.9%
Processing 22.4%
Processing 29.9%
Processing 37.4%
Processing 44.8%
Processing 52.3%
Processing 59.8%
Processing 67.2%
Processing 74.7%
Processing 82.2%
Processing 89.7%
Processing 97.1%
Operation completed
Traceback (most recent call last):
  File "./manage.py", line 31, in <module>
    execute_from_command_line(sys.argv)
  File "/usr/local/lib/python3.5/dist-packages/django/core/management/__init__.py", line 353, in execute_from_command_line
    utility.execute()
  File "/usr/local/lib/python3.5/dist-packages/django/core/management/__init__.py", line 345, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/local/lib/python3.5/dist-packages/django/core/management/base.py", line 348, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/local/lib/python3.5/dist-packages/django/core/management/base.py", line 399, in execute
    output = self.handle(*args, **options)
  File "/opt/weblate/weblate/trans/management/commands/rebuild_index.py", line 89, in handle
    target_writers[code].commit()
  File "/usr/local/lib/python3.5/dist-packages/whoosh/writing.py", line 922, in commit
    finalsegments = self._merge_segments(mergetype, optimize, merge)
  File "/usr/local/lib/python3.5/dist-packages/whoosh/writing.py", line 827, in _merge_segments
    return mergetype(self, self.segments)
  File "/usr/local/lib/python3.5/dist-packages/whoosh/writing.py", line 101, in MERGE_SMALL
    writer.add_reader(reader)
  File "/usr/local/lib/python3.5/dist-packages/whoosh/writing.py", line 710, in add_reader
    self.add_postings_to_pool(reader, basedoc, docmap)
  File "/usr/local/lib/python3.5/dist-packages/whoosh/writing.py", line 647, in add_postings_to_pool
    for item in items:
  File "/usr/local/lib/python3.5/dist-packages/whoosh/writing.py", line 583, in _process_posts
    for fieldname, text, docnum, weight, vbytes in items:
  File "/usr/local/lib/python3.5/dist-packages/whoosh/reading.py", line 427, in iter_postings
    m = self.postings(fieldname, btext)
  File "/usr/local/lib/python3.5/dist-packages/whoosh/reading.py", line 821, in postings
    matcher = FilterMatcher(matcher, deleted, exclude=True)
  File "/usr/local/lib/python3.5/dist-packages/whoosh/matching/wrappers.py", line 277, in __init__
    self._find_next()
  File "/usr/local/lib/python3.5/dist-packages/whoosh/matching/wrappers.py", line 302, in _find_next
    while child.is_active() and child.id() in ids:
  File "/usr/local/lib/python3.5/dist-packages/whoosh/codec/whoosh3.py", line 980, in id
    self._read_ids()
  File "/usr/local/lib/python3.5/dist-packages/whoosh/codec/whoosh3.py", line 1082, in _read_ids
    self._read_data()
  File "/usr/local/lib/python3.5/dist-packages/whoosh/codec/whoosh3.py", line 1077, in _read_data
    self._data = loads(b)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 4: ordinal not in range(128)
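
All three tracebacks above end in the same frame, `self._data = loads(b)` in whoosh/codec/whoosh3.py. The thread does not state a root cause. Assuming that `loads` there is the standard `pickle.loads`, one situation known to raise this kind of error on Python 3 is unpickling data that Python 2 wrote and that contains non-ASCII byte strings. Whether that applies here (for example, an index directory first created under Python 2) is a guess. A minimal sketch, using a hand-written pickle payload rather than real Whoosh index data; the names `PY2_PICKLE`, `load_strict` and `load_as_bytes` are made up for illustration:

```python
import pickle

# Hypothetical payload: the protocol-2 pickle that Python 2 would emit for
# the byte string 'abcd\x80'. It is NOT taken from a real Whoosh index.
#   \x80\x02  -> PROTO 2
#   U\x05     -> SHORT_BINSTRING, length 5
#   abcd\x80  -> the raw bytes
#   q\x00     -> BINPUT 0
#   .         -> STOP
PY2_PICKLE = b"\x80\x02U\x05abcd\x80q\x00."


def load_strict(data):
    """Unpickle with Python 3 defaults (encoding='ASCII', errors='strict')."""
    return pickle.loads(data)


def load_as_bytes(data):
    """Unpickle, keeping Python 2 str objects as Python 3 bytes."""
    return pickle.loads(data, encoding="bytes")


if __name__ == "__main__":
    try:
        load_strict(PY2_PICKLE)
    except UnicodeDecodeError as exc:
        # The default ASCII decoding rejects the 0x80 byte.
        print("strict load failed:", exc)

    # Reading the same payload as bytes succeeds.
    print("bytes load:", load_as_bytes(PY2_PICKLE))
```

If this is the cause, the sketch only demonstrates the failure mode; it does not repair an index. Note also that the `rebuild_index --all` run above fails in the same way.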

Server configuration

$ ./manage.py list_versions
 * Weblate weblate-2.6-324-g16e3843
 * Python 3.5.1+
 * Django 1.9.7
 * six 1.10.0
 * python-social-auth 0.2.19
 * Translate Toolkit 2.0.0b2
 * Whoosh 2.7.4
 * Git 2.7.4
 * Pillow (PIL) 1.1.7
 * dateutil 2.5.3
 * lxml 3.6.0
 * django-crispy-forms 1.6.0
 * compressor 1.6
 * djangorestframework 3.3.3
 * pytz 2016.4
 * pyuca N/A
 * python-bidi 0.4.0
 * pyLibravatar N/A
 * Database backends: django.db.backends.mysql

ihoru commented 8 years ago

Please disregard this issue.