WikiTeam / wikiteam

Tools for downloading and preserving wikis. We archive wikis, from Wikipedia to the tiniest wikis. As of 2024, WikiTeam has preserved more than 600,000 wikis.
https://github.com/WikiTeam
GNU General Public License v3.0
730 stars, 151 forks

UnicodeEncodeError: 'ascii' codec can't encode character #342

Open unz3r0 opened 5 years ago

unz3r0 commented 5 years ago
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 36: ordinal not in range(128)
$> python --version
Python 2.7.12

This solved my problem:

Adding this near the top of dumpgenerator.py:

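import sys
# encoding=utf8
# Python 2 only: reload(sys) restores sys.setdefaultencoding,
# which the interpreter removes from the module at startup.
reload(sys)
# Make implicit unicode <-> str conversions use UTF-8 instead of ASCII.
sys.setdefaultencoding('utf8')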

https://markhneedham.com/blog/2015/05/21/python-unicodeencodeerror-ascii-codec-cant-encode-character-uxfc-in-position-11-ordinal-not-in-range128/
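
For anyone who would rather not change the interpreter-wide default (the snippet above only works on Python 2, since `reload` and `sys.setdefaultencoding` are gone in Python 3), here is a minimal sketch of the alternative: encode explicitly at the point where text leaves the program. It reproduces the error with the same `u'\xe9'` character and then writes the string as UTF-8. The `save_title` helper and the `title.txt` file name are hypothetical and are not taken from dumpgenerator; which call in the script actually triggers the error is not shown in this thread.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile


def save_title(path, title):
    """Hypothetical helper: write a unicode string to `path` as UTF-8."""
    # io.open accepts an explicit encoding on both Python 2 and Python 3,
    # so the write does not depend on the interpreter's default codec.
    with io.open(path, 'w', encoding='utf-8') as f:
        f.write(title)


title = u'Caf\xe9'  # contains u'\xe9', the character from the traceback

# Reproduce the failure: the ASCII codec cannot represent u'\xe9'.
try:
    title.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Encoding explicitly as UTF-8 succeeds.
encoded = title.encode('utf-8')

# Round-trip through a temporary file to confirm the text survives.
path = os.path.join(tempfile.mkdtemp(), 'title.txt')
save_title(path, title)
with io.open(path, 'r', encoding='utf-8') as f:
    round_trip = f.read()
```

The trade-off is that every write or print site has to be touched, whereas `sys.setdefaultencoding` is a one-line global change; the explicit form keeps working if the script is later ported to Python 3.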

dennisroczek commented 4 years ago

Hey @heerwinus, thanks for your lines. I needed to grab a wiki and it broke when downloading images. I have created a pull request, as your changes worked like a charm for me!