getpelican / pelican

Static site generator that supports Markdown and reST syntax. Powered by Python.
https://getpelican.com
GNU Affero General Public License v3.0

Can't support Chinese (Simplified) language #1571

Closed liukeyou closed 9 years ago

liukeyou commented 9 years ago

When an article includes Chinese text, Pelican shows this error:

ERROR: Could not process pages\hadoop.md | 'utf8' codec can't decode byte 0xd6 in position 124: invalid continuation byte


Pelican version: 3.5
OS: Windows 8.1
Python: 2.7.7

C:\Users\xxxxx>locale
LANG=
LC_CTYPE="C.UTF-8"
LC_NUMERIC="C.UTF-8"
LC_TIME="C.UTF-8"
LC_COLLATE="C.UTF-8"
LC_MONETARY="C.UTF-8"
LC_MESSAGES="C.UTF-8"
LC_ALL=

avaris commented 9 years ago

Is your hadoop.md file utf-8? Also, can you run pelican with --debug and post the full traceback?
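One way to answer the encoding question without opening an editor is to try decoding the raw bytes yourself. The following is a minimal sketch, written for Python 3 rather than the Python 2.7 used in this report; `find_utf8_error` is a hypothetical helper name, not part of Pelican.

```python
from pathlib import Path
from typing import Optional


def find_utf8_error(path: str) -> Optional[UnicodeDecodeError]:
    """Return the decode error if the file is not valid UTF-8, else None.

    Hypothetical helper for illustration; it is not a Pelican API.
    """
    data = Path(path).read_bytes()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first offending byte,
        # exc.reason is the text shown at the end of the error message.
        return exc
    return None
```

Calling `find_utf8_error("content/pages/hadoop.md")` (the path is an assumption about the project layout) returns `None` for a valid UTF-8 file. Otherwise the returned exception carries the byte offset in `start` and the message in `reason`, which should correspond to what Pelican reports.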

liukeyou commented 9 years ago

Thanks for your quick reply.

The error stack:

E:\dev\opensource\myblog\blog>pelican --debug
DEBUG: Adding current directory to system path
DEBUG: Temporarily adding PLUGIN_PATHS to system path
DEBUG: Restoring system path
DEBUG: Template list: [u'!simple/archives.html', u'!simple/article.html', u'!simple/author.html', u'!simple/authors.html', u'!simple/base.html', u'!simple/categories.html', u'!simple/category.html', u'!simple/gosquared.html', u'!simple/index.html', u'!simple/page.html', u'!simple/pagination.html', u'!simple/period_archives.html', u'!simple/tag.html', u'!simple/tags.html', u'!simple/translations.html', u'archives.html', u'article.html', u'article_list.html', u'author.html', u'authors.html', u'base.html', u'categories.html', u'category.html', u'gosquared.html', u'includes/aboutme.html', u'includes/addthis.html', u'includes/article_info.html', u'includes/cc-license.html', u'includes/comment_count.html', u'includes/comments.html', u'includes/disqus_script.html', u'includes/footer.html', u'includes/ga.html', u'includes/github-js.html', u'includes/github.html', u'includes/links.html', u'includes/liquid_tags_nb_header.html', u'includes/pagination.html', u'includes/piwik.html', u'includes/related-posts.html', u'includes/series.html', u'includes/sidebar.html', u'includes/taglist.html', u'includes/translations.html', u'includes/twitter_cards.html', u'includes/twitter_timeline.html', u'index.html', u'page.html', u'pagination.html', u'period_archives.html', u'tag.html', u'tags.html', u'translations.html']
DEBUG: Read file pages\about.md -> Page
DEBUG: Signal page_generator_preread.send(PagesGenerator)
DEBUG: Successfuly imported extension module "markdown.extensions.codehilite".
DEBUG: Successfully loaded extension "markdown.extensions.codehilite.CodeHiliteExtension".
DEBUG: Successfuly imported extension module "markdown.extensions.extra".
DEBUG: Successfuly imported extension module "markdown.extensions.smart_strong".
DEBUG: Successfully loaded extension "markdown.extensions.smart_strong.SmartEmphasisExtension".
DEBUG: Successfuly imported extension module "markdown.extensions.fenced_code".
DEBUG: Successfully loaded extension "markdown.extensions.fenced_code.FencedCodeExtension".
DEBUG: Successfuly imported extension module "markdown.extensions.footnotes".
DEBUG: Successfully loaded extension "markdown.extensions.footnotes.FootnoteExtension".
DEBUG: Successfuly imported extension module "markdown.extensions.attr_list".
DEBUG: Successfully loaded extension "markdown.extensions.attr_list.AttrListExtension".
DEBUG: Successfuly imported extension module "markdown.extensions.def_list".
DEBUG: Successfully loaded extension "markdown.extensions.def_list.DefListExtension".
DEBUG: Successfuly imported extension module "markdown.extensions.tables".
DEBUG: Successfully loaded extension "markdown.extensions.tables.TableExtension".
DEBUG: Successfuly imported extension module "markdown.extensions.abbr".
DEBUG: Successfully loaded extension "markdown.extensions.abbr.AbbrExtension".
DEBUG: Successfully loaded extension "markdown.extensions.extra.ExtraExtension".
DEBUG: Successfuly imported extension module "markdown.extensions.meta".
DEBUG: Successfully loaded extension "markdown.extensions.meta.MetaExtension".
DEBUG: Signal page_generator_context.send(PagesGenerator, )
DEBUG: Read file pages\contact.md -> Page
DEBUG: Read file pages\content.md -> Page
DEBUG: Read file pages\hadoop.md -> Page
ERROR: Could not process pages\hadoop.md
  | 'utf8' codec can't decode byte 0xd6 in position 127: invalid continuation byte
  |___
  | Traceback (most recent call last):
  |   File "D:\Python27\lib\site-packages\pelican-3.5.0-py2.7.egg\pelican\generators.py", line 629, in generate_context
  |     context_sender=self)
  |   File "D:\Python27\lib\site-packages\pelican-3.5.0-py2.7.egg\pelican\readers.py", line 459, in read_file
  |     content, reader_metadata = reader.read(path)
  |   File "D:\Python27\lib\site-packages\pelican-3.5.0-py2.7.egg\pelican\readers.py", line 237, in read
  |     with pelican_open(source_path) as text:
  |   File "D:\Python27\lib\contextlib.py", line 17, in __enter__
  |     return self.gen.next()
  |   File "D:\Python27\lib\site-packages\pelican-3.5.0-py2.7.egg\pelican\utils.py", line 237, in pelican_open
  |     content = infile.read()
  |   File "D:\Python27\lib\codecs.py", line 668, in read
  |     return self.reader.read(size)
  |   File "D:\Python27\lib\codecs.py", line 474, in read
  |     newchars, decodedbytes = self.decode(data, self.errors)
  | UnicodeDecodeError: 'utf8' codec can't decode byte 0xd6 in position 127: invalid continuation byte
DEBUG: Read file images\favicon.png -> Static
DEBUG: Signal static_generator_preread.send(StaticGenerator)
DEBUG: Signal static_generator_context.send(StaticGenerator, )
DEBUG: Read file images\weibo.png -> Static
DEBUG: Read file images\weixin_pub.png -> Static
-> Writing E:\dev\opensource\myblog\blog\output\feeds/all.atom.xml
-> Writing E:\dev\opensource\myblog\blog\output\index.html
-> Writing E:\dev\opensource\myblog\blog\output\tags.html
-> Writing E:\dev\opensource\myblog\blog\output\categories.html
-> Writing E:\dev\opensource\myblog\blog\output\authors.html
-> Writing E:\dev\opensource\myblog\blog\output\archives.html
-> Writing E:\dev\opensource\myblog\blog\output\pages\about-me.html
-> Writing E:\dev\opensource\myblog\blog\output\pages\contact.html
-> Writing E:\dev\opensource\myblog\blog\output\pages\blog-content.html
WARNING: Skipped copy E:\dev\opensource\myblog\blog\pelican-bootstrap3\images to E:\dev\opensource\myblog\blog\output\theme
-> Copying E:\dev\opensource\myblog\blog\content\images\favicon.png to images/favicon.png
-> Copying E:\dev\opensource\myblog\blog\content\images\weibo.png to images/weibo.png
-> Copying E:\dev\opensource\myblog\blog\content\images\weixin_pub.png to images/weixin_pub.png


The hadoop.md content

Title: hadoop
Date: 2014-12-28 11:20
Modified: 2014-12-28 18:30
Authors: Liu Keyou
Summary: hadoop
lang: 'chs'

test u'中文'


The pelicanconf.py content

LOCALE = ('chs')
TIMEZONE = 'Asia/Shanghai'
DEFAULT_LANG = 'chs'
DATE_FORMAT = {'zh': ('zh_CN', '%Y-%m-%d,%a'),}

liukeyou commented 9 years ago

I resolved this problem. The failing file (hadoop.md) was encoded in GBK, but Pelican assumes source files are UTF-8, so the file has to be converted from GBK to UTF-8 (for example with UltraEdit).
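For anyone who prefers a script to an editor, the same conversion can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: it targets Python 3 (the report used Python 2.7), the helper name `convert_to_utf8` is hypothetical, and GBK as the source encoding is taken from this thread rather than detected.

```python
from pathlib import Path


def convert_to_utf8(path: str, source_encoding: str = "gbk") -> bool:
    """Rewrite a text file in place as UTF-8.

    Hypothetical helper for illustration; it is not a Pelican API.
    Returns True if the file was rewritten, False if it already
    decoded cleanly as UTF-8 and was left untouched.
    """
    file_path = Path(path)
    raw = file_path.read_bytes()
    try:
        raw.decode("utf-8")
        return False  # already valid UTF-8, nothing to do
    except UnicodeDecodeError:
        pass
    # Decode with the assumed legacy encoding, then write UTF-8 bytes.
    # Working on bytes avoids newline translation on Windows.
    text = raw.decode(source_encoding)
    file_path.write_bytes(text.encode("utf-8"))
    return True
```

The error message is consistent with this diagnosis: in GBK the character 中 is the byte pair 0xD6 0xD0, and a UTF-8 decoder treats 0xD6 as the start of a two-byte sequence whose second byte 0xD0 is not a valid continuation byte. Keep a backup before running a conversion like this, since decoding with the wrong source encoding can silently produce wrong characters.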