SchumacherFM / wordpress-to-hugo-exporter

Hugo is static site generator written in golang. Wordpress is a tool for remote access to your server ;-) ❗️Contributions welcome!
https://gohugo.io
GNU General Public License v3.0

high utf8 in title breaks things #50

Closed: guycalledseven closed this issue 6 years ago

guycalledseven commented 6 years ago

I played with the exporter the other night and noticed it did not play well with high UTF-8 characters.

I had issues with non-printable characters, double quotes ("), and single quotes (' and ') in my titles. They caused invalid .md files to be generated, broke Hugo, and looked bad (all titles on the site contained codes, not characters).

E.g., one blog post had a title with an en dash (–) in it. The front matter looked like this:

title: 'Top Gear 03×13 – BBC camera and editing crew are gods'
author: Daemon
type: post
date: 2009-07-06T08:35:56+00:00
url: /2009/07/06/top-gear-03x13-bbc-camera-and-editing-crew-are-gods/
views:
  - 129
categories:
  - developers journal

This one was causing Hugo to break with "yaml: invalid trailing UTF-8 octet":

Started building sites ...
ERROR 2017/11/10 00:00:59 failed to parse page metadata for "posts/2009-07-06-top-gear-03x13-bbc-camera-and-editing-crew-are-gods.md": yaml: invalid trailing UTF-8 octet
Error: Error building site: Errors reading pages: Error: failed to parse page metadata for "posts/2009-07-06-top-gear-03x13-bbc-camera-and-editing-crew-are-gods.md": yaml: invalid trailing UTF-8 octet for 2009-07-06-top-gear-03x13-bbc-camera-and-editing-crew-are-gods.md
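The error above means the YAML parser hit bytes that are not valid UTF-8. As a minimal sketch for locating such bytes in an exported file, the check below returns the offset of the first invalid sequence. The helper name `first_invalid_utf8_offset` is hypothetical, and the "bad" sample assumes the title was written in a single-byte encoding such as Latin-1; the issue does not say what the actual bytes were.

```python
from typing import Optional


def first_invalid_utf8_offset(data: bytes) -> Optional[int]:
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the index of the first byte that could not be decoded.
        return err.start
    return None


# A title stored as real UTF-8 passes the check.
good = "title: 'Top Gear 03×13 – BBC'".encode("utf-8")

# Assumed failure case: "×" stored as the single Latin-1 byte 0xD7.
# In UTF-8, 0xD7 starts a two-byte sequence, so the following "1" is invalid.
bad = b"title: 'Top Gear 03\xd713'"

print(first_invalid_utf8_offset(good))
print(first_invalid_utf8_offset(bad))
```

Running this kind of check over the generated .md files before building the site would point at the offending post and byte position.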

Other high UTF-8 characters just showed up as codes in the titles of the generated files.
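One way to avoid codes in titles is to decode them before writing the front matter, then quote the result so that quotes and control characters cannot break the YAML. The sketch below is only an illustration, not the change made in PR #49: it assumes the codes are HTML entities such as `&#8211;`, and the function name `yaml_title_line` is hypothetical.

```python
import html
import json


def yaml_title_line(raw_title: str) -> str:
    """Build a `title:` front matter line from a title that may contain HTML entities."""
    # Turn entities such as &#8211; or &#215; into the characters they stand for.
    decoded = html.unescape(raw_title)
    # A JSON string literal is used as a YAML double-quoted scalar:
    # quotes, backslashes and ASCII control characters are escaped,
    # all other characters are written as-is (ensure_ascii=False).
    return "title: " + json.dumps(decoded, ensure_ascii=False)


print(yaml_title_line("Top Gear 03&#215;13 &#8211; BBC camera and editing crew are gods"))
```

Double quoting is used here because a title containing a single quote would otherwise end a single-quoted YAML scalar early. The output file still has to be written as UTF-8 for the decoded characters to parse.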

I created PR #49

Thanks for great work! Best!

SchumacherFM commented 6 years ago

👏👏👏

Merged! Thank you very much!

guycalledseven commented 6 years ago

👍