jnordberg / wintersmith

A flexible static site generator
http://wintersmith.io/
MIT License

Metadata not recognized when file starts with BOM #306

Open DanielSundberg opened 8 years ago

DanielSundberg commented 8 years ago

When a content file starts with a UTF-8 BOM, its metadata is not picked up, which can result in hard-to-find bugs. We're using Wintersmith to generate a docs site and we allow multiple users to add content, so it is very easy to unintentionally save a file with the wrong BOM setting.

To reproduce:

Create a Wintersmith site:


C:\Code\> wintersmith new .\test-site
initializing new wintersmith site in C:\Code\test-site using template blog
C:\Code\test-site
+-- moment@2.3.1
+-- typogr@0.5.2
`-- underscore@1.4.4
done!

Run wintersmith preview, navigate to the index page, and check that the README article is present:

[Screenshot: 2016-03-29 15_37_40-clipboard]

Next, open contents\articles\hello-world\index.md in an editor that can add a BOM character at the beginning of the file. In Notepad++, select Encoding/Convert to UTF-8 and save the file.

Refresh the preview and notice that the README link is missing:

[Screenshot: 2016-03-29 15_39_57-the wintersmith s blog]

The root cause is that the metadata section is not detected properly: the code reads the first 3 chars of the content looking for "---" (markdown.coffee:123), but in this case the content begins with the BOM char, so the comparison fails.
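
To make the failure concrete, here is a minimal Node.js sketch of that kind of check. The function name startsWithMetadataMarker and the sample content are made up for illustration; this is not the code at markdown.coffee:123, only an approximation of the behavior described above.

```javascript
// Hypothetical stand-in for the check described above, not Wintersmith's actual code.
function startsWithMetadataMarker(content) {
  return content.slice(0, 3) === '---';
}

const plain = '---\ntitle: Hello\n---\n\nBody text';

// Bytes EF BB BF are the UTF-8 BOM; Buffer#toString keeps it as U+FEFF.
const bomBytes = Buffer.from([0xef, 0xbb, 0xbf]);
const withBom = Buffer.concat([bomBytes, Buffer.from(plain, 'utf8')]).toString('utf8');

console.log(startsWithMetadataMarker(plain));      // true
console.log(startsWithMetadataMarker(withBom));    // false
console.log(withBom.charCodeAt(0).toString(16));   // feff
```

After decoding, the three BOM bytes become a single character (U+FEFF), so the first three characters of the file are that character followed by two dashes.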

The naive approach would be to use something like stripBOM() from https://github.com/jonschlinkert/fs-utils, but I don't know whether that's enough for all cases.
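
For reference, a BOM-stripping helper can be as small as the sketch below. The name stripBom is hypothetical and this is not the fs-utils implementation. It assumes the file has already been decoded as UTF-8 into a string, and it only handles a single BOM at the very start.

```javascript
// Hypothetical helper, written for illustration only.
// Removes a leading U+FEFF (the decoded UTF-8 BOM) if one is present.
function stripBom(text) {
  return text.charCodeAt(0) === 0xfeff ? text.slice(1) : text;
}

const content = '\uFEFF---\ntitle: Hello\n---\n';
console.log(stripBom(content).slice(0, 3) === '---'); // true
console.log(stripBom('no bom here'));                 // no bom here
```

This would not help with files saved in a different encoding (UTF-16, for instance) and then read as UTF-8, which may be one of the cases it does not cover.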

jnordberg commented 8 years ago

I don't know either, didn't know BOM was a thing even :) We could modify the regex to allow it, maybe \s matches those chars by default.
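
A quick check of the \s idea in Node.js: in JavaScript regular expressions, \s does match U+FEFF. The patterns below are hypothetical and are not the regex currently used in markdown.coffee.

```javascript
// \s in JavaScript regular expressions includes U+FEFF.
console.log(/\s/.test('\uFEFF')); // true

// Hypothetical patterns for detecting the start of a metadata block.
const lenient = /^\s*---/;     // tolerates a BOM, but also leading spaces and blank lines
const strict = /^\uFEFF?---/;  // tolerates only an optional BOM

const withBom = '\uFEFF---\ntitle: Hello\n---\n';
console.log(lenient.test(withBom));       // true
console.log(strict.test(withBom));        // true
console.log(lenient.test('\n\n---\n'));   // true
console.log(strict.test('\n\n---\n'));    // false
```

The lenient form changes behavior for files that start with blank lines, so the stricter form may be the safer choice if only the BOM case should be allowed.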