gjtorikian / html-pipeline

HTML processing filters and utilities
MIT License
Add 'force_encoding' option to MarkdownFilter #333

Closed. ChrisBAshton closed this 4 years ago.

ChrisBAshton commented 4 years ago

Hello 👋

This PR fixes an open issue in a dependent project: https://github.com/alphagov/govuk-developer-docs/issues/1821

HTML::Pipeline uses CommonMarker as its Markdown parser in the MarkdownFilter. CommonMarker raises an exception if it encounters any non-UTF-8 content: https://github.com/gjtorikian/commonmarker/pull/10/files#diff-5850b052ec7a5e6bd8b51bc53465146aR8

We would like to be able to pass a force_encoding: "UTF-8" option in the context, which the filter can use to force the encoding before calling CommonMarker, so that the exception is avoided.
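
To illustrate the idea, here is a minimal sketch in plain Ruby. It is not the code from this PR and not CommonMarker's actual implementation: `strict_render` is a hypothetical stand-in for a parser that rejects non-UTF-8 strings, and `apply_forced_encoding` is a hypothetical helper showing how a `force_encoding` context key might be applied before the parser is called.

```ruby
# Hypothetical stand-in for a parser that refuses non-UTF-8 input.
# This only mimics the behaviour described above; it is not CommonMarker code.
def strict_render(text)
  unless text.encoding == Encoding::UTF_8
    raise TypeError, "text must be UTF-8 encoded; got #{text.encoding}"
  end
  text
end

# Hypothetical helper: relabel the string when the context requests it.
# String#force_encoding changes only the encoding label; it does not
# transcode the underlying bytes. The dup avoids mutating the caller's string.
def apply_forced_encoding(text, context)
  encoding = context[:force_encoding]
  return text unless encoding

  text.dup.force_encoding(encoding)
end

# UTF-8 bytes that arrive labelled as binary (ASCII-8BIT), e.g. from a raw read.
binary = "# Héllo".b

forced = apply_forced_encoding(binary, { force_encoding: "UTF-8" })
strict_render(forced) # accepted once the string is labelled UTF-8
```

Because `force_encoding` only relabels the bytes, this approach is safe only when the content really is UTF-8 underneath; checking `valid_encoding?` afterwards is one way to confirm that.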

Thank you for considering our pull request.

ChrisBAshton commented 4 years ago

Please disregard this PR: we've decided to do the force_encoding further upstream, so that the content is already forced to UTF-8 by the time we call HTML::Pipeline.