Closed pavelchristof closed 8 years ago
I noticed the other pull request, and I guess this could be caused by some difference in how encoding works on Python 2 and 3: parser.body contains these newlines because wrapped_content also has them. So while this fixes the main document, the summary will probably still be broken.
EDIT: The summary is broken.
I've changed the patch to feed the parser piece by piece and avoid using encode(). It appears to work on Python 2.7 and 3.5, with both summaries and main content. I'm not sure whether it might still throw Unicode exceptions on some input (it doesn't in my case); you never know with Python.
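For reference, feeding a parser piece by piece can look roughly like the sketch below. It is a minimal illustration, not the actual patch: `TextCollector`, `feed_in_pieces`, and the `<div>` wrapper are hypothetical names, and it uses Python 3's `html.parser` (on Python 2 the module is `HTMLParser`).

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical parser that collects the text data it is fed."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Text arrives as str, so no encode()/decode() round trip is needed.
        self.chunks.append(data)


def feed_in_pieces(content, prefix="<div>", suffix="</div>"):
    """Feed wrapper and content separately instead of building one string."""
    parser = TextCollector()
    parser.feed(prefix)   # opening wrapper (assumed for illustration)
    parser.feed(content)  # original content, passed through untouched
    parser.feed(suffix)   # closing wrapper
    parser.close()
    return "".join(parser.chunks)
```

Since `content` is never encoded or concatenated into a new string, the newlines and any non-ASCII characters reach the parser as they were.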
Create a new variable 'wrapped_content' instead of modifying 'content'. When running on Python 3, 'parser.body' is not the same as the original 'content': it includes escaped newlines (a literal backslash followed by "n", i.e. "\\n" instead of "\n"), which completely break the output.
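One way escaped newlines like this can show up on Python 3 is when encoded bytes get converted back with `str()` instead of `decode()`. This is an assumption about the mechanism, not something confirmed for this code; the variable names below are only for illustration.

```python
content = "first line\nsecond line"

# On Python 3, encode() returns bytes, not str.
encoded = content.encode("utf-8")

# str() on bytes yields its repr, so the newline becomes a backslash and "n".
mangled = str(encoded)

# decode() restores the original text, including the real newline.
round_tripped = encoded.decode("utf-8")
```

On Python 2, `str` is already a byte string, so the same code would not change the newline, which would explain why the problem only appears on Python 3.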