Open fridgecow opened 3 years ago
According to the bs4 docs (https://www.crummy.com/software/BeautifulSoup/bs4/doc/#pretty-printing), prettify() is for debugging purposes only:
> Since it adds whitespace (in the form of newlines), prettify() changes the meaning of an HTML document and should not be used to reformat one. The goal of prettify() is to help you visually understand the structure of the documents you work with.
https://github.com/search?q=repo%3Awcember%2Fpypub%20%20%20prettify&type=code
During chapter loads, `xmlprettify` is called to format the output nicely, and in doing so it strips the `text` and `tail` attributes from elements. Unfortunately, this can have the unintended consequence of producing mangled epubs from reasonable HTML.

For example, this HTML:
should produce output exactly like the input,
but actually looks like:
Which will be rendered differently since there's no space after the 5.
Removing the `xmlprettify` call from `Chapter._render` makes the output correct again.
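
To illustrate the failure mode, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The `strip_whitespace` helper and the sample paragraph are hypothetical: the helper is a stand-in for the stripping behaviour described above, not pypub's actual `xmlprettify` implementation.

```python
import xml.etree.ElementTree as ET


def strip_whitespace(elem):
    """Hypothetical stand-in for a prettifier that strips text and tail.

    Assumption: this mimics the behaviour described in this issue;
    it is NOT the real pypub code.
    """
    for node in elem.iter():
        if node.text is not None:
            node.text = node.text.strip()
        if node.tail is not None:
            node.tail = node.tail.strip()
    return elem


# Illustrative input (not taken from the issue): the spaces around <b>
# live in p.text ("Chapter ") and b.tail (" begins here.").
source = "<p>Chapter <b>5</b> begins here.</p>"

root = ET.fromstring(source)
untouched = ET.tostring(root, encoding="unicode")
stripped = ET.tostring(strip_whitespace(root), encoding="unicode")

print(untouched)  # <p>Chapter <b>5</b> begins here.</p>
print(stripped)   # <p>Chapter<b>5</b>begins here.</p>
```

In inline content the whitespace held in `text` and `tail` is significant, so stripping it joins words that were separate in the source. That is why skipping the prettify step keeps the output faithful to the input.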