JimmXinu / FanFicFare

FanFicFare is a tool for making eBooks from stories on fanfiction and other web sites.

Using prettify() changes whitespace. #11

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Generate EPUB output, e.g., "python downloader.py http://www.fanfiction.net/s/5782108/1/ epub".
2. Examine the whitespace in, for instance, the early paragraphs of chapter 2.

What is the expected output? What do you see instead?
The original document contained the HTML '<p>"<i>Wingardium Leviosa.</i>"</p>'; 
this will be rendered in the output as '" Wingardium Leviosa. "', rather than 
'"Wingardium Leviosa."' (with italics, of course.) Spacing should be preserved.

What version of the product are you using? On what operating system?
I'm using current tip (26:54fc9b30ced5) on Python 2.6.4 (Ubuntu 9.10), plus the 
patches from issue 6, issue 7, issue 8, issue 9 and issue 10 (which don't 
affect this issue).

Please provide any additional information below.
Using the prettify() method causes whitespace to appear between tags. See the 
manual:

http://www.crummy.com/software/BeautifulSoup/documentation.html#Printing%20a%20Document

"The prettify method adds strategic newlines and spacing to make the structure 
of the document obvious. It also strips out text nodes that contain only 
whitespace, which might change the meaning of an XML document. The str and 
unicode functions don't strip out text nodes that contain only whitespace, and 
they don't add any whitespace between nodes either."

It's the additional whitespace which is a concern here. It's not a validation 
problem, but it can and does make the output look a little odd when mysterious 
spaces appear between, for example, quote marks and their contents.

I believe this can be fixed by replacing "x.prettify()" with "str(x)" where it 
appears, but I'm not sure how many times it's required just yet. It makes the 
output somewhat more difficult to read, but this shouldn't be cause for 
changing the presentation.
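
To illustrate why the added whitespace is visible, here is a minimal sketch using only the Python 3 standard library (not the project's code, and not BeautifulSoup itself). The `prettified` string is a hand-written approximation of what prettify()-style output looks like, and `TextCollector` and `rendered_text` are hypothetical helpers that mimic how a reader collapses each run of whitespace to a single space.

```python
import re
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects the text nodes of a fragment and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def rendered_text(markup):
    """Approximate inline rendering: join the text nodes, then collapse
    every run of whitespace to a single space."""
    parser = TextCollector()
    parser.feed(markup)
    parser.close()
    return re.sub(r"\s+", " ", "".join(parser.chunks)).strip()


# Markup as it appears in the original story.
compact = '<p>"<i>Wingardium Leviosa.</i>"</p>'

# Assumption: a hand-written stand-in for prettify()-style output,
# with each node placed on its own indented line.
prettified = '<p>\n "\n <i>\n  Wingardium Leviosa.\n </i>\n "\n</p>'

print(rendered_text(compact))     # "Wingardium Leviosa."
print(rendered_text(prettified))  # " Wingardium Leviosa. "
```

The newlines between the quote marks and the <i> element survive as single spaces once whitespace is collapsed, which matches the stray spaces described above.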

Original issue reported on code.google.com by adam.buc...@gmail.com on 20 Sep 2010 at 2:44

GoogleCodeExporter commented 9 years ago
You are correct--I found the same issue and removed the calls to prettify().  

However, ffnet likes to leave the text of each chapter entirely on one line.  
On nook, at least, that becomes a problem when the chapter length gets past 
200 KB.  I now insert a newline with each </p> and <br />.
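
One way to break up such long lines could look like the sketch below. It is written for Python 3 with the standard `re` module and is not the change that was actually committed; `add_line_breaks` is a hypothetical helper, and placing the newline after (rather than before) each tag is an assumption.

```python
import re


def add_line_breaks(html):
    """Insert a newline after each </p> and each <br> or <br /> tag,
    unless one is already there, so no single line grows too long."""
    return re.sub(r"(</p>|<br\s*/?>)(?!\n)", r"\1\n", html, flags=re.IGNORECASE)


chapter = "<p>first line<br />second line</p><p>next paragraph</p>"
print(add_line_breaks(chapter))
```

Unlike prettify(), this only adds whitespace directly after a line break or a paragraph end, positions where renderers normally do not display it, so the spacing around inline tags such as <i> is left alone.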

Original comment by retiefj...@gmail.com on 16 Oct 2010 at 2:11