JimmXinu / FanFicFare

FanFicFare is a tool for making eBooks from stories on fanfiction and other web sites.

Default settings strip empty <em> tags #1003

Closed. KriPet closed this issue 12 months ago

KriPet commented 12 months ago

Description

The following raw HTML (snipped):

<p>"Oh, <em>e</em><em>minently</em><em> </em>helpful, Theo.

is converted to

<p>"Oh, <em>e</em><em>minently</em>helpful, Theo. 

when downloaded using FanFicFare.

The resulting EPUB drops the whitespace-only <em> </em> element, so the text renders as

"Oh, eminentlyhelpful, Theo.

Of course, the source markup is oddly formatted, but I would not expect FanFicFare to modify the HTML in a way that changes the rendering.

To reproduce

Run fanficfare https://www.royalroad.com/fiction/28806/the-flower-that-bloomed-nowhere

The source HTML is here: https://www.royalroad.com/fiction/28806/the-flower-that-bloomed-nowhere/chapter/436845/003-mankinds-shining-future (Search for <em>minently in the source)

JimmXinu commented 12 months ago

Correct. That is the default setting. From defaults.ini:

## By default, empty tags are removed as part of cleaning up the
## source HTML.  However, a few tags should be kept even if empty.
## (Whitespace only, including &nbsp; is considered empty.)  This
## setting can adjust which tags are kept.
keep_empty_tags:p,td,th
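To illustrate the behavior that comment describes, here is a minimal sketch in Python. It is not FanFicFare's actual implementation: the function name `strip_empty_tags` and the regex-based approach are assumptions made for illustration only. It removes elements whose content is only whitespace or &nbsp; unless the tag name is in the keep list.

```python
import re

# Hypothetical helper, NOT FanFicFare's real code: a regex-based sketch of
# "remove empty tags unless they are in keep_empty_tags".
EMPTY_ELEMENT = re.compile(
    r"<([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>(?:\s|&nbsp;)*</\1\s*>",
    re.IGNORECASE,
)

def strip_empty_tags(html, keep=("p", "td", "th")):
    """Drop elements containing only whitespace/&nbsp;, except kept tags."""
    keep = {name.lower() for name in keep}

    def drop_unless_kept(match):
        # Leave the element untouched if its tag name is in the keep list.
        return match.group(0) if match.group(1).lower() in keep else ""

    previous = None
    # Repeat until stable, since removing an inner empty element can
    # leave its parent empty as well.
    while previous != html:
        previous = html
        html = EMPTY_ELEMENT.sub(drop_unless_kept, html)
    return html

source = '<p>"Oh, <em>e</em><em>minently</em><em> </em>helpful, Theo.'
print(strip_empty_tags(source))
print(strip_empty_tags(source, keep=("p", "td", "th", "em")))
```

With the default keep list, the whitespace-only <em> </em> disappears and the two words run together, as in the report. With em added to the keep list, the snippet passes through unchanged.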

You can change this setting by putting add_to_keep_empty_tags:,em under the site or story URL section in personal.ini.
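The leading comma in ,em matters if the add_to_ value is appended as-is to the inherited string, which is the assumption in the sketch below. The section name [www.royalroad.com], the [defaults] section, and the helper `effective_keep_empty_tags` are also illustrative assumptions, not FanFicFare's own configuration code.

```python
import configparser

# Contents as they might appear in the two ini files (illustrative only).
DEFAULTS_INI = """
[defaults]
keep_empty_tags:p,td,th
"""

PERSONAL_INI = """
[www.royalroad.com]
add_to_keep_empty_tags:,em
"""

def effective_keep_empty_tags(defaults_text, personal_text, site):
    """Hypothetical merge: append the site's add_to_ value to the default."""
    defaults = configparser.ConfigParser()
    defaults.read_string(defaults_text)
    personal = configparser.ConfigParser()
    personal.read_string(personal_text)

    value = defaults.get("defaults", "keep_empty_tags", fallback="")
    # Assumed behavior: plain string concatenation, hence the leading comma.
    value += personal.get(site, "add_to_keep_empty_tags", fallback="")
    return [tag.strip() for tag in value.split(",") if tag.strip()]

print(effective_keep_empty_tags(DEFAULTS_INI, PERSONAL_INI, "www.royalroad.com"))
```

Under that assumption, omitting the comma would join the new tag onto the last default entry (th followed by em) instead of adding a separate entry.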

If you are arguing that this should not be the default, you'll need to provide more examples of it being an issue.

KriPet commented 12 months ago

Oh, that's perfect. Didn't realize that was an option.

I will not argue for it being the default. I've not seen this happen more than once.