DreamCobbler / fiction-dl

A content downloader, capable of retrieving works of (fan)fiction from the web and saving them in a few common file formats.
GNU General Public License v3.0

minor issue with spaces with ff.net #2

Closed betsybugaboo closed 3 years ago

betsybugaboo commented 3 years ago

Something I've noticed is that some older fic (pre ~2012 or so) has an issue where, likely due to the fic's older formatting, words become mashed together likethis. From what I can see, it's due to the original site's formatting in the HTML, where lines break almost arbitrarily mid-sentence (but with no HTML tag or anything). Because of this, the text is pulled together without a space. An example can be seen here; the HTML arbitrarily cuts itself in half. The resulting downloads have the words at the ends of the lines smashed together. Is there any way to remedy this? I've tried tweaking the source, but I only have incredibly rudimentary coding skills, so there isn't a ton I can really do.

betsybugaboo commented 3 years ago

Solved this! Simply open SanitizerProcessor.py in Notepad, Ctrl+F for "newline", then add a space in the content replacement, like so: for newline in Newlines: content = content.replace(newline, " "). Save and voilà! The issue is gone, and it does not affect fic that did not already have the issue.
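
For anyone who wants to see why the one-character change works, here is a minimal standalone sketch. It is not the actual code from SanitizerProcessor.py: the contents of NEWLINES and the function names strip_newlines and replace_newlines are assumptions made for illustration, and the final step that collapses doubled spaces is an extra that the fix above does not include.

```python
import re

# Hypothetical stand-in for the Newlines list in SanitizerProcessor.py;
# the real list may contain different entries.
NEWLINES = ["\r\n", "\r", "\n"]


def strip_newlines(content: str) -> str:
    """Assumed original behaviour: newlines are removed outright."""
    for newline in NEWLINES:
        content = content.replace(newline, "")
    return content


def replace_newlines(content: str) -> str:
    """Behaviour after the fix: each newline becomes a single space."""
    for newline in NEWLINES:
        content = content.replace(newline, " ")
    # Extra step, not part of the fix above: collapse runs of spaces
    # left behind when a line already ended with a space.
    return re.sub(r" {2,}", " ", content)


if __name__ == "__main__":
    # An old-style chapter whose HTML source wraps mid-sentence.
    sample = "words become mashed together like\nthis"
    print(strip_newlines(sample))
    print(replace_newlines(sample))
```

Fic that never had mid-sentence line breaks in its HTML contains nothing for the loop to replace, which would explain why the change leaves those downloads untouched.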