thisisparker / xword-dl

⬛⬜⬛ Command line tool to scrape crosswords from online solvers and save them as .puz files ⬛⬜⬛
MIT License

Sanitized clues erroneously end in new lines #40

Closed: thisisparker closed this issue 2 years ago

thisisparker commented 2 years ago

Thanks to some sleuthing over in https://github.com/jpnance/xw/issues/3, I've learned that html2text() appends a newline to the end of the strings it processes. That behavior probably makes enough sense when you're using it on full documents, but it's a little weird for short strings like crossword clues.

I currently use html2text to clean up AmuseLabs puzzles and WSJ puzzles. The trailing newline is easy enough to strip out, and doing so will probably result in more "standard" .puz files.
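
For anyone following along, here is a minimal sketch of the kind of fix described above. It is not the actual xword-dl change. The names `sanitize_clue` and `fake_convert` are hypothetical, and `fake_convert` is only a stand-in that mimics a converter appending trailing newlines, so the snippet runs without html2text installed.

```python
from typing import Callable


def sanitize_clue(raw: str, convert: Callable[[str], str]) -> str:
    """Run an HTML-to-text converter on a clue, then drop trailing whitespace.

    `convert` is any callable that maps an HTML string to plain text.
    """
    return convert(raw).rstrip()


def fake_convert(html: str) -> str:
    # Hypothetical stand-in for a real converter: swaps <i> tags for
    # underscores and appends trailing newlines, as described in this issue.
    return html.replace("<i>", "_").replace("</i>", "_") + "\n\n"


cleaned = sanitize_clue("<i>Casablanca</i> setting", fake_convert)
print(repr(cleaned))
```

With the real library, the call would presumably look like `sanitize_clue(clue, html2text.html2text)`. Using `rstrip()` rather than `strip()` only touches the end of the string, which is where the stray newline shows up.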

thisisparker commented 2 years ago

This is resolved in the latest release! (We'd been a long time without a release!)