jpd236 / kotwords

Collection of crossword puzzle file format converters and other utilities, written in Kotlin.
Apache License 2.0

Space lost inside formatting tag #24

Closed by jpd236 2 years ago

jpd236 commented 2 years ago

Today's New Yorker cryptic has the following clue:

<span>“Biography”<i> </i>interrupted by early diet-soda commercial (10)</span>

It's technically valid, though bizarre, to have an italicized space. However, the space is stripped by the XML parser we're using, so we lose it when converting to a Snippet before writing back to HTML.

Not sure if it's worth trying to handle this, but filing for tracking purposes.
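
One possible workaround, sketched here purely as an illustration and not as what kotwords or xmlutil actually does, would be to rewrite the markup before parsing so that a formatting tag containing only whitespace is replaced by that whitespace. The space then sits inside the surrounding text rather than in a whitespace-only node of its own. This assumes the parser only drops whitespace-only text and keeps spaces that are part of longer text; the helper name `hoistWhitespaceOnlyTags` and the list of tags it handles are hypothetical.

```kotlin
// Hypothetical pre-processing helper; not part of kotwords or xmlutil.
// Matches a simple formatting tag whose entire content is whitespace,
// e.g. "<i> </i>", and captures the tag name and the whitespace.
private val WHITESPACE_ONLY_TAG = Regex("<(i|b|sub|sup)>(\\s+)</\\1>")

// Replaces each whitespace-only formatting tag with the bare whitespace it
// contained, so the space becomes part of the surrounding text.
fun hoistWhitespaceOnlyTags(html: String): String =
    WHITESPACE_ONLY_TAG.replace(html) { match -> match.groupValues[2] }

fun main() {
    val clue = "<span>“Biography”<i> </i>interrupted by early diet-soda commercial (10)</span>"
    // prints: <span>“Biography” interrupted by early diet-soda commercial (10)</span>
    println(hoistWhitespaceOnlyTags(clue))
}
```

The trade-off is that the italic formatting on the space is discarded, which is invisible when rendered. Tags with attributes or nested formatting are not handled by this pattern.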

jpd236 commented 2 years ago

Filed https://github.com/pdvrieze/xmlutil/issues/84 to see if there's any way to preserve this space when parsing.