Empty italic tag in clues causes rest of clue to be italicized

jpd236 commented 2 years ago

The following XML in a JPZ clue:

<span>Part of clue<i/>more of the clue</span>

Causes "more of the clue" to be italicized even though it is not actually enclosed in the italic tag. This also seems to happen with a regular empty tag (<i></i>) instead of a self-closing tag, and also if there is only whitespace inside the tag; there has to be some non-blank character inside the tag for the parsing to work correctly, AFAICT.
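
For reference, here is a minimal sketch of how a generic XML parser sees that snippet. It uses Python's standard xml.etree.ElementTree rather than anything from XWord, so it only illustrates the expected XML interpretation, not what XWord's own parser does.

```python
import xml.etree.ElementTree as ET

# The clue markup from the report, parsed as XML rather than HTML.
span = ET.fromstring("<span>Part of clue<i/>more of the clue</span>")
italic = span.find("i")

print(repr(span.text))    # 'Part of clue': text before the <i/> element
print(repr(italic.text))  # None: the <i/> element itself has no content
print(repr(italic.tail))  # 'more of the clue': text after <i/>, owned by <span>
```

Under XML rules the trailing text is a sibling of the empty element, not its content, so nothing in the clue should end up italicized.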

jpd236 commented 2 years ago

I can no longer reproduce with a regular empty tag, only a self-closing tag, at least with ipuz (I was testing with jpz before).
It still reproduces if I use wxHtmlWinParser in HtmlClueListBox::CacheItem to parse the HTML, which I think is the component responsible for parsing the HTML before rendering it.
AFAICT, <i/> is invalid HTML in that only certain tags are permitted to be self-closing. It's also not really the kind of thing you'd generally expect to see, though I did observe it once (not sure whether it was in the original source data or if I introduced it when converting from another format to JPZ). But the failure mode here of just ignoring the closing tag doesn't seem great.

Probably not a huge priority in the grand scheme of things, but I guess the next step here would be to try to reproduce this with a smaller sample app and pass the report along to wxWidgets.

EDIT: I originally posted this without escaping the <i/> above, and, funnily enough, the rest of the comment showed up in italics! Maybe this is actually how HTML parsers are supposed to handle this...

mrichards42 commented 2 years ago

Hmm . . . looking through what I think is the jpz schema, it seems like clue text is actually XML, not a string of HTML? In which case <i/> would in fact be a self-closing tag :) It looks like the spec allows a few specific child elements in clue text.

So . . . maybe this needs to be handled in the jpz parser? We could convert self-closing tags to the equivalent empty tag, or perhaps just remove them entirely since that should render the same way.
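
A rough sketch of both options, written in Python with the standard xml.etree.ElementTree purely for illustration (XWord's jpz parser is not Python, and the function names and the default tag list below are hypothetical, not taken from the jpz spec):

```python
import xml.etree.ElementTree as ET


def expand_self_closing(xml_text: str) -> str:
    """Option 1: re-serialize so that <i/> is written as <i></i>."""
    root = ET.fromstring(xml_text)
    return ET.tostring(root, encoding="unicode", short_empty_elements=False)


def drop_empty_elements(xml_text: str, tags=("i",)) -> str:
    """Option 2: remove childless, text-free elements, keeping the text after them.

    `tags` is an assumed list of formatting tags, for illustration only.
    """
    root = ET.fromstring(xml_text)
    # Materialize the iterator first so removals do not disturb the traversal.
    for parent in list(root.iter()):
        previous = None
        for child in list(parent):
            if child.tag in tags and len(child) == 0 and not child.text:
                # Re-attach the trailing text to whatever precedes the element.
                tail = child.tail or ""
                if previous is None:
                    parent.text = (parent.text or "") + tail
                else:
                    previous.tail = (previous.tail or "") + tail
                parent.remove(child)
            else:
                previous = child
    return ET.tostring(root, encoding="unicode")


clue = "<span>Part of clue<i/>more of the clue</span>"
print(expand_self_closing(clue))  # <span>Part of clue<i></i>more of the clue</span>
print(drop_empty_elements(clue))  # <span>Part of cluemore of the clue</span>
```

Note that the second function deliberately leaves whitespace-only elements such as <i> </i> alone, since dropping those would also drop the space.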

jpd236 commented 2 years ago

I realized that I filed this as part of investigating the clue mentioned in https://github.com/jpd236/kotwords/issues/24, and indeed that specific clue is still a working repro case where the italic tag is non-empty (and thus not self-closing). So it does seem like there's more to this.

Attached a sample JPZ where the clue for 1-Across is:

<span>First across clue with</span><i> </i><span>italicized space</span>

This renders correctly in Crossword Solver, but in XWord, the space is omitted, and "italicized space" is in italics.
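
To spell out what the correct rendering would be, here is a small sketch that flattens the clue into (text, italic) runs. It is Python with the standard xml.etree.ElementTree, the helper name styled_runs is made up for this example, and the <clue> wrapper is a hypothetical root added only so the fragment parses as a single document.

```python
import xml.etree.ElementTree as ET


def styled_runs(element, italic=False):
    """Yield (text, is_italic) pairs in document order."""
    italic = italic or element.tag == "i"
    if element.text:
        yield element.text, italic
    for child in element:
        yield from styled_runs(child, italic)
        if child.tail:
            # Tail text belongs to the parent, so it takes the parent's style.
            yield child.tail, italic


fragment = "<span>First across clue with</span><i> </i><span>italicized space</span>"
# <clue> is a hypothetical wrapper so the fragment has a single root element.
root = ET.fromstring(f"<clue>{fragment}</clue>")
runs = list(styled_runs(root))
print(runs)
# [('First across clue with', False), (' ', True), ('italicized space', False)]
```

Only the single space comes out as italic here, which matches the Crossword Solver rendering described above.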

test.zip

mrichards42/xword #171