jpd236 / CrosswordScraper

Browser extension which downloads crosswords from crossword applets for offline solving.
Apache License 2.0

Italicized items in puz format #11

Closed arelkin closed 2 years ago

arelkin commented 2 years ago

Mac/Firefox, Crossword Scraper version 1.2.5 (last updated January 30, 2022)

https://www.mckinsey.com/featured-insights/the-mckinsey-crossword/march-1-2022

In the puz output, italicized items are converted to text in old-fashioned (straight) quotes, and those clues are then followed by a blank double return. This affects 14-Across, 9-Down, 20-Down, 28-Down, 37-Down, and 58-Down (see the scan of a physical printout from the puz file).

Note that items that were quoted with fancy quotes do not have the problem, e.g. 8-Down.

PeterGordon-GoldilocksZone

PeterGordon-GoldilocksZone.puz.zip

(I had to zip the puz file because GitHub wouldn't accept files with a .puz extension.)

jpd236 commented 2 years ago

The raw data includes these newlines. This is one of the original clues:

<i>Working Girl</i><span> star Griffith\n\n</span>

I guess the applet is trimming them. We can probably do the same, although there could be some edge cases here (maybe newlines in the middle of a clue are legitimate?).
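For illustration only, here is a minimal sketch of what trimming could look like. It is written in Python for brevity and is not the extension's actual code. The function name `clean_clue` is hypothetical, and the sketch assumes the `\n\n` in the raw data are real newline characters once the data is decoded. It turns the italic tags into straight quotes (matching the puz output described above), drops the remaining tags, and trims only leading and trailing whitespace, so any newlines in the middle of a clue are left alone.

```python
import re


def clean_clue(raw_html: str) -> str:
    """Hypothetical helper: convert one raw clue to plain text."""
    # Replace <i> and </i> with straight quotes, as seen in the puz output.
    text = re.sub(r"</?i>", '"', raw_html)
    # Drop any remaining tags such as <span> and </span>.
    text = re.sub(r"<[^>]+>", "", text)
    # Trim only the ends; interior newlines are deliberately kept.
    return text.strip()


raw = '<i>Working Girl</i><span> star Griffith\n\n</span>'
print(repr(clean_clue(raw)))  # '"Working Girl" star Griffith'
```

Trimming only at the ends sidesteps the edge case mentioned above: a clue that legitimately contains a newline in the middle would pass through unchanged.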

jpd236 commented 2 years ago

Will be fixed in the next release - thanks for the report!