jpd236 / CrosswordScraper

Browser extension which downloads crosswords from crossword applets for offline solving.
Apache License 2.0

PDF layout options? #6

Closed arelkin closed 2 years ago

arelkin commented 2 years ago

https://www.vox.com/21523212/crossword-puzzles-free-daily-printable (See January 18, 2022 puzzle: by Juliana Tringali Golden)

  1. the blue shading is not reflected in the PDF. (in .puz, they are reflected as circles; in .jpz, reflected properly as blue shading)

  2. the grid is bottom aligned. (seems like it should be top aligned)

JulianaTringaliGolden.pdf

Feature request for PDF: allow user to choose font. Or at least give a few more font options. I usually find it easier to read sans-serif fonts.

jpd236 commented 2 years ago

Thanks for the report!

The shading is actually there - it is just very, very subtle. There is a setting for ink saver that should default to being off, but which you appear to have enabled; the problem is that this shade of blue is so light that any amount of further lightening makes it very hard to see. Dropping the percentage to 25% or lower makes it more visible. Unfortunately, it's a bit hard to pick a setting here that works well across the board. I commonly see crosswords which use very dark/bright colors (e.g. https://spyscape.com/crosswords/puzzle-148-up-to-speed, which looks good even at 75% Ink Saver), so applying the ink saver setting to the custom colors seems appropriate in general, but won't work well for every situation. I can leave this open and think about whether some sort of minimum contrast ratio could make things better.
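As a rough illustration of why a pale shade washes out, here is a minimal sketch. It assumes ink saver blends each color toward white by the given percentage; that is a guess at the behavior, not the extension's confirmed implementation. The contrast check uses the WCAG relative-luminance formula, and both sample colors are hypothetical stand-ins rather than values taken from either puzzle.

```python
def lighten(rgb, ink_saver_pct):
    """Blend an (r, g, b) color toward white by ink_saver_pct percent (assumed model)."""
    t = ink_saver_pct / 100.0
    return tuple(round(c + (255 - c) * t) for c in rgb)


def _linear(channel):
    """Convert one 0-255 sRGB channel to linear light."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """WCAG relative luminance of an sRGB color, in the range 0..1."""
    r, g, b = (_linear(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(a, b):
    """WCAG contrast ratio between two colors, from 1 (identical) to 21 (black on white)."""
    la, lb = relative_luminance(a), relative_luminance(b)
    return (max(la, lb) + 0.05) / (min(la, lb) + 0.05)


WHITE = (255, 255, 255)
PALE_BLUE = (173, 216, 230)  # hypothetical light shade
DARK_RED = (139, 0, 0)       # hypothetical saturated shade

for name, color in (("pale blue", PALE_BLUE), ("dark red", DARK_RED)):
    for pct in (0, 25, 75):
        ratio = contrast_ratio(lighten(color, pct), WHITE)
        print(f"{name} at {pct}% ink saver: contrast vs. white = {ratio:.2f}")
```

A minimum contrast ratio could then be enforced by refusing to lighten a color past the point where `contrast_ratio` against a white cell drops below some chosen floor.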

The grid layout is just hard-coded to go in the bottom right for simplicity - again, hard to come up with a scheme that works for all grid sizes and puzzle types. Adding a setting to put it in other corners by default would probably be doable. Trying to detect a "best" corner for each puzzle would be tougher.
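A corner setting could be sketched roughly as follows. This is only an illustration: the function name, the parameters, and the convention that `y` is measured downward from the top of the page are all assumptions, not the renderer's actual code.

```python
def grid_origin(page_w, page_h, grid_w, grid_h, margin, corner="bottom-right"):
    """Return (x, y) of the grid's top-left corner for the chosen page corner.

    Assumes y grows downward from the top edge of the page.
    """
    valid = {"top-left", "top-right", "bottom-left", "bottom-right"}
    if corner not in valid:
        raise ValueError(f"unknown corner: {corner!r}")
    vertical, horizontal = corner.split("-")
    x = margin if horizontal == "left" else page_w - margin - grid_w
    y = margin if vertical == "top" else page_h - margin - grid_h
    return (x, y)


# US Letter page in points, with a hypothetical 300pt square grid and 36pt margins.
for corner in ("bottom-right", "top-right", "top-left"):
    print(corner, grid_origin(612, 792, 300, 300, 36, corner))
```

Picking a "best" corner automatically would need more than this, since it depends on how the clue columns flow around the grid.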

Settings/controls for font options make sense. One drawback is that the fonts have to be bundled with the extension, or else special characters won't be supported (I don't think there's a way to access system fonts, though it would probably be possible to support uploading a custom font). But a sans-serif option would probably be reasonable, at least.

arelkin commented 2 years ago

Thanks for the quick response.

Yes, you're right, I did have the ink saver set rather high. Sorry about missing that.

Re fonts: would it be possible to at least include a generic sans-serif font, such as Arial or something similar?

Also, any plans to scrape Crossword Compiler grids? https://www.washingtonexaminer.com/crossword-grounded-flights https://crosswordsbackwards.com/backward-crossword-puzzle-380/

> think about whether some sort of minimum contrast ratio could make things better.

Yes, I guess if the constructor sets very light colors, you could darken them a little if they go past some lightness threshold.
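One way to picture that lightness threshold, as a sketch only: convert the color to HLS, and if its lightness is above a cutoff, pull it down to the cutoff while keeping hue and saturation. The function name and the 0.85 cutoff are hypothetical, and this would only make sense for cells with a custom color, not for ordinary white cells.

```python
import colorsys


def clamp_lightness(rgb, max_lightness=0.85):
    """Darken an (r, g, b) color to max_lightness if it is lighter than that.

    Hue and saturation are preserved; colors at or below the cutoff are returned unchanged.
    """
    r, g, b = (c / 255.0 for c in rgb)
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    if l <= max_lightness:
        return tuple(rgb)
    darker = colorsys.hls_to_rgb(h, max_lightness, s)
    return tuple(round(c * 255) for c in darker)


very_pale_blue = (235, 245, 255)  # hypothetical constructor-chosen shade
print(clamp_lightness(very_pale_blue))  # darkened to the cutoff
print(clamp_lightness((139, 0, 0)))     # already dark, returned unchanged
```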

jpd236 commented 2 years ago

Added a setting to select a sans-serif font for PDFs, and support for Crossword Compiler applets. For the latter, one caveat is that these applets are hosted directly by each site rather than by a central service, so it's impossible to tell from the URL alone whether an embedded frame contains such a puzzle. Requesting permission to view every page is scary to some users, so the scraper only supports sites which embed the applet directly on the current page, or specifically listed sites known to contain iframes. The two sites you linked, as well as another I'm aware of (www.brendanemmettquigley.com), should all work, but let me know if you're aware of any more.
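The rule described above might look something like this sketch. The function name and the contents of the host list are assumptions for illustration: the thread names three sites that should work but does not say which of them embed the applet directly and which rely on the iframe list.

```python
from urllib.parse import urlparse

# Hypothetical allowlist of page hosts whose iframes are worth inspecting.
KNOWN_IFRAME_HOSTS = {
    "www.washingtonexaminer.com",
    "crosswordsbackwards.com",
    "www.brendanemmettquigley.com",
}


def urls_to_scan(page_url, iframe_urls):
    """Return the URLs to check for a Crossword Compiler applet.

    The current page is always checked. Its iframes are only checked when
    the page's host is on the allowlist, so no blanket permission is needed.
    """
    host = urlparse(page_url).hostname or ""
    if host in KNOWN_IFRAME_HOSTS:
        return [page_url, *iframe_urls]
    return [page_url]


print(urls_to_scan("https://example.com/puzzle", ["https://cdn.example.com/frame"]))
print(urls_to_scan(
    "https://crosswordsbackwards.com/backward-crossword-puzzle-380/",
    ["https://cdn.example.com/frame"],
))
```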

Otherwise, I filed separate bugs to track selection of the grid corner when rendering PDFs (https://github.com/jpd236/kotwords/issues/27) and handling of light cell background colors in ink saver mode (https://github.com/jpd236/kotwords/issues/28). These are more significant changes, though, so I probably won't get to them soon unless there's more demand.

I think that's everything here, but feel free to reopen / file a separate request for anything I missed. Will make a new release some time soon unless I find other issues.