mrichards42 / xword

Cross-platform crossword solving
https://mrichards42.github.io/xword/
GNU General Public License v3.0

'Wrong file type' error from PuzzleMe rawc format #191

Open fakelvis opened 2 years ago

fakelvis commented 2 years ago

Since the week of 16 May, the downloader has returned the 'Wrong file type' error when trying to grab the puzzles from The New Yorker and The Atlantic.

Appears to be related to the updated PuzzleMe obfuscated rawc format.

For reference, both kotwords and xword-dl have relevant commits addressing this change.

jpd236 commented 2 years ago

Should be fairly straightforward to update https://github.com/mrichards42/xword/blob/master/scripts/import/amuselabs.lua#L107 to handle the same deobfuscation format. For now, at least... this may be a sign that PuzzleMe is trying to clamp down on these sorts of downloaders.
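To illustrate why an importer might start reporting 'Wrong file type' after a change like this, here is a minimal Python sketch (not the Lua code in amuselabs.lua). It assumes the older rawc payload was plain base64-encoded JSON; `decode_plain_rawc` is a hypothetical name, and the actual deobfuscation step for the new format is deliberately not shown.

```python
import base64
import json


def decode_plain_rawc(rawc: str) -> dict:
    """Decode a rawc payload, ASSUMING it is plain base64-encoded JSON.

    Raises ValueError("Wrong file type") if that assumption does not hold,
    which is roughly what an importer would see once the payload is
    obfuscated before encoding.
    """
    try:
        # validate=True rejects characters outside the base64 alphabet
        raw = base64.b64decode(rawc, validate=True)
        data = json.loads(raw.decode("utf-8"))
    except ValueError as exc:
        # binascii.Error, UnicodeDecodeError and json.JSONDecodeError
        # are all subclasses of ValueError
        raise ValueError("Wrong file type") from exc
    if not isinstance(data, dict):
        raise ValueError("Wrong file type")
    return data
```

An obfuscated payload would need an extra transformation before this step, so a decoder written only for the plain case fails at either the base64 or the JSON stage.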

Of course, it's a bit repetitive to have to update all of these different importers any time things change in the applet. I wonder if I could compile Kotwords to a native library so I could use all the importers from there in XWord - would make things a bit easier to maintain. Not sure that'll be upstreamable, though.

mrichards42 commented 2 years ago

Thanks for the heads up, I've committed a fix (https://github.com/mrichards42/xword/commit/3aeebf8611823533651566294534c2bd1a21a197).

@jpd236 I'm certainly open to the idea of pulling in Kotwords for additional file types, although it sounds a bit hairy. How straightforward is it to compile Kotlin to something like a native shared lib? Last time I was doing much on the JVM, it seemed like GraalVM was promising, but at that point it wasn't really ready for much beyond experimentation.

jpd236 commented 2 years ago

Kotwords uses Kotlin Multiplatform, so it can actually be compiled to a native shared library relatively easily as long as all of its dependencies target the needed platforms - the critical ones do - and if we can provide a native implementation for any platform-specific functionality. I was able to make a quick prototype that passes a decent chunk of unit tests, and in principle it can be exported as a native library and accessed from C (https://kotlinlang.org/docs/native-dynamic-libraries.html).

In practice... it still seems a bit rough around the edges. The PuzzleMe JSON parsing works, but the regex to extract the rawc from the HTML crashes due to the string length, and it looks like a long-standing known issue with no signs of progress: https://youtrack.jetbrains.com/issue/KT-35508/EXCBADACCESScode2-address0x16d8dbff0-crashes-on-iOS-when-using-a-sequence-from-map-etc. I'm also a little worried about how easy it will be to access the Kotlin methods from C - while it sounds like the symbols are exported for any public method, there's usually some pain when dealing with suspending methods, and the Kotlin docs indicate that the export format is not guaranteed to be stable at this point.
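On the regex crash: one possible workaround for very long inputs is to skip the regex engine and use plain substring search. The sketch below is in Python rather than Kotlin, purely for illustration. The marker string `rawc = '` is an assumption about how the page embeds the payload, and `extract_quoted_value` is a hypothetical helper, not something from Kotwords or XWord.

```python
from typing import Optional


def extract_quoted_value(html: str, marker: str = "rawc = '") -> Optional[str]:
    """Return the single-quoted value that follows `marker`, or None.

    Uses str.find instead of a regex, so the cost is a linear scan and
    there is no backtracking or recursion tied to the payload length.
    The default marker is an ASSUMPTION about the page layout.
    """
    start = html.find(marker)
    if start == -1:
        return None  # marker not present in the page
    start += len(marker)
    end = html.find("'", start)
    if end == -1:
        return None  # opening quote was never closed
    return html[start:end]
```

This only works if the payload itself never contains a single quote, which holds for base64 text but would need checking for any other encoding.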

So, maybe some longer-term potential, but I wouldn't want to rely on it just yet. It's been great for targeting both the JVM and JavaScript, though, so hopefully it gets better over time.