Tenma-Server / Tenma

Comic book server with in-browser reader
MIT License

Comic lookup for titles with umlaut characters gives wrong results #52

Open bpepple opened 7 years ago

bpepple commented 7 years ago

Attempting to import an issue (IDW Ragnarök #10) whose title contains an umlaut character gives extremely faulty results. Each of the 12 issues was matched to a different series, and none of them was the correct one.

Here's a snippet of a few of the results:

[2017-10-29 20:04:49,653: INFO/Worker-1] "Ragnarök #010 (2016).cbz" was matched on Comic Vine as "Barakamon - #10" (526965)
[2017-10-29 20:05:24,234: INFO/Worker-1] "files/Comics/IDW Publishing/Ragnarök/Ragnarök #010 (2016).cbz" was processed successfully as "Barakamon - #10" (526965)
[2017-10-29 20:08:23,539: INFO/Worker-1] "Ragnarök #008 (2016).cbz" was matched on Comic Vine as "Fuuka - #8" (512237)
[2017-10-29 20:08:48,898: INFO/Worker-1] "files/Comics/IDW Publishing/Ragnarök/Ragnarök #008 (2016).cbz" was processed successfully as "Fuuka - #8" (512237)
[2017-10-29 20:11:42,989: INFO/Worker-1] "Ragnarök #001 (2014).cbz" was matched on Comic Vine as "Netoraserare - #1" (496584)
[2017-10-29 20:11:58,046: INFO/Worker-1] "files/Comics/IDW Publishing/Ragnarök/Ragnarök #001 (2014).cbz" was processed successfully as "Netoraserare - #1" (496584)
[2017-10-29 20:14:56,261: INFO/Worker-1] "Ragnarök #011 (2016).cbz" was matched on Comic Vine as "Amanchu! - #11" (571611)
[2017-10-29 20:15:19,101: INFO/Worker-1] "files/Comics/IDW Publishing/Ragnarök/Ragnarök #011 (2016).cbz" was processed successfully as "Amanchu! - #11" (571611)

Off topic: I'd also add my vote for supporting comicinfo.xml files instead of hammering Comic Vine's site. I've already implemented this in a Django project I started a while back, and if I get a few free cycles I'm planning on adding it to a fork. On a large collection (6,000+), depending on the Comic Vine API to scrape all the info will take over 48 hours.
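
To illustrate the comicinfo.xml idea, here is a minimal sketch in Python 3 of reading metadata straight out of a .cbz archive. It is not the Django implementation mentioned above and not Tenma's code. The function name `read_comicinfo` and the choice of fields are assumptions made for the example; the element names (`Series`, `Number`, `Volume`, `Year`, `Publisher`) are taken from the ComicInfo schema as commonly written by tagging tools.

```python
import zipfile
import xml.etree.ElementTree as ET


def read_comicinfo(cbz):
    """Return selected ComicInfo.xml fields from a .cbz, or None if absent.

    `cbz` may be a filesystem path or a binary file-like object.
    `read_comicinfo` is a hypothetical helper, not part of Tenma.
    """
    with zipfile.ZipFile(cbz) as archive:
        # Match the member name case-insensitively and ignore any folder
        # prefix, since archives in the wild are not consistent about either.
        name = next(
            (n for n in archive.namelist()
             if n.rsplit("/", 1)[-1].lower() == "comicinfo.xml"),
            None,
        )
        if name is None:
            return None
        # Parsing from bytes lets the XML declaration decide the encoding,
        # so a series name such as "Ragnarök" is decoded correctly.
        root = ET.fromstring(archive.read(name))

    # Fields missing from the file come back as None.
    fields = ("Series", "Number", "Volume", "Year", "Publisher")
    return {field: root.findtext(field) for field in fields}
```

An archive without the file returns None, so a caller could fall back to the Comic Vine lookup only for those comics.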

hmhrex commented 7 years ago

Thanks for documenting this one. I'll be looking into a solution for special characters.
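
The cause is not identified in this thread, so the following is only a sketch of one common source of this kind of mismatch, written in Python 3. A name read from a filename may be in decomposed Unicode form (an "o" followed by a combining diaeresis), and a search term has to be percent-encoded as UTF-8 before it goes into a query URL. The helper name `encode_series_name` is hypothetical, and how Tenma actually builds its Comic Vine requests is not shown here.

```python
import unicodedata
from urllib.parse import quote


def encode_series_name(name):
    """Normalize a series name to NFC and percent-encode it as UTF-8.

    Hypothetical helper for illustration; not taken from Tenma.
    """
    # Some filesystems hand back decomposed names: "o" + U+0308.
    composed = unicodedata.normalize("NFC", name)
    # quote() encodes str as UTF-8 by default; safe="" also escapes "/".
    return quote(composed, safe="")


nfc = "Ragnar\u00f6k"   # precomposed ö (U+00F6)
nfd = "Ragnaro\u0308k"  # o followed by a combining diaeresis (U+0308)

# The two spellings look identical on screen but are different strings.
assert nfc != nfd
# After normalization both produce the same query term.
assert encode_series_name(nfc) == encode_series_name(nfd) == "Ragnar%C3%B6k"
```

If the matcher compares the returned series names against the filename, normalizing both sides the same way before comparing would be the analogous step.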

As for the off-topic... Comic Vine takes such a long time because of their request limits. I would love to see a fork with comicinfo.xml support!