cbanack / comic-vine-scraper

An add-on script for ComicRack that lets you copy details from Comic Vine into your comic books.

automatic issue picking fails most times. #459

Closed CuddleBear92 closed 5 years ago

CuddleBear92 commented 5 years ago

I'm having some issues with the automatic issue picking when scraping a series as a whole. All the series titles are the same, the release years are all correct, and the issue numbers are all correct, yet most of the time it fails to continue with the series/volume I picked.

For example, I had Detective Comics V2016 #939-#987 as single issues, yet it fails to re-use the correct series I picked the first time the scraper asked.

This happens for all issues on such a scrape and forces me to pick the series for each issue individually.

This also happens when no other series shares the name, unlike with Detective Comics. Even for a unique name that gets only a single series hit on its query, it still requires me to confirm the series each time.

So far I have tried re-installing the plugin, as well as un-installing it and installing it anew, with the newest version, v1.0.96. I have also tried installing a fresh CR client with all data kept local so it's self-contained. I have tried clearing out the appdata folders of ComicRack and the Comic Vine scraper between plugin installs and ComicRack installs too.

I feel lost atm, as sometimes it works, but that's really rare. And yes, I did try both of the automatic picking options in the settings, though normally I have both off, as picking the series is important while the issues are already sorted.

The only other thing I'm doing is using this ignore list: https://github.com/vaemendis/ubooquity-doc/blob/master/pages/tutorials/add-metadata-with-comicrack_scraper-filter.md

But the outcome seems to be the same with or without it anyway...

EDIT: I checked the script output, and there doesn't seem to be anything out of the ordinary in it. All the settings are correct.

I can pick the first series fine, but it seems it just displays the choose-series window again no matter what.


--------------------------------------------------------------------
[X] Series          [X] Volume          [X] Number          
[X] Title           [X] Published       [X] Released        
[X] Crossovers      [X] Publisher       [X] Imprint         
[X] Writer          [X] Penciller       [X] Inker           
[X] Colorist        [X] Letterer        [X] Cover Art       
[X] Editor          [X] Summary         [X] Characters      
[X] Teams           [X] Locations       [X] Webpage         
-------------------------------------------------------------------
[X] Overwrite Existing        [X] Ignore Blanks             
[X] Convert Imprints          [ ] Autochoose Series         
[X] Download Thumbs           [X] Preserve Thumbs           
[ ] Confirm Issues            [X] Rescraping: Notes         
[X] Fast Rescrape             [X] Rescraping: Tags          
[X] Summary Dialog            
-------------------------------------------------------------------

======> scraping next comic book: 'Detective Comics 980(2018)(2 covers)(Digital)(TLK-EMPIRE-HD).cbz'
searching for series that match 'Detective Comics'...
...filtered out 16 (of 61) results.
...found 45 results
displaying the series selection dialog...
   ...user chose to SCRAPE using: 'Detective Comics (91098)'
searching for the right issue in 'Detective Comics (91098)'
   ...identified issue number 980
querying comicvine for issue details...
setting values for this comic book ('*' = changed):
-->  Series         : Detective Comics
-->  Issue Number   : 980
--> *Title          : Batmen Eternal Part 5
--> *Crossovers     : Batmen Eternal
--> *Summary        : “Batmen Eternal” part five! The worst possible future for Gotham City  ...
--> *Release Date   : 2018-5-9
--> *Publish Date   : 2018-7-1
--> *Volume         : 2016
-->  Imprint        : --- skipped ---
--> *Publisher      : DC Comics
--> *Characters     : Barbara Gordon, Batman, Batwoman, Brother EYE, Cassandra Cain, General ...
--> *Teams          : Gotham City Police Department, O.M.A.C.s
--> *Locations      : Gotham City
--> *Writers        : James Tynion IV
--> *Pencillers     : Scot Eaton
--> *Inkers         : Wayne Faucher
--> *Colorists      : Allen Passalaqua, John Kalisz
--> *Letterers      : Sal Cipriano
--> *CoverArtists   : Alvaro Martinez, Brad Anderson, Rafael Albuquerque, Raul Fernandez
--> *Editors        : Chris Conroy, Dave Wielgosz, Jamie S. Rich
--> *Webpage        : https://comicvine.gamespot.com/detective-comics-980-batmen-eternal-par ...
-->  Rating         : --- skipped ---
--> *Tags           : CVDB669428
--> *Notes          : Scraped metadata from ComicVine [CVDB669428].
--> *Issue Key      : 669428
--> *Series Key     : 91098
-->  Cover Art URL  : https://comicvine.gamespot.com/api/image/scale_small/6419462-980.jpg

======> scraping next comic book: 'Detective Comics 969 (2018) (2 covers) (digital) (Minutemen-Slayer).cbz'
searching for series that match 'Detective Comics'...
...filtered out 16 (of 61) results.
...found 45 results
displaying the series selection dialog...

cbanack commented 5 years ago

Maybe you have each issue saved in its own individual folder? Unless you explicitly tell it not to, the scraper always considers a comic to (potentially) belong to a new series if it is in a new folder. In other words, it will always ask you to re-confirm the series when it starts processing a new folder.

Also, I see you have turned off the option for it to pick the series automatically. It's been a while since I ran the scraper with that option turned off, but I believe that this could cause the behaviour you're seeing too -- it considers re-using your series choice for other, similarly named comics (in the same folder) to be part of "choosing the series automatically".

So maybe try turning the option back on and see if the problem goes away. You don't need to worry that the scraper will automatically pick the wrong series -- it pretty much never has a false match, because it does a quick image comparison of the cover art of your comic before it accepts any automatic series choice. If the images don't match, it will still force you to choose the series manually.
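As a rough illustration of the gating described above (accept an automatic series choice only when the covers match, otherwise fall back to asking the user), here is a minimal sketch. It is not the scraper's actual algorithm: it swaps in a simple average-hash comparison on tiny grayscale thumbnails, and `average_hash`, `similarity`, `pick_series`, and the `threshold` value are all hypothetical names and numbers chosen for the example.

```python
from typing import List, Optional

Gray = List[List[int]]  # small grayscale thumbnail, pixel values 0-255


def average_hash(img: Gray) -> List[bool]:
    """One bit per pixel: True where the pixel is brighter than the mean."""
    pixels = [p for row in img for p in row]
    mean = sum(pixels) / len(pixels)
    return [p > mean for p in pixels]


def similarity(a: Gray, b: Gray) -> float:
    """Fraction of matching hash bits (1.0 means identical hashes)."""
    ha, hb = average_hash(a), average_hash(b)
    if len(ha) != len(hb):
        raise ValueError("thumbnails must have the same dimensions")
    return sum(x == y for x, y in zip(ha, hb)) / len(ha)


def pick_series(local_cover: Gray, candidate_cover: Gray, candidate: str,
                threshold: float = 0.9) -> Optional[str]:
    """Return the candidate series if the covers match, else None.

    None stands for "show the series selection dialog to the user".
    The threshold is an assumed value, not one taken from the scraper.
    """
    if similarity(local_cover, candidate_cover) >= threshold:
        return candidate
    return None
```

With this shape, a mismatch never silently picks a series; it only hands the decision back to the user.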

CuddleBear92 commented 5 years ago

Just confirmed: yeah, it's the individual folders. I wouldn't have thought that would matter, as CR pulls it all from the filenames and sorts it that way. I guess I have to work around it then and move all the individual weekly releases into a merged folder. Thanks for the help! Much easier now!

Yeah, autochoose series will still reuse the series for the other issues, which allows users to confirm the series volume before scraping, for example when there are two or three different volumes with the same series name.

So it actually does image comparison? That's great! I'd rather be safe though. I have had false scrapes in the past with TPBs vs. full series and such.

cbanack commented 5 years ago

Ah good, I'm glad you got to the bottom of it.

The scraper uses the folders as a hint for a new series; many people sort their comics into folders by series, but they can end up with three different series, all called "avengers", where every issue is called "Avengers #XX.cbz". The fact that they are split into separate folders ("Avengers Vol 1", "Avengers Vol 2", "Avengers Vol 3") will at least give the scraper a clue that the files are not all from the same series.

Anyway, there is an advanced setting to turn this behaviour off, which would probably help solve your original issue: https://github.com/cbanack/comic-vine-scraper/wiki/Advanced-Settings#ignore_folders
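The folder hint can be sketched roughly as follows. This is only an illustration under stated assumptions, not the scraper's real code: `needs_series_prompt` and its `ignore_folders` parameter are hypothetical names (the parameter loosely mirrors the advanced setting linked above), and series names are assumed to have been parsed from the filenames already.

```python
from pathlib import PurePath
from typing import Optional


def needs_series_prompt(prev_path: Optional[str], cur_path: str,
                        prev_series: Optional[str], cur_series: str,
                        ignore_folders: bool = False) -> bool:
    """Return True if the user should be asked to choose the series again."""
    if prev_path is None or prev_series is None:
        return True  # first book: there is no earlier choice to reuse
    if prev_series.lower() != cur_series.lower():
        return True  # a different series name was parsed from the filename
    if ignore_folders:
        return False  # folder changes are not treated as a hint
    # A new parent folder hints at a (potentially) new series.
    return PurePath(prev_path).parent != PurePath(cur_path).parent


# Same name, same folder: reuse the earlier choice.
print(needs_series_prompt("Avengers Vol 1/Avengers 01.cbz",
                          "Avengers Vol 1/Avengers 02.cbz",
                          "Avengers", "Avengers"))
# Same name, new folder: ask again unless folders are ignored.
print(needs_series_prompt("Avengers Vol 1/Avengers 01.cbz",
                          "Avengers Vol 2/Avengers 01.cbz",
                          "Avengers", "Avengers"))
```

Under this sketch, a library of weekly-release folders with one issue each would trigger the prompt for every book, which matches the behaviour reported in this issue.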

Yup, it does image comparison -- I got ambitious one day and wrote the algorithm myself. Works well, if I do say so. But TPBs are the one case where things can go wrong: you have a TPB issue that has the same name, issue number and cover art as the first issue in the series. Not much the scraper can do to protect you there. :/

CuddleBear92 commented 5 years ago

Yeah, the TPB issue is pretty much the main reason I still go folder by folder and confirm each series.

I was mostly stuck with loose folders of weekly releases, which of course only have an issue or two of each series in each.

Since image recognition is in, a paranoia mode that always checks the cover image on every issue, instead of just the first one, would be great, unless it's already doing that.

That would remove the confusion between different versions of the same series name, as you noted with v1 and v2 of Avengers, for example.