suryakencana007 / comic-vine-scraper

Automatically exported from code.google.com/p/comic-vine-scraper

Comics from the series '2000AD' not scraping properly #128

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
DESCRIBE THE PROBLEM:

When scraping the 2000AD comics, ComicVineScraper never correctly determines 
which issue to scrape.

WHAT STEPS WILL REPRODUCE THIS PROBLEM? (Please include the exact name of
the eComic book that you were trying to scrape, if possible.)
1. Scrape a comic with the name of "2000AD 1542 (20-06-07).cbr"

WHAT VERSION OF COMICVINESCRAPER ARE YOU USING?
1.0.27

PLEASE PROVIDE ANY ADDITIONAL INFORMATION THAT MAY BE OF USE
The above scrape always produces the following:

1. "Couldn't find any comics that match the search terms: '2000AD 1542'"
2. After I correct the series name to just '2000AD', it automatically pulls in 
issue #7 (taken from the last digit of the date in the file name).

It's easy to work around this by scraping one issue at a time and specifically 
choosing which issue you want to scrape. However, I'm looking at ~200 issues 
right now, which makes a 'one-at-a-time' solution very tedious.  If it's not 
an easy fix, or it's not seen as a problem, I'll be happy to continue the 
'one-at-a-time' scraping, as this is the only title I have ever had a problem 
with.

Original issue reported on code.google.com by yohn....@gmail.com on 30 Aug 2010 at 7:38

GoogleCodeExporter commented 9 years ago
Thanks for the bug report.

I just tried it and I get the same behaviour.

It looks like the scraper ends up thinking that the issue number ('1542') is 
part of the series name ('2000AD').  This is extra annoying because that means 
the scraper behaves as though you are scraping 200 different series, too 
('2000AD 1542', '2000AD 1543', etc.)

----------------------

If you look at the default values in the info page for any one of your 
*unscraped* 2000AD comics, you'll be able to see what values ComicRack 
automatically parses out of the file name (i.e., without the help of the 
scraper).

Two of the most important values (series name and issue number) are messed up 
in the way I just described--so this problem begins with the fact that 
ComicRack has incorrectly parsed those values out of the comic book's file name.
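
For illustration only, here is a minimal Python sketch of the split that would 
be expected for a file name in this format. It is not ComicRack's or the 
scraper's actual parsing code; the pattern and the name `parse_filename` are 
assumptions made up for this example.

```python
import re

# Hypothetical pattern for names like "2000AD 1542 (20-06-07).cbr".
# Assumption: series name, whitespace, an all-digit issue number,
# an optional parenthesised date, then the file extension.
FILENAME_RE = re.compile(
    r"^(?P<series>.+?)\s+(?P<issue>\d+)\s*(?:\([^)]*\))?\s*\.(?P<ext>\w+)$"
)


def parse_filename(filename):
    """Return (series, issue) for a matching name, or None if it does not match."""
    match = FILENAME_RE.match(filename)
    if match is None:
        return None
    return match.group("series"), match.group("issue")


print(parse_filename("2000AD 1542 (20-06-07).cbr"))  # ('2000AD', '1542')
```

With this split, '2000AD' is the series name and '1542' is the issue number, 
and the date in parentheses is ignored rather than being mistaken for an issue 
number.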

That being the case, would you be willing to report this problem in the 
ComicRack bug forum?

http://comicrack.cyolito.com/user-forum/10-bugs

There's a very good chance that cYo will want to fix the problem in ComicRack, 
and if he does, you'll find that the scraper suddenly starts working properly 
for these files.

And if you report the bug and cYo refuses to fix the problem, then I will look 
into creating a workaround in Comic Vine Scraper so that the scraper works 
properly for 2000 AD.

Original comment by cban...@gmail.com on 30 Aug 2010 at 8:35

GoogleCodeExporter commented 9 years ago
I see you posted the bug to the ComicRack forum, thanks!

http://comicrack.cyolito.com/user-forum/10-bugs/9683-filename-parsing-error

We'll see what cYo has to say.

Original comment by cban...@gmail.com on 2 Sep 2010 at 12:46

GoogleCodeExporter commented 9 years ago
I notice cYo hasn't answered your bug posting yet.  That doesn't mean he's not 
gonna fix it--he often doesn't respond to bug reports.

I looked at the possibility of building a workaround for you into the scraper, 
but it's actually not very easy to do.  Otherwise I would have added a fix in 
for the upcoming 1.0.28 release.

For now, you could try 'mass renaming' your files with a tool like this one:

http://www.snapfiles.com/get/renamemaster.html

That should solve your immediate problem, though it still doesn't get the 2000AD 
series working properly in ComicRack/ComicVineScraper.
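
If you would rather script the rename than use a GUI tool, something like the 
following Python sketch could do it. Treat it as a rough example, not a tested 
recipe: the target name format ("2000AD #1542.cbr") is only a placeholder 
assumption, so change `template` to whatever naming ComicRack parses correctly 
on your system, and the function names are made up for this example.

```python
import os
import re

# Assumed source format: "2000AD 1542 (20-06-07).cbr" (or .cbz).
SOURCE_RE = re.compile(r"^2000\s?AD\s+(\d+)\b.*\.(cbr|cbz)$", re.IGNORECASE)


def plan_renames(filenames, template="2000AD #{issue}.{ext}"):
    """Return (old_name, new_name) pairs for the names that match SOURCE_RE."""
    plan = []
    for name in filenames:
        match = SOURCE_RE.match(name)
        if match is None:
            continue  # leave files from other series untouched
        new_name = template.format(issue=match.group(1), ext=match.group(2))
        if new_name != name:
            plan.append((name, new_name))
    return plan


def apply_renames(folder, dry_run=True):
    """Print the planned renames; only touch the files when dry_run is False."""
    for old, new in plan_renames(sorted(os.listdir(folder))):
        print("%s -> %s" % (old, new))
        if not dry_run:
            os.rename(os.path.join(folder, old), os.path.join(folder, new))
```

Calling `apply_renames(folder)` with the default `dry_run=True` only prints the 
planned changes, so you can check them before renaming ~200 files for real.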

Original comment by cban...@gmail.com on 5 Sep 2010 at 8:37

GoogleCodeExporter commented 9 years ago
That worked like a charm. Once renamed, they were all in the same series and had 
the correct values for the issue numbers.

Thanks!

Original comment by yohn....@gmail.com on 8 Sep 2010 at 2:15

GoogleCodeExporter commented 9 years ago
I've fixed the Scraper (version 1.0.33, coming soon) to nicely handle the 
common filenaming format for the "2000 AD" series, as described at the 
beginning of this bug report.

If it works properly, most people won't even notice it--their 2000 AD comics 
will just start scraping properly now.

Original comment by cban...@gmail.com on 1 Jan 2011 at 10:02