When CVS has to ask for information, ask for everything it needs in a row, instead of asking-scraping-asking

As an improvement on Issue 161 (delay asking for information until the end of 
the batch), I'd like to suggest the following: when CVS detects multiple series 
that need user intervention, it should keep the Issue 161 behaviour (which is 
great) and wait until the end of the batch, then ask about all of those series 
in a row, and only then proceed to scrape the issues of those series.

Right now it works like so:
1) CVS detects an unscraped comic
2) Asks the user for the right series
3) Scrapes all books of that series
4) Starts over

and I would like to change that to:
1) CVS detects an unscraped comic
2) Asks the user for the right series
3) Starts over until all series are identified
4) Scrapes all unscraped comics for which CVS now knows the series
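
To make the difference concrete, here is a minimal Python sketch of the two 
orderings. It is not taken from the CVS source: the function names, the 
dictionary keys ("series", "file") and the event tuples are all hypothetical, 
and real asking and scraping are replaced by entries in a list so that only the 
order of operations is visible.

```python
# Hypothetical illustration only; none of these names come from CVS itself.

def group_by_series(books):
    """Group file names by guessed series, keeping first-seen series order."""
    groups = {}  # plain dicts preserve insertion order in Python 3.7+
    for book in books:
        groups.setdefault(book["series"], []).append(book["file"])
    return groups

def current_order(books):
    """Current behaviour: ask about a series, scrape it, then start over."""
    events = []
    for series, files in group_by_series(books).items():
        events.append(("ask", series))
        events.extend(("scrape", f) for f in files)
    return events

def proposed_order(books):
    """Proposed behaviour: ask about every series first, then scrape them all."""
    groups = group_by_series(books)
    events = [("ask", series) for series in groups]
    for files in groups.values():
        events.extend(("scrape", f) for f in files)
    return events

# A made-up batch of three unscraped comics from two series.
batch = [
    {"series": "A", "file": "a1"},
    {"series": "B", "file": "b1"},
    {"series": "A", "file": "a2"},
]
print(current_order(batch))
print(proposed_order(batch))
```

In the second ordering every "ask" event comes before the first "scrape" 
event, which is the whole point of the request: the user can answer all the 
questions and then walk away.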

This has an added benefit for large batches: after re-scraping everything it 
can, or even at the start, CVS would gather all the user input it needs in one 
go, instead of making the user sit through huge delays while possibly hundreds 
of issues are scraped between questions. (Yes, I'm adding very large batches at 
the moment :-) ).

Original issue reported on code.google.com by joaomigu...@gmail.com on 9 Oct 2011 at 8:02

Thanks for the suggestion. I was sure someone had asked for something like this 
before, but looking around, I can't find a reference to it.

What you are suggesting basically amounts to front-loading most or all of the 
user's interaction with the scraper, so that users who are familiar with what's 
going to happen can quickly answer a bunch of questions, and then walk away 
while the scraper does its thing.

I can appreciate the benefits of the idea, but there are two major drawbacks 
that make it unlikely that I'll do this anytime soon:

1) It would be very confusing for casual and first-time users. The current 
simple, basic loop (ask-scrape-ask-scrape-ask-scrape) is very intuitive and 
easy to understand--you always know what's being scraped, and if you cancel or 
an error occurs, you already have a good idea of what's been done and what 
hasn't.

2) It would require throwing away and rewriting the most difficult and 
carefully written parts of the Comic Vine Scraper--the entire application is 
built on the idea of a "processing loop", and doing something like this would 
be a very large amount of work, because it would require rewriting that loop, 
or even dropping it altogether.

If I were starting the scraper from scratch, I'm not sure that a processing 
loop is the way I'd go--there are probably other ways to let users work their 
way through their collection that are less rigid (and less difficult to 
code)...but I'm not planning to start over anytime soon, so your suggestion 
will have to stay on the back burner for now.

Original comment by cban...@gmail.com on 9 Oct 2011 at 8:48

Added labels: Priority-Low
Removed labels: Priority-Medium

cbanack / comic-vine-scraper

When CVS has to ask for information, ask for everything it needs in a row, instead of asking-scraping-asking #211