More queue fetcher fixes/improvements

Fixes to address issues found in latest A/B comparison runs with scrapy/batch fetcher:

created indexer/requests_arcana.py:

worker.py, storyapp.py: create State Enum for orderly shutdown of Pika thread to avoid story loss!
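A minimal sketch of the idea behind the State Enum, not the code in worker.py or storyapp.py. The member names, the `Worker` class, and its methods are all hypothetical, and the real change coordinates with a Pika I/O thread that is left out here. It shows only why an explicit state helps: once shutdown begins, new stories are refused and everything already accepted is flushed before the worker stops.

```python
import enum
import threading
from typing import List


class State(enum.Enum):
    # Hypothetical lifecycle states; the real enum's members may differ.
    STARTING = "starting"
    RUNNING = "running"
    STOPPING = "stopping"
    STOPPED = "stopped"


class Worker:
    """Toy stand-in for a worker that hands stories to an I/O thread."""

    def __init__(self) -> None:
        self.state = State.STARTING
        self._lock = threading.Lock()
        self._pending: List[str] = []  # accepted but not yet sent
        self.flushed: List[str] = []   # stands in for "published to the queue"

    def start(self) -> None:
        with self._lock:
            self.state = State.RUNNING

    def submit(self, story: str) -> bool:
        """Accept a story only while RUNNING; report refusal to the caller."""
        with self._lock:
            if self.state is not State.RUNNING:
                return False
            self._pending.append(story)
            return True

    def shutdown(self) -> None:
        """Stop accepting work, flush what was accepted, then mark STOPPED."""
        with self._lock:
            self.state = State.STOPPING
            self.flushed.extend(self._pending)
            self._pending.clear()
            self.state = State.STOPPED
```

Because `submit` returns False outside RUNNING, a caller can tell that a story was not accepted instead of losing it silently.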

tqfetcher.py, sched.py:

raise minimum interval to 5 seconds
handle sites (Russian?) that don't return a Content-Type header
add ConnStatus.THROTTLE to respond to HTTP 429 status
add FinishRet for more/better logging of slot state (to debug/test throttling)
add tunables, including --initial-interval-seconds
set default number of worker threads with the goal of keeping 2/3 of all cores busy
add max_delayed_per_slot parameter (1/4 of prefetch) so that one site can't gum up processing
requests.Session is a context manager, so use a with statement
improve request average handling to be closer to scrapy
add comments
return & log num_delayed from Scoreboard.get_delay
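Several of the items above can be illustrated with a small standard-library sketch. This is not the code in tqfetcher.py or sched.py. Only `ConnStatus.THROTTLE` and `max_delayed_per_slot` are named in these notes; the other enum members, all function names, the `text/html` fallback, and the rounding choices are assumptions made for illustration.

```python
import enum
import os
from typing import Mapping, Optional


class ConnStatus(enum.Enum):
    # Only THROTTLE appears in the notes; OK and ERROR are hypothetical.
    OK = "ok"
    ERROR = "error"
    THROTTLE = "throttle"


def classify_status(http_status: int) -> ConnStatus:
    """Map an HTTP status code to a connection status; 429 means back off."""
    if http_status == 429:
        return ConnStatus.THROTTLE
    if 200 <= http_status < 400:
        return ConnStatus.OK
    return ConnStatus.ERROR


def content_type_or_default(headers: Mapping[str, str],
                            default: str = "text/html") -> str:
    """Return the media type, tolerating a missing Content-Type header.

    The "text/html" fallback is an assumption, not taken from the source.
    """
    for name, value in headers.items():
        if name.lower() == "content-type":
            # Drop parameters such as "; charset=utf-8".
            return value.split(";", 1)[0].strip().lower()
    return default


def default_workers(cores: Optional[int] = None) -> int:
    """Aim for roughly 2/3 of the cores, but never fewer than one thread."""
    n = cores if cores is not None else (os.cpu_count() or 1)
    return max(1, (n * 2) // 3)


def max_delayed_per_slot(prefetch: int) -> int:
    """Cap delayed stories per slot at 1/4 of prefetch (at least one)."""
    return max(1, prefetch // 4)
```

With these assumptions, 12 cores give 8 worker threads and a prefetch of 20 allows at most 5 delayed stories per slot.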
