mediacloud / story-indexer

The core pipeline used to ingest online news stories in the Media Cloud archive.
https://mediacloud.org
Apache License 2.0

More queue fetcher fixes/improvements #272

Closed: philbudne closed this 7 months ago

philbudne commented 7 months ago

Fixes to address issues found in latest A/B comparison runs with scrapy/batch fetcher:

created indexer/requests_arcana.py:

worker.py, storyapp.py: create State Enum for orderly shutdown of Pika thread to avoid story loss!

tqfetcher.py, sched.py:
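To illustrate the worker.py/storyapp.py item above (a State Enum for orderly shutdown of the Pika thread), here is a minimal sketch. It is not the code from this PR. The names `QueueWorker`, `send`, `stop`, and `sent` are hypothetical, and a plain `queue.Queue` plus an in-memory list stand in for the real Pika connection, so the example runs without a broker. The idea it tries to show: once the state is `STOPPING`, no new stories are accepted, and the I/O thread only exits after everything already queued has been handled.

```python
import enum
import queue
import threading


class State(enum.Enum):
    """Lifecycle of the (hypothetical) I/O thread."""
    STARTING = "starting"
    RUNNING = "running"
    STOPPING = "stopping"  # no new work accepted; queued work is still drained
    STOPPED = "stopped"


class QueueWorker:
    """Hypothetical stand-in for a worker that owns a Pika I/O thread."""

    def __init__(self):
        self.state = State.STARTING
        self._lock = threading.Lock()   # guards state and enqueueing together
        self._outbox = queue.Queue()    # stories waiting to be "published"
        self.sent = []                  # stand-in for messages sent to the broker
        self._thread = threading.Thread(target=self._io_loop, daemon=True)

    def start(self):
        with self._lock:
            self.state = State.RUNNING
        self._thread.start()

    def send(self, story):
        # Check the state and enqueue under one lock, so a story can never
        # be queued after stop() has moved the state to STOPPING.
        with self._lock:
            if self.state is not State.RUNNING:
                raise RuntimeError(f"cannot send in state {self.state.name}")
            self._outbox.put(story)

    def stop(self):
        with self._lock:
            if self.state is not State.RUNNING:
                return
            self.state = State.STOPPING
        self._thread.join()  # returns only after the queue has been drained

    def _io_loop(self):
        while True:
            # Read the state BEFORE polling the queue: if we were already
            # stopping, an empty queue afterwards means it is empty for good.
            with self._lock:
                stopping = self.state is State.STOPPING
            try:
                story = self._outbox.get(timeout=0.05)
            except queue.Empty:
                if stopping:
                    break
                continue
            # A real worker would publish/ack via Pika here.
            self.sent.append(story)
        with self._lock:
            self.state = State.STOPPED
```

With this shape, calling `stop()` right after a burst of `send()` calls still delivers every queued story before the thread exits, and a late `send()` fails loudly instead of silently dropping the story.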

philbudne commented 7 months ago

I've pushed updates to address these