commoncrawl / news-crawl

News crawling with StormCrawler - stores content as WARC
Apache License 2.0

Route tuples to the status updater bolt based on URLs #65

Closed by jnioche 11 months ago

jnioche commented 11 months ago

See #63

When more than one instance of the status updater bolt is used, tuples need to be routed to an instance based on their URL, so that a given URL always reaches the same instance and that instance's cache can be leveraged.
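
To illustrate the idea, here is a minimal, standalone sketch of key-based routing in plain Java. It is not the code from this repository: the class `UrlRouting` and the method `taskFor` are hypothetical names, and the hash-modulo scheme is only an assumption about how a URL could be mapped to an instance. In a Storm topology the same guarantee is usually obtained with `fieldsGrouping` on the field carrying the URL (for example `new Fields("url")`, assuming the tuple has a field with that name) rather than a shuffle grouping.

```java
public class UrlRouting {

    // Hypothetical helper: maps a URL to one of numTasks bolt instances.
    // The same URL always yields the same index, so the instance that
    // cached this URL is the one that sees it again.
    static int taskFor(String url, int numTasks) {
        // floorMod keeps the result in [0, numTasks) even when
        // hashCode() is negative.
        return Math.floorMod(url.hashCode(), numTasks);
    }

    public static void main(String[] args) {
        int numTasks = 4; // assumed parallelism of the status updater bolt
        String[] urls = {
            "https://example.com/news/1",
            "https://example.com/news/2",
            "https://example.com/news/1" // repeat: routed like the first
        };
        for (String url : urls) {
            System.out.println(url + " -> task " + taskFor(url, numTasks));
        }
    }
}
```

With a shuffle grouping, the repeated URL could land on any instance and miss the cache; with routing keyed on the URL it always lands on the instance that handled it before.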