commoncrawl / news-crawl

News crawling with StormCrawler - stores content as WARC
Apache License 2.0
323 stars, 35 forks

Port topology and resources to StormCrawler 3.1 or upwards #60

Open jnioche opened 1 year ago

jnioche commented 1 year ago

Upgrade Apache Storm, Elasticsearch and Kibana

This way the NewsCrawler will benefit from the many bugfixes and improvements provided by these components, and it will be easier to add new functionality going forward.

alextechnology commented 11 months ago

Hello - I posted a comment in Discussions regarding 2.x not working with multiple topology workers

jnioche commented 11 months ago

Thanks @alextechnology, I'll have a look early next week.

sebastian-nagel commented 1 month ago

Development ongoing in branch 2.x