istresearch / scrapy-cluster

This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster.
http://scrapy-cluster.readthedocs.io/
MIT License

Can't up cluster from master branch #227

Closed: MasterSergius closed this issue 4 years ago

MasterSergius commented 4 years ago

After running docker-compose up, I ran into several errors. Kafka container logs:

2019-09-19T13:58:04.022988200Z ERROR: No listener or advertised hostname configuration provided in environment.
2019-09-19T13:58:04.023037600Z        Please define KAFKA_LISTENERS / (deprecated) KAFKA_ADVERTISED_HOST_NAME

If I modify docker-compose.yml to add KAFKA_ADVERTISED_HOST_NAME:

KAFKA_ADVERTISED_HOST_NAME: localhost

Kafka looks OK, but the crawler is still unable to connect to it. Logs from the crawler container:

2019-09-19T14:14:07.409019000Z 2019-09-19 14:14:07,408 [sc-crawler] INFO: Changed Public IP: None -> b'156.67.51.8'
2019-09-19T14:14:07.439196700Z 2019-09-19 14:14:07,438 [sc-crawler] ERROR: Could not ping Zookeeper
2019-09-19T14:14:07.439868500Z Unhandled error in Deferred:
2019-09-19T14:14:07.440134200Z 
2019-09-19T14:14:11.012381900Z 2019-09-19 14:14:11,012 [sc-crawler] ERROR: Unable to connect to Kafka in Pipeline, raising exit flag.
2019-09-19T14:14:11.012991200Z Unhandled error in Deferred:

And the most interesting part: everything works if I bring the cluster up from the dev branch!

madisonb commented 4 years ago

Hello, yes, I would highly recommend the dev branch. The master branch contains the latest official release, but all new work and maintenance on this repo is done on the dev branch.
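
For anyone debugging the "Could not ping Zookeeper" and "Unable to connect to Kafka" errors quoted above, a plain TCP reachability check run from inside the crawler container can help separate a networking problem from a configuration problem. The following is only a minimal sketch: the service names `zookeeper` and `kafka` and the ports 2181 and 9092 are assumptions (common defaults, not taken from this thread) and should be adjusted to match the docker-compose.yml in use. The helper names `can_connect` and `check_services` are hypothetical.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS lookup failures, refused connections, and timeouts.
        return False


def check_services(services, timeout=3.0):
    """Map each service name to whether its (host, port) pair is reachable."""
    return {
        name: can_connect(host, port, timeout)
        for name, (host, port) in services.items()
    }


# Assumed service hostnames and ports; adjust to match docker-compose.yml.
ASSUMED_SERVICES = {
    "zookeeper": ("zookeeper", 2181),
    "kafka": ("kafka", 9092),
}
```

Calling `check_services(ASSUMED_SERVICES)` from inside the crawler container returns a dict of booleans, one per service. Note that a successful TCP connection only shows that the port is reachable. It does not confirm that Kafka advertises an address the crawler can use, which may be the case when the advertised host name is set to localhost.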