istresearch / scrapy-cluster

This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster.
http://scrapy-cluster.readthedocs.io/
MIT License

Kafka startup fails - Docker compose #112

Closed thebennos closed 7 years ago

thebennos commented 7 years ago

Hey, the docker-compose setup fails. The Kafka Docker container stops with this output: org.apache.kafka.common.config.ConfigException: Invalid value for configuration advertised.port: Not a number of type INT

Found this https://github.com/wurstmeister/kafka-docker/issues/81

The kafka service is missing two environment variables. They need to be added to the kafka service in the compose file:

environment:
  KAFKA_ADVERTISED_HOST_NAME: "kafka"
  KAFKA_ADVERTISED_PORT: "9092"
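
As a rough illustration of why the broker rejects the configuration, here is a minimal Python sketch of a pre-flight check for these two variables. The helper name check_kafka_env is hypothetical and is not part of scrapy-cluster or the Kafka image; it only assumes, based on the error message, that the advertised port must parse as an integer.

```python
def check_kafka_env(env):
    """Return a list of problems found in the advertised host/port settings.

    `env` is any mapping of environment variable names to string values,
    for example os.environ. This is an illustrative check only.
    """
    problems = []
    host = env.get("KAFKA_ADVERTISED_HOST_NAME", "")
    port = env.get("KAFKA_ADVERTISED_PORT", "")

    if not host:
        problems.append("KAFKA_ADVERTISED_HOST_NAME is missing or empty")

    # An unset or empty port cannot be parsed as an integer, which is
    # assumed here to be what triggers "Not a number of type INT".
    try:
        int(port)
    except ValueError:
        problems.append("KAFKA_ADVERTISED_PORT is not an integer: %r" % port)

    return problems


if __name__ == "__main__":
    import os

    for problem in check_kafka_env(os.environ):
        print(problem)
```

Run inside the container's environment, an empty result would suggest both variables are set to usable values.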

madisonb commented 7 years ago

Can you give me better steps to reproduce this? The compose file here is what runs on Travis to execute the docker container integration tests like the one here, and it seems to do just fine. It's extremely similar to the one located in the base of the project.

I also just did a clean pull of this project and ran the following:

$ docker-compose pull
$ docker-compose up -d
# run a vanilla command to test if things are working
$ curl -XPOST localhost:5343/feed -d '{"action":"info", "appid":"test", "uuid":"abc123", "spiderid":"link"}' -H "Content-type:application/json"
{
  "data": {
    "appid": "test",
    "crawlids": {},
    "server_time": 1491180169,
    "spiderid": "link",
    "total_crawlids": 0,
    "total_domains": 0,
    "total_pending": 0,
    "uuid": "abc123"
  },
  "error": null,
  "status": "SUCCESS"
}

The command above uses the REST service to write a request to a Kafka topic. The Kafka monitor reads that request and writes it into Redis, where the Redis monitor acts on it. The result is then written back to Kafka and returned through the REST service.
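
For anyone who would rather script the same smoke test, here is a minimal Python sketch of the curl call above, using only the standard library. The URL, path, and payload are copied from that command; the function name build_info_request is hypothetical and not part of the project.

```python
import json
import urllib.request


def build_info_request(base_url="http://localhost:5343"):
    """Build the same POST request as the curl command above.

    The payload fields are copied from that command; nothing is sent
    until the request is passed to urllib.request.urlopen.
    """
    payload = {
        "action": "info",
        "appid": "test",
        "uuid": "abc123",
        "spiderid": "link",
    }
    return urllib.request.Request(
        base_url + "/feed",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Requires the docker-compose stack to be up and listening on port 5343.
    with urllib.request.urlopen(build_info_request(), timeout=30) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    print(body.get("status"))
```

If the whole chain is healthy, the printed status should match the "status" field in the response shown above.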

Everything seems to be talking fine, so I would be curious to know what is causing issues.

thebennos commented 7 years ago

I cannot really say why this happens. It appears after deployment. I use my own Rancher (version 1.5) setup with 8 live servers, and the container does not start cleanly without the additional environment variables. Only the error message above appears, so I did some quick research.

I do not think adding the environment variables is a problem. They may be needed in some cases or Docker setups.

madisonb commented 7 years ago

Do you have steps to resolve this issue so I can close it? The docker-compose.yml files provided by this project are not meant to represent a production-level configuration, just one to help users get started.

If that's what your experience is I would like to close this.