istresearch / scrapy-cluster

This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster.
http://scrapy-cluster.readthedocs.io/
MIT License

Zookeeper watcher's behavior is different on python 3 #128

Closed gas1121 closed 7 years ago

gas1121 commented 7 years ago

On Python 2.7, when I run python example_zw.py -z zookeeper:2181 -f "/" --event, I get:

('The valid state is now', True)
Use a keyboard interrupt to shut down the process.

On Python 3.6, when I run python example_zw.py -z zookeeper:2181 -f "/" --event, I get:

Your file contents: b''
The valid state is now True
Use a keyboard interrupt to shut down the process.

On Python 3, the config_handler is called once when the Zookeeper watcher starts, so the crawler log shows "Zookeeper config wiped" and "Lost config from Zookeeper".
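The thread does not state what caused the difference, so the following is only a sketch of one way a config handler could be guarded against an empty initial payload. It relies on two facts visible in or consistent with the output above: on Python 3 the node contents arrive as bytes (b''), and in Python 3 b'' == '' is False, whereas in Python 2 it is True. The helper names is_empty_payload and make_handler are hypothetical and are not part of scrapy-cluster or the ZookeeperWatcher class.

```python
def is_empty_payload(data):
    """Return True when a node payload carries no usable config.

    Accepts None, str, or bytes so the check behaves the same
    on Python 2 and Python 3.
    """
    if data is None:
        return True
    if isinstance(data, bytes):
        data = data.decode("utf-8")
    return data.strip() == ""


def make_handler(on_config):
    """Wrap a config callback so that empty payloads are skipped.

    Hypothetical helper: the returned handler reports whether the
    wrapped callback was actually invoked.
    """
    def handler(data):
        if is_empty_payload(data):
            # Nothing to apply; do not treat this as a wiped config.
            return False
        text = data.decode("utf-8") if isinstance(data, bytes) else data
        on_config(text)
        return True
    return handler


if __name__ == "__main__":
    received = []
    handler = make_handler(received.append)
    handler(b"")             # skipped: empty initial payload
    handler(b"key: value")   # delivered as decoded text
    print(received)
```

Whether skipping empty payloads is the right behaviour depends on how the crawler is meant to react to a genuinely emptied node, which this thread does not settle.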

madisonb commented 7 years ago

Does this recover once the crawler boots up? So the workflow is:

gas1121 commented 7 years ago

the workflow is:

After that, the crawler seems to work as normal.

madisonb commented 7 years ago

Hmm, I would need to compare it to the crawler output from 2.7. If those are indeed the log messages, the ZookeeperWatcher class may not be Python 3 compatible. Can we reproduce the error within example_zw.py? If so, the ZookeeperWatcher class won't behave properly and it needs to be addressed. Otherwise, if the example script works as expected, the crawler should too.

gas1121 commented 7 years ago

Yes, this error can be reproduced within example_zw.py.

madisonb commented 7 years ago

OK, so that is going to be a blocker on #126 until we get it resolved. If the examples don't function the way they are supposed to, we will lose functionality, and it will cripple the Python 3 version of the project.

madisonb commented 7 years ago

This has been corrected, closing.