After running runSpider.py, it stops doing anything after a while..

gccdChen commented 7 years ago

2017-02-14 11:29:18 [10], msg:sql helper execute command:CREATE TABLE IF NOT EXISTS free_ipproxy (id INT(8) NOT NULL AUTO_INCREMENT,ip CHAR(25) NOT NULL UNIQUE,port INT(4) NOT NULL,country TEXT DEFAULT NULL,anonymity INT(2) DEFAULT NULL,https CHAR(4) DEFAULT NULL ,speed FLOAT DEFAULT NULL,source CHAR(20) DEFAULT NULL,save_time TIMESTAMP NOT NULL,PRIMARY KEY(id)) ENGINE=InnoDB
2017-02-14 11:29:19 [10], msg:***run spider waiting...**

awolfly9 commented 7 years ago

Hello, you can first check which spiders runspider.py is set to run:

items = scrapydo.run_spider(XiCiDaiLiSpider)
items = scrapydo.run_spider(SixSixIpSpider)
items = scrapydo.run_spider(IpOneEightOneSpider)
items = scrapydo.run_spider(KuaiDaiLiSpider)
items = scrapydo.run_spider(GatherproxySpider)

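If those calls are there, check the log file log/proxy.log to see the output. The reason it ends on **run spider waiting...* and appears stuck is that it is waiting for the next crawl: it has called time.sleep().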
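For illustration only, here is a minimal sketch of that run-then-sleep pattern. It is not the actual code in runspider.py: the function name crawl_forever, its parameters, and the 300-second default are all assumptions made for the example, and run_spider stands in for a call such as scrapydo.run_spider.

```python
import time


def crawl_forever(spiders, run_spider, wait_seconds=300,
                  sleep=time.sleep, max_rounds=None):
    """Run every spider in turn, then pause until the next round.

    Hypothetical helper: `run_spider` is any callable that crawls one
    spider, and `wait_seconds` is an assumed interval, not a value
    taken from the project.
    """
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        for spider in spiders:
            try:
                run_spider(spider)
            except Exception as exc:
                # One failing site should not stop the remaining spiders.
                print("spider %r failed: %s" % (spider, exc))
        rounds += 1
        print("run spider waiting...")
        # The process looks frozen here, but it is only sleeping.
        sleep(wait_seconds)
    return rounds
```

With max_rounds left as None the loop never returns, which matches what the console shows: the last line printed is the waiting message, and nothing else appears until the sleep ends.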

If you run into any problems, feel free to reply.

Best wishes


gccdChen commented 7 years ago

Oh.. so it updates every 5 minutes.. But the 5 sites yield very few IPs, only 155, and only 2 of them passed validation against douban..

gccdChen commented 7 years ago

Thanks~

awolfly9 commented 7 years ago

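Only a few sites are crawled at the moment; more will be added later. The number of validated IPs will grow over time, because IPs that work keep being retained.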

awolfly9 / IPProxyTool

After running runSpider.py, it stops doing anything after a while.. #1