lixiang0 / WEB_KG

Crawls Chinese Baidu Baike pages, extracts triples, and builds a Chinese knowledge graph
http://kg.rubenxiao.com
928 stars 189 forks

After running scrapy crawl baike, the program exits immediately without any error. What is going on? #27

Open lpscq opened 1 year ago

lpscq commented 1 year ago

This is the result: (base) D:\bandzip\WEB_KG-master\baike>d:/ProgramData/Anaconda3/python.exe d:/bandzip/WEB_KG-master/baike/spiders/baike.py It looks like nothing was crawled. Did I do something wrong?

lixiang0 commented 1 year ago

See issue #20.

yingxiaolu commented 1 year ago

I ran it this afternoon and found that Baidu Baike has added verification, so it is hard to crawl now.

sunshinezhihuo commented 1 year ago

I ran it this afternoon and found that Baidu Baike has added verification, so it is hard to crawl now.

This is what I get. Is it the same for you? No data was crawled at all.

2023-07-03 13:00:23 [py.warnings] WARNING: C:\Users\Q.conda\envs\q\lib\site-packages\scrapy\utils\request.py:232: ScrapyDeprecationWarning: '2.6' is a deprecated value for the 'REQUEST_FINGERPRINTER_IMPLEMENTATION' setting.

It is also the default value. In other words, it is normal to get this warning if you have not defined a value for the 'REQUEST_FINGERPRINTER_IMPLEMENTATION' setting. This is so for backward compatibility reasons, but it will change in a future version of Scrapy.

See the documentation of the 'REQUEST_FINGERPRINTER_IMPLEMENTATION' setting for information on how to handle this deprecation. return cls(crawler)
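The ScrapyDeprecationWarning quoted above concerns request fingerprinting and, as its own text says, is normal when the setting is undefined, so it is unlikely on its own to explain an empty crawl. A minimal sketch of silencing it, assuming the installed Scrapy release is one of those that emit this warning and accept '2.7' as the non-deprecated value, and assuming the project keeps its settings in baike/settings.py:

```python
# baike/settings.py (fragment)
# Assumption: the installed Scrapy version accepts "2.7" for this setting.
# This only removes the deprecation warning; it does not change what
# Baidu Baike returns to the crawler.
REQUEST_FINGERPRINTER_IMPLEMENTATION = "2.7"
```

If the warning disappears but the crawl still yields nothing, the cause lies elsewhere.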

yingxiaolu commented 1 year ago

I ran it this afternoon and found that Baidu Baike has added verification, so it is hard to crawl now.

This is what I get. Is it the same for you? No data was crawled at all.

2023-07-03 13:00:23 [py.warnings] WARNING: C:\Users\Q.conda\envs\q\lib\site-packages\scrapy\utils\request.py:232: ScrapyDeprecationWarning: '2.6' is a deprecated value for the 'REQUEST_FINGERPRINTER_IMPLEMENTATION' setting.

It is also the default value. In other words, it is normal to get this warning if you have not defined a value for the 'REQUEST_FINGERPRINTER_IMPLEMENTATION' setting. This is so for backward compatibility reasons, but it will change in a future version of Scrapy.

See the documentation of the 'REQUEST_FINGERPRINTER_IMPLEMENTATION' setting for information on how to handle this deprecation. return cls(crawler)

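What you are seeing is an error report. Baidu's anti-crawling protection returns a verification page instead of the article; check the content of the response.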
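One way to act on the advice to check the response content is a small heuristic that flags responses that look like a verification page. This is only a sketch: the function name looks_like_verification_page and the marker strings are hypothetical, chosen for illustration, and should be adjusted after inspecting a real blocked response.

```python
from urllib.parse import urlparse

# Hypothetical markers (assumptions, not taken from the project):
# adjust them after looking at an actual blocked response.
SUSPECT_URL_PARTS = ("captcha", "verify", "wappass")
SUSPECT_BODY_PARTS = ("验证", "captcha")


def looks_like_verification_page(status: int, url: str, body: str) -> bool:
    """Heuristically decide whether a response is an anti-crawling page."""
    # Anything other than 200 is treated as suspicious.
    if status != 200:
        return True
    # A redirect to a verification host or path shows up in the final URL.
    parsed = urlparse(url)
    location = (parsed.netloc + parsed.path).lower()
    if any(part in location for part in SUSPECT_URL_PARTS):
        return True
    # Otherwise look for marker text near the top of the document.
    head = body[:2000].lower()
    return any(part in head for part in SUSPECT_BODY_PARTS)
```

Inside a spider callback this could be called as looks_like_verification_page(response.status, response.url, response.text), logging response.text[:500] when it returns True. Because the body check is a plain substring match, a genuine article that mentions one of the markers early on would be flagged as well.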

TomCat552 commented 10 months ago

This is the result: (base) D:\bandzip\WEB_KG-master\baike>d:/ProgramData/Anaconda3/python.exe d:/bandzip/WEB_KG-master/baike/spiders/baike.py It looks like nothing was crawled. Did I do something wrong?

In baike/settings.py, try changing the UA (User-Agent) in DEFAULT_REQUEST_HEADERS.
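The suggestion to change the UA might look like the fragment below. This is a sketch, not the project's actual settings: the header values, including the browser User-Agent string, are illustrative assumptions, and there is no guarantee that any particular UA gets past Baidu's verification.

```python
# baike/settings.py (fragment)
# Assumption: all values below are examples of what a desktop browser
# sends; replace them with ones taken from your own browser.
DEFAULT_REQUEST_HEADERS = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8",
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/120.0.0.0 Safari/537.36"
    ),
}
```

Scrapy also provides a separate USER_AGENT setting; keeping the UA in a single place makes it easier to tell which value is actually being sent.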

TomCat552 commented 10 months ago

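I ran it this afternoon and found that Baidu Baike has added verification, so it is hard to crawl now.

This is what I get. Is it the same for you? No data was crawled at all.

2023-07-03 13:00:23 [py.warnings] WARNING: C:\Users\Q.conda\envs\q\lib\site-packages\scrapy\utils\request.py:232: ScrapyDeprecationWarning: '2.6' is a deprecated value for the 'REQUEST_FINGERPRINTER_IMPLEMENTATION' setting.

It is also the default value. In other words, it is normal to get this warning if you have not defined a value for the 'REQUEST_FINGERPRINTER_IMPLEMENTATION' setting. This is so for backward compatibility reasons, but it will change in a future version of Scrapy.

See the documentation of the 'REQUEST_FINGERPRINTER_IMPLEMENTATION' setting for information on how to handle this deprecation. return cls(crawler)

In baike/settings.py, try changing the UA (User-Agent) in DEFAULT_REQUEST_HEADERS.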

Bruce-Yue commented 5 months ago

I ran into this problem too. Is there a solution yet?