Boris-code / feapder

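🚀🚀🚀feapder is an easy-to-use, powerful Python crawler framework. It ships with four built-in spiders (AirSpider, Spider, TaskSpider, and BatchSpider) for different scenarios, and supports resumable crawling, monitoring and alerting, browser rendering, and large-scale data deduplication. The feaplat crawler management system provides convenient deployment and scheduling for it.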
http://feapder.com
2.88k stars 476 forks

Cannot parse web page with Python 3.11 and feapder 1.8.5 #223

Closed pigeon-fancier closed 1 year ago

pigeon-fancier commented 1 year ago

Notice

Upgrade feapder and make sure you are on the latest version. If the bug still exists, describe the problem in detail.

pip install --upgrade feapder

The web page cannot be parsed

File "D:\application\code-env\python3.11\Lib\site-packages\feapder\core\parser_control.py", line 568, in deal_request
    results = parser.parse(request, response)
              │      │     │        └ <Response [200]>
              │      │     └ <Request https://m.baidu.com/>
              │      └ <function SpiderTest.parse at 0x000001ACC6F16DE0>
              └ <SpiderTest(Thread-1, started 16256)>

import feapder

class SpiderTest(feapder.AirSpider):
    def start_requests(self):
        # for i in range(1, 15):
        yield feapder.Request("https://m.baidu.com/")

    def parse(self, request, response):
        response.encoding_errors = 'ignore'
        print(response.text)

    def parse_detail(self, request, response):
        """
        Parse the detail page
        """
        # Get the URL
        url = request.url
        # Get the title
        title = request.title
        # Parse the body text
        content = response.xpath(
            'string(//div[@class="content"])'
        ).extract_first()  # string() returns the text under a tag, including the text of its child tags

        print("url", url)
        print("title", title)
        print("content", content)

if __name__ == "__main__":
    SpiderTest().start()
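
For context on the `response.encoding_errors = 'ignore'` line above: the value names match the `errors` argument of Python's built-in `bytes.decode`. Whether feapder passes it straight through to `decode` is an assumption here, not something confirmed in this thread. A minimal standard-library sketch of what the `strict`, `ignore`, and `replace` modes do with an invalid byte:

```python
# Standard-library illustration only; it does not call feapder.
# The payload is valid UTF-8 followed by one byte (0xFF) that can never appear in UTF-8.
raw = "百度".encode("utf-8") + b"\xff"

# Default ("strict") decoding raises on the invalid byte.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# "ignore" silently drops bytes that cannot be decoded.
ignored = raw.decode("utf-8", errors="ignore")

# "replace" substitutes U+FFFD (the replacement character) for them.
replaced = raw.decode("utf-8", errors="replace")

print(strict_failed, ignored, replaced)
```

If the failure in `parse` is a decoding error, `ignore` would hide the bad bytes rather than fix the underlying encoding mismatch.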
Boris-code commented 1 year ago

For Python 3.11, use feapder==1.8.6b6

pip3 install feapder==1.8.6b6

A stable release will be published later.
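
A quick way to check whether an environment is affected is to compare the interpreter version with the installed feapder version. The sketch below uses only the standard library. The helper names (`version_key`, `needs_upgrade`, `installed_feapder_version`) are hypothetical, and the version parser is a simplification that only understands strings such as `1.8.5` or `1.8.6b6`, not full PEP 440. Treating Python versions newer than 3.11 the same way is also an assumption.

```python
import re
import sys
from importlib import metadata

# Minimum version taken from the maintainer's reply in this thread.
MIN_FEAPDER_FOR_PY311 = "1.8.6b6"

# Simplified pattern: X.Y.Z with an optional aN / bN / rcN pre-release suffix.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:(a|b|rc)(\d+))?$")
# A final release (no suffix) sorts after any pre-release of the same X.Y.Z.
_PRE_RANK = {"a": 0, "b": 1, "rc": 2, None: 3}


def version_key(version):
    """Turn 'X.Y.Z' or 'X.Y.ZbN' into a sortable tuple."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    major, minor, patch, pre, pre_num = match.groups()
    return (int(major), int(minor), int(patch), _PRE_RANK[pre], int(pre_num or 0))


def needs_upgrade(installed, python_version=sys.version_info[:2]):
    """Return True on Python 3.11 or newer (assumption) with feapder older than 1.8.6b6."""
    if tuple(python_version) < (3, 11):
        return False
    return version_key(installed) < version_key(MIN_FEAPDER_FOR_PY311)


def installed_feapder_version():
    """Return the installed feapder version string, or None if it is not installed."""
    try:
        return metadata.version("feapder")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    current = installed_feapder_version()
    if current is None:
        print("feapder is not installed")
    elif needs_upgrade(current):
        print(f"feapder {current} found; try: pip3 install feapder=={MIN_FEAPDER_FOR_PY311}")
    else:
        print(f"feapder {current} looks fine for this Python version")
```

For example, `needs_upgrade("1.8.5", (3, 11))` flags the combination reported in this issue, while the same feapder version on Python 3.10 is not flagged.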