Boris-code / feapder

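🚀🚀🚀feapder is an easy-to-use, powerful Python crawler framework. It ships with four built-in spiders (AirSpider, Spider, TaskSpider, and BatchSpider) to cover different scenarios, and supports resumable crawling, monitoring and alerting, browser rendering, and deduplication of massive datasets. The feaplat crawler management system provides convenient deployment and scheduling for it.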
http://feapder.com
Other
2.96k stars 481 forks

No tasks, spider finished (无任务,爬虫结束) #191

Closed aneric16088 closed 1 year ago

aneric16088 commented 1 year ago

Notice

Upgrade feapder and make sure you are on the latest version. If the bug still exists, describe the problem in detail.

pip install --upgrade feapder

The problem is that the log shows "无任务,爬虫结束" (no tasks, spider finished).


import feapder


class SpiderTest(feapder.AirSpider):
    def start_requests(self):
        for i in range(1, 15):
            yield feapder.Request("https://www.qiushibaike.com/8hr/page/{}/".format(i))

    def parse(self, request, response):
        article_list = response.xpath('//a[@class="recmd-content"]')
        for article in article_list:
            title = article.xpath("./text()").extract_first()
            url = article.xpath("./@href").extract_first()
            # print(title, url)

            yield feapder.Request(
                url, callback=self.parse_detail, title=title
            )  # callback is the callback function

    def parse_detail(self, request, response):
        """
        Parse the detail page
        """
        # Get the URL
        url = request.url
        # Get the title
        title = request.title
        # Parse the body text
        content = response.xpath(
            'string(//div[@class="content"])'
        ).extract_first()  # string() returns the text under a tag, including the text of its child tags

        print("url", url)
        print("title", title)
        print("content", content)


if __name__ == "__main__":
    SpiderTest().start()

Boris-code commented 1 year ago

Check whether article_list = response.xpath('//a[@class="recmd-content"]') actually returned any data.
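
One way to run that check offline is sketched below. If the selector matches nothing, parse yields no follow-up requests, which would be consistent with the spider stopping early. This sketch does not use feapder: it swaps in the standard library's xml.etree.ElementTree, which only supports a subset of XPath and needs well-formed markup. The helper count_matches and both sample pages are made up for illustration and are not taken from the real site.

```python
import xml.etree.ElementTree as ET


def count_matches(markup: str, xpath: str) -> int:
    """Return how many elements in `markup` match `xpath`.

    Hypothetical helper for illustration only. ElementTree rejects
    absolute paths such as "//a", so a leading "." is added.
    """
    root = ET.fromstring(markup)
    if xpath.startswith("//"):
        xpath = "." + xpath
    return len(root.findall(xpath))


# Made-up sample pages; the real page structure may differ.
PAGE_WITH_LINKS = """<html><body>
<a class="recmd-content" href="/article/1">First</a>
<a class="recmd-content" href="/article/2">Second</a>
</body></html>"""

PAGE_WITHOUT_LINKS = "<html><body><p>Nothing here</p></body></html>"

SELECTOR = '//a[@class="recmd-content"]'

matched = count_matches(PAGE_WITH_LINKS, SELECTOR)
empty = count_matches(PAGE_WITHOUT_LINKS, SELECTOR)

print("matches on page with links:", matched)
print("matches on page without links:", empty)
```

Inside the spider itself, the equivalent check would be printing or logging len(article_list) at the top of parse, so an empty result is visible before the loop is skipped.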