Boris-code / feapder

🚀🚀🚀feapder is an easy-to-use, powerful Python crawler framework. It ships with four spider classes, AirSpider, Spider, TaskSpider and BatchSpider, to cover different scenarios, and supports resumable crawling, monitoring and alerting, browser rendering, and large-scale data deduplication. The feaplat crawler management system provides convenient deployment and scheduling.
http://feapder.com

How do I save data to a CSV file? #216

Closed. liuchangfu closed this issue 1 year ago.

liuchangfu commented 1 year ago


Boris-code commented 1 year ago

Take a look at https://feapder.com/#/source_code/pipeline

liuchangfu commented 1 year ago

Take a look at https://feapder.com/#/source_code/pipeline

I don't quite follow it. How are those two parameters passed in? Could you explain?

Boris-code commented 1 year ago

They are passed in by the framework. What you need to do is:

  1. Configure it in setting
    # Pipelines used to store the data; multiple pipelines are supported
    ITEM_PIPELINES = [
        "pipeline.Pipeline"  # pipeline module name.class name
    ]
  2. Implement the pipeline

    from feapder.pipelines import BasePipeline
    from typing import Dict, List


    class Pipeline(BasePipeline):
        """
        The pipeline is single-threaded and saves data in batches.
        Do not make network requests here, such as downloading images.
        """

        def save_items(self, table, items: List[Dict]) -> bool:
            """
            Save the data
            Args:
                table: table name
                items: data, [{}, {}, ...]

            Returns: whether the save succeeded, True / False
                     If False, this batch is not added to the dedup store, so it can be stored again later

            """

            # save the data to a file or database here

            return True

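For the CSV file this issue asks about, here is a minimal sketch of such a pipeline, assuming every item in a batch shares the same keys and that appending to a file named after the table is acceptable; the CsvPipeline class name and the output path are illustrative, not part of feapder's API:

    import csv
    import os
    from typing import Dict, List

    from feapder.pipelines import BasePipeline


    class CsvPipeline(BasePipeline):
        """Appends each batch of items to <table>.csv, writing the header row only once."""

        def save_items(self, table, items: List[Dict]) -> bool:
            if not items:
                return True
            try:
                path = f"{table}.csv"
                write_header = not os.path.exists(path)
                with open(path, "a", newline="", encoding="utf-8") as f:
                    # use the keys of the first item as the CSV header
                    writer = csv.DictWriter(f, fieldnames=list(items[0].keys()))
                    if write_header:
                        writer.writeheader()
                    writer.writerows(items)
                return True
            except Exception as e:
                print(e)
                # returning False keeps this batch out of the dedup store so it can be saved again
                return False

It would be registered the same way as above, e.g. ITEM_PIPELINES = ["pipeline.CsvPipeline"].
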
liuchangfu commented 1 year ago

Figured it out, thanks. This framework is much easier to use than scrapy, and the documentation is clearly written, easy to learn, and easy to get started with.