Expose task publishing through an API endpoint

crawlab-team / crawlab

Distributed web crawler admin platform for managing spiders, regardless of language or framework.

https://www.crawlab.cn

BSD 3-Clause "New" or "Revised" License

11.39k stars, 1.8k forks

Expose task publishing through an API endpoint #436

Closed Tang-1 closed 4 years ago

Tang-1 commented 4 years ago

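(This may be a pseudo-requirement.) Our Scrapy spiders are currently designed to take their input as arguments.
A single task is run with the command scrapy crawl spiderName -a param1= param2= so when we need to publish a large batch of tasks with different arguments, we end up dropping Crawlab altogether.

I think Crawlab's interface is the friendliest one available, so I would like to do more of our work through Crawlab. Thank you for your efforts.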

tikazyq commented 4 years ago

Upcoming development will support Scrapy tasks better. Thanks for your support.

tikazyq commented 4 years ago

You can solve this by setting scrapy crawl spiderName as the execute command and -a param=value as the parameters.
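A minimal sketch of how the two fields mentioned above could be assembled for one task. The helper name `build_task_fields` and the dictionary of spider arguments are hypothetical, introduced only for illustration; the only Scrapy detail relied on is that `scrapy crawl` accepts repeated `-a name=value` options.

```python
import shlex


def build_task_fields(spider_name, spider_args):
    """Return (execute_command, params) for a single Scrapy run.

    Hypothetical helper: it only formats strings and does not call Crawlab.
    """
    execute_command = f"scrapy crawl {shlex.quote(spider_name)}"
    parts = []
    for key, value in spider_args.items():
        # Each spider argument becomes its own "-a name=value" option.
        argument = f"{key}={value}"
        parts.append(f"-a {shlex.quote(argument)}")
    return execute_command, " ".join(parts)


command, params = build_task_fields("spiderName", {"param": "value"})
print(command)  # scrapy crawl spiderName
print(params)   # -a param=value
```

Quoting with `shlex.quote` keeps values that contain spaces or shell metacharacters intact when the two strings are later joined into one command line.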


(Sent by email in reply to Tang's message of 2020-01-07, 15:30.)

Tang-1 commented 4 years ago

That is exactly how I launch it now, but I have thousands of values, so it has to be run thousands of times...

tikazyq commented 4 years ago

> That is exactly how I launch it now, but I have thousands of values, so it has to be run thousands of times...

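So what you really need is a way to run in batch, right? I suggest reading the values inside the start_requests method and then using the parse method as the callback.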

Tang-1 commented 4 years ago

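Yes. The code was written by a senior colleague at my company and has been in production for a long time, so it is not convenient to change it now. If this would be troublesome for Crawlab to develop, never mind; the requirement itself is not important.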

(Sent by email in reply to Marvin Zhang's message of Wednesday, 2020-01-15, 11:06.)