wings-xue / ogspider

Crawler platform

Initialize requests from the database into the scheduler #2

Closed wings-xue closed 3 years ago

wings-xue commented 4 years ago

https://github.com/wings-xue/ogspider/blob/aa8507438b73148466fc115bad115a20897003ce/spider/spider.go#L26-L36

wings-xue commented 4 years ago

future: support resuming an interrupted crawl

Initialization

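  1. Fetch jobs from the database. Jobs that the pipeline parses later are stored directly in the scheduler; if local persistence is enabled, they are also saved to the database.
  2. Convert each job into a request.
  3. The spider passes the requests to the engine.
  4. The engine pushes the requests into the scheduler.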
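
The four initialization steps above can be sketched in Go roughly as follows. This is only an illustration, not the code in `spider/spider.go`: the types `Job`, `Request`, `JobStore`, `MemoryStore`, `Scheduler`, `Engine`, and `Spider`, and the functions `JobToRequest` and `Init`, are all hypothetical names chosen for the example, and the in-memory store stands in for the real database.

```go
package main

import "fmt"

// Job is a hypothetical database row describing one crawl task.
type Job struct {
	ID  int
	URL string
}

// Request is a hypothetical crawl request derived from a Job.
type Request struct {
	JobID int
	URL   string
}

// JobStore is an assumed abstraction over the database.
type JobStore interface {
	LoadJobs() ([]Job, error)
}

// MemoryStore is an in-memory stand-in for the real database.
type MemoryStore struct{ Jobs []Job }

// LoadJobs returns the stored jobs (step 1).
func (m *MemoryStore) LoadJobs() ([]Job, error) { return m.Jobs, nil }

// Scheduler is a simple FIFO queue of pending requests.
type Scheduler struct{ queue []Request }

// Push appends a request to the end of the queue.
func (s *Scheduler) Push(r Request) { s.queue = append(s.queue, r) }

// Len reports how many requests are waiting.
func (s *Scheduler) Len() int { return len(s.queue) }

// Engine owns the scheduler and receives requests from the spider.
type Engine struct{ Scheduler *Scheduler }

// Submit pushes every request into the scheduler (step 4).
func (e *Engine) Submit(reqs []Request) {
	for _, r := range reqs {
		e.Scheduler.Push(r)
	}
}

// JobToRequest converts a job into a request (step 2).
func JobToRequest(j Job) Request { return Request{JobID: j.ID, URL: j.URL} }

// Spider ties the store and the engine together.
type Spider struct {
	Store  JobStore
	Engine *Engine
}

// Init loads jobs, converts them, and hands them to the engine (steps 1 to 3).
func (s *Spider) Init() error {
	jobs, err := s.Store.LoadJobs()
	if err != nil {
		return fmt.Errorf("load jobs: %w", err)
	}
	reqs := make([]Request, 0, len(jobs))
	for _, j := range jobs {
		reqs = append(reqs, JobToRequest(j))
	}
	s.Engine.Submit(reqs)
	return nil
}

func main() {
	store := &MemoryStore{Jobs: []Job{
		{ID: 1, URL: "https://example.com/a"},
		{ID: 2, URL: "https://example.com/b"},
	}}
	sched := &Scheduler{}
	sp := &Spider{Store: store, Engine: &Engine{Scheduler: sched}}
	if err := sp.Init(); err != nil {
		fmt.Println("init failed:", err)
		return
	}
	fmt.Println("queued requests:", sched.Len())
}
```

Putting the database behind an interface such as `JobStore` is one possible design choice; it would let the same `Init` path serve the planned resume feature by swapping in a store that returns only unfinished jobs.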