Le0nsec / SecCrawler

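A crawler and push-notification program that makes it easy for security researchers to get a daily security digest. It currently crawls the XianZhi community, Anquanke, Seebug Paper, Tttang, the QiAnXin attack-defense community, and EdgeForum, as well as lab blogs from NSFOCUS, Tencent Xuanwu, Topsec, 360, and others. Under continuous development.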
GNU General Public License v3.0
889 stars 143 forks

XianZhi crawl error #18

Closed firmianay closed 2 years ago

firmianay commented 2 years ago
$ uname -a                                      
Linux firmy-pc 5.13.0-37-generic #42~20.04.1-Ubuntu SMP Tue Mar 15 15:44:28 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Something seems to be wrong:

[!] 2022/03/28 11:00:11 crawl [XianZhi] error: no such element: no such element: Unable to locate element: {"method":"xpath","selector":"/html/body/pre"}
  (Session info: headless chrome=99.0.4844.84)
  (Driver info: chromedriver=99.0.4844.51 (d537ec02474b5afe23684e7963d538896c63ac77-refs/branch-heads/4844@{#875}),platform=Linux 5.13.0-37-generic x86_64)

config.yml

cat config.yml 
ChromeDriver: ./chromedriver
Proxy:
  ProxyUrl: http://127.0.0.1:7890
  CrawlerProxyEnabled: false
  BotProxyEnabled: false
Cron:
  enabled: true
  time: 11
Api:
  enabled: false
  debug: false
  host: 127.0.0.1
  port: 8080
  auth: auth_key_here
Crawler:
  EdgeForum:
    enabled: true
  XianZhi:
    enabled: true
    UseChromeDriver: true
    CustomRSSURL: ""
  SeebugPaper:
    enabled: true
  Anquanke:
    enabled: true
  Tttang:
    enabled: true
  QiAnXin:
    enabled: true
  Lab:
    enabled: true
    NoahLab:
      enabled: true
    Blog360:
      enabled: true
    Nsfocus:
      enabled: true
    Xlab:
      enabled: true
    AlphaLab:
      enabled: true
    Netlab:
      enabled: true
    RiskivyBlog:
      enabled: true
    TSRCBlog:
      enabled: true
    X1cT34m:
      enabled: true
Bot:
  WecomBot:
    enabled: false
    key: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
    timeout: 2
  FeishuBot:
    enabled: true
    key: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
    timeout: 2
  DingBot:
    enabled: false
    token: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    timeout: 2
  HexQBot:
    enabled: false
    api: http://xxxxxx.com/send
    qqgroup: 0
    key: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
    timeout: 2
  ServerChan:
    enabled: false
    sendkey: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    timeout: 2
  WgpSecBot:
    enabled: false
    key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    timeout: 2
Le0nsec commented 2 years ago

Does this happen on every run? I can't reproduce it on my side.

Le0nsec commented 2 years ago

Also, check whether your Chrome and ChromeDriver versions match.

firmianay commented 2 years ago

They differ slightly. Does that matter? Both are the latest versions right now: chrome=99.0.4844.84, chromedriver=99.0.4844.51

Le0nsec commented 2 years ago

The versions look fine; matching major versions is enough. You could check whether something else is causing it.
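
For anyone hitting the same error, the major-version comparison can be scripted against the log output. The following is a minimal sketch in Python, not part of SecCrawler; the function names `parse_versions`, `major`, and `majors_match` are hypothetical, and it assumes the log contains `chrome=` and `chromedriver=` fields in the form shown in the error report above.

```python
import re


def parse_versions(log_text: str) -> dict:
    """Extract the 'chrome' and 'chromedriver' version strings from an error log."""
    pattern = re.compile(r"\b(chrome|chromedriver)=(\d+(?:\.\d+)*)")
    return {name: version for name, version in pattern.findall(log_text)}


def major(version: str) -> int:
    """Return the major component of a dotted version string."""
    return int(version.split(".")[0])


def majors_match(log_text: str) -> bool:
    """True when Chrome and ChromeDriver report the same major version."""
    versions = parse_versions(log_text)
    return major(versions["chrome"]) == major(versions["chromedriver"])


# Log lines copied from the report in this issue.
LOG = """(Session info: headless chrome=99.0.4844.84)
(Driver info: chromedriver=99.0.4844.51 (d537ec02474b5afe23684e7963d538896c63ac77-refs/branch-heads/4844@{#875}),platform=Linux 5.13.0-37-generic x86_64)"""

if __name__ == "__main__":
    print(parse_versions(LOG))
    print("major versions match:", majors_match(LOG))
```

A matching major version only rules out one cause; it does not by itself show that the browser and driver work together.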

firmianay commented 2 years ago

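Upgrading Chrome to 100 fixed it, though I don't know why. Separately, a suggestion: when the ChromeDriver setting is empty, it could default to the chromedriver on PATH.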
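
The suggested fallback could look roughly like the sketch below. It is written in Python purely for illustration and is not the project's code; `resolve_chromedriver` is a hypothetical helper, and treating a whitespace-only value as empty is an assumption.

```python
import shutil
from typing import Optional


def resolve_chromedriver(configured: str) -> Optional[str]:
    """Return the configured driver path if set, else look up 'chromedriver' on PATH.

    Returns None when the setting is empty and no chromedriver is found on PATH.
    """
    configured = configured.strip()
    if configured:
        # An explicit setting (e.g. "./chromedriver" in config.yml) always wins.
        return configured
    # shutil.which searches the directories listed in the PATH environment variable.
    return shutil.which("chromedriver")


if __name__ == "__main__":
    print(resolve_chromedriver("./chromedriver"))
    print(resolve_chromedriver(""))
```

With this shape, an existing `ChromeDriver: ./chromedriver` setting behaves as before, and only an empty value triggers the PATH lookup.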

Le0nsec commented 2 years ago

OK.