ZouJiu1 / zhihu_spider_selenium

Crawls the thoughts, articles, and answers from a Zhihu personal homepage
MIT License
35 stars 12 forks

After a failed login, there is no way to log in again. #10

Open direct-rule-form-londn opened 4 days ago

direct-rule-form-londn commented 4 days ago

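I followed the instructions in your README. The first time I ran python crawler.py, the login page opened successfully, but I closed the window in a hurry after logging in, and the crawling steps that followed failed. When I run python crawler.py again, the Zhihu login window pops up and closes within a second or two, so I have no chance to log in. The console output is as follows: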

`C:\Users\liang\Desktop\zhihu_project\zhihu_spider_selenium-master\crawler.py:1128: SyntaxWarning: invalid escape sequence '\m'
  driverpath = os.path.join(abspath, 'msedgedriver\msedgedriver.exe')

DevTools listening on ws://127.0.0.1:61658/devtools/browser/da3915e6-abb4-490a-bd17-dc75f38fcc03
需要登陆并保存cookie,下次就不用登录了。
Traceback (most recent call last):
  File "C:\Users\liang\Desktop\zhihu_project\zhihu_spider_selenium-master\crawler.py", line 1014, in login_loadsavecookie
    load_cookie(driver, cookie_path)
  File "C:\Users\liang\Desktop\zhihu_project\zhihu_spider_selenium-master\crawler.py", line 48, in load_cookie
    with open(path, 'rb') as cookiesfile:
         ^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'C:\Users\liang\Desktop\zhihu_project\zhihu_spider_selenium-master\cookie\cookie_zhihu.pkl'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\liang\Desktop\zhihu_project\zhihu_spider_selenium-master\crawler.py", line 1083, in zhihu
    driver, username = login_loadsavecookie()
                       ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\liang\Desktop\zhihu_project\zhihu_spider_selenium-master\crawler.py", line 1024, in login_loadsavecookie
    driver = login(driver)
             ^^^^^^^^^^^^^
  File "C:\Users\liang\Desktop\zhihu_project\zhihu_spider_selenium-master\crawler.py", line 131, in login
    driver.find_elements(By.CLASS_NAME, "SignFlow-tab")[1].click()
IndexError: list index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\liang\Desktop\zhihu_project\zhihu_spider_selenium-master\crawler.py", line 1176, in <module>
    zhihu()
  File "C:\Users\liang\Desktop\zhihu_project\zhihu_spider_selenium-master\crawler.py", line 1085, in zhihu
    os.remove(os.path.join(abspath, 'msedgedriver', "msedgedriver.exe"))
PermissionError: [WinError 5] アクセスが拒否されました。: 'C:\\Users\\liang\\Desktop\\zhihu_project\\zhihu_spider_selenium-master\\msedgedriver\\msedgedriver.exe'`

Is there a solution for this?
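
Read top to bottom, the output above looks like a chain of four separate problems: a backslash inside a path literal, a cookie file that was never written, an index into a possibly empty `find_elements` result, and an attempt to delete `msedgedriver.exe` while it may still be running. The sketch below shows one defensive way to handle each step. It is not the repository's actual fix. All four helper names are hypothetical, and it assumes the cookie file is a plain pickle, which the traceback suggests (`open(path, 'rb')` on a `.pkl` file) but does not confirm.

```python
import os
import pickle
from typing import Any, Optional, Sequence


def driver_path(base_dir: str) -> str:
    # Hypothetical helper. Passing each path component separately keeps
    # backslashes out of string literals, which avoids the
    # "invalid escape sequence" SyntaxWarning seen at crawler.py:1128.
    return os.path.join(base_dir, "msedgedriver", "msedgedriver.exe")


def load_cookies_if_present(path: str) -> Optional[Any]:
    # Hypothetical helper. Returns the unpickled cookies, or None when no
    # cookie file exists yet (for example, after a login that was
    # interrupted before the cookies were saved).
    if not os.path.isfile(path):
        return None
    with open(path, "rb") as cookiesfile:
        return pickle.load(cookiesfile)


def nth_or_none(elements: Sequence[Any], index: int) -> Optional[Any]:
    # Hypothetical helper. find_elements returns a list that can be empty
    # or shorter than expected, so check the length before indexing.
    return elements[index] if 0 <= index < len(elements) else None


def remove_if_possible(path: str) -> bool:
    # Hypothetical helper. On Windows a running executable cannot be
    # deleted, so the browser session should be ended (driver.quit())
    # before this is called. Returns False instead of raising when the
    # file is still locked.
    try:
        os.remove(path)
        return True
    except FileNotFoundError:
        return True  # already gone, nothing to do
    except PermissionError:
        return False
```

With helpers like these, the login code could show the login page whenever `load_cookies_if_present` returns `None`, and skip the click when `nth_or_none(driver.find_elements(By.CLASS_NAME, "SignFlow-tab"), 1)` returns `None`, rather than letting one exception cascade into the next.
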
direct-rule-form-londn commented 4 days ago

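While I'm here, one more question: does the answer-crawling feature also capture the original question text, the upvote count, the comments, and other such information along with each answer?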

ZouJiu1 commented 2 days ago

Fixed. Crawled answers now include the question, upvotes, and comments.

direct-rule-form-londn commented 22 hours ago

> fixed, including question, up-vote and comment

Thank you. All right, it would be even better if it could also capture the upvote count and the comments.