ludoux / ngapost2md

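Azeroth National Geographic forum / NGA player community / NGA single-thread crawler / one-click thread archiving for NGA ("牛国安"), so you need not worry about threads being flushed away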
MIT License
102 stars, 9 forks

Posts of the main zone cannot be processed. #5

Closed: crella6 closed this issue 4 years ago

crella6 commented 4 years ago

windows cmd>python nga_20200623.py
tid:22959845
22959845
localmaxpage1
localmaxfloor-1
trypage1
Traceback (most recent call last):
  File "nga_20200623.py", line 185, in <module>
    main()
  File "nga_20200623.py", line 144, in main
    holder()
  File "nga_20200623.py", line 163, in holder
    while single(cpage) != False:
  File "nga_20200623.py", line 41, in single
    usertext = re.search(r',"U":(.+?),"R":', content, flags=re.S).group(1)
AttributeError: 'NoneType' object has no attribute 'group'

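I am a Chinese user. It seems that many threads in the main zone of the 网事杂谈 board cannot be fetched; they fail with the error above. Threads in the sub-zones can be fetched. I suggest testing with more threads from the main zone.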

ludoux commented 4 years ago

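Hello. I can't use a computer right now, so I have no way to reproduce this. There are two possible directions for troubleshooting, though:

  1. Is the cookie configured correctly? The 马桶区 (the "toilet zone", i.e. the main zone) generally requires login to view, while the sub-zones generally do not. If the cookie is misconfigured, the script may be unable to fetch data in a logged-in state. The script currently does not handle this case and will simply throw an exception.
  2. Is the thread in a normal state (not locked)? When a thread is locked, the same kind of unhandled exception occurs.

(I saved a thread from the 水区 (the "water zone") just a few days ago, so I personally suspect a cookie problem 😅. In a few days I may add error output for the not-logged-in and locked-thread cases.)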
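For illustration, the kind of error output described above could look like the following minimal sketch. It is not the script's actual code: the regular expression is copied from the traceback in this issue, while `extract_usertext` and `PageParseError` are hypothetical names introduced here. It assumes that a response fetched without a valid login, or for a locked thread, simply lacks the `"U"` block, so `re.search` returns `None`.

```python
import re

# Pattern copied from the traceback in this issue.
USER_BLOCK = re.compile(r',"U":(.+?),"R":', flags=re.S)


class PageParseError(Exception):
    """Hypothetical exception: the response lacks the expected user data."""


def extract_usertext(content: str) -> str:
    """Return the raw "U" block, or raise a readable error if it is missing."""
    match = USER_BLOCK.search(content)
    if match is None:
        # re.search returns None when nothing matches; calling .group(1)
        # on it is what produced the AttributeError in the traceback.
        raise PageParseError(
            "User data not found in the response. Check that the cookie "
            "is configured correctly and that the thread is not locked."
        )
    return match.group(1)
```

With a helper like this, a missing or wrong cookie would surface as a message pointing at the likely cause instead of `'NoneType' object has no attribute 'group'`.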


In reply to crella6's message of Friday, August 14, 2020, 5:07:24 PM (Subject: [ludoux/ngapost2md] Posts of the main zone cannot be processed. (#5))

crella6 commented 4 years ago

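Ah, sorry about that. I had indeed forgotten to fill in the cookie information. I was puzzled the whole time and could not figure out what was wrong with the regular expression.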