Open wanghaisheng opened 7 years ago
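waring : have checked unlink-subscription,catch forwards! Here I added a check, at line 116, that verifies whether the title exists and whether the official account matches the one you configured. You can take a look at it, or simply comment it out.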
(py2.7) ➜ catchWecaht git:(master) ✗ python dailydown.py
开始抓取公众号[weiyi_guahao]2017-09-28的文章:
user:津妈微助手name:weiyi_guahao
suceess : 抓取文章:微关注:天津市儿童医院挂号详解成功!
user:迪富信息name:weiyi_guahao
suceess : 抓取文章:微医上线首个健康电商平台,“微医严选”批量吸引健康机构合作成功!
user:代挂找我name:weiyi_guahao
suceess : 抓取文章:吉林大学第二医院网上挂号预约挂号成功!
user:报刊头条name:weiyi_guahao
suceess : 抓取文章:名医挂号难?教你一秒钟挂上京沪赣耳鼻喉名医稀缺专家号!成功!
user:广东省泗安医院name:weiyi_guahao
suceess : 抓取文章:【微科普】医用面膜哪里好,让你忘不了?成功!
user:老板商业体系name:weiyi_guahao
Traceback (most recent call last):
  File "dailydown.py", line 277, in <module>
    weixin_spider().run()
  File "dailydown.py", line 74, in run
    maincontent = self.get_list(self.search_url)
  File "dailydown.py", line 86, in get_list
    maincontent = self.get_content(list)
  File "dailydown.py", line 135, in get_content
    body = str(body).replace('data-src', 'src')
....
    formatter))
  File "/Users/wanghaisheng/anaconda/envs/py2.7/lib/python2.7/site-packages/bs4/element.py", line 1152, in decode
    text = self.format_string(val, formatter)
  File "/Users/wanghaisheng/anaconda/envs/py2.7/lib/python2.7/site-packages/bs4/element.py", line 167, in format_string
    output = formatter(s)
  File "/Users/wanghaisheng/anaconda/envs/py2.7/lib/python2.7/site-packages/bs4/element.py", line 124, in substitute_xml
    ns, EntitySubstitution.substitute_xml)
  File "/Users/wanghaisheng/anaconda/envs/py2.7/lib/python2.7/site-packages/bs4/element.py", line 108, in _substitute_if_appropriate
    if (isinstance(ns, NavigableString)
RuntimeError: maximum recursion depth exceeded in __instancecheck__
(py2.7) ➜ catchWecaht git:(master) ✗
I have put the official accounts I want to scrape into the database.
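
The traceback ends inside `str(body)` at line 135 of `dailydown.py`, so one possible workaround is to retry that conversion with a higher recursion limit. The following is only a minimal sketch, assuming the failure comes from a very deeply nested article body rather than a cyclic structure. The helper name `render_with_deeper_stack` and the `limit` value are hypothetical and not part of `dailydown.py`.

```python
import sys


def render_with_deeper_stack(node, limit=10000):
    """Return str(node), retrying once with a higher recursion limit.

    Hypothetical helper: `node` stands in for the `body` object that
    get_content() converts with str(). `limit` is an assumed value.
    """
    try:
        return str(node)
    except RuntimeError:
        # Python 2 raises RuntimeError for "maximum recursion depth
        # exceeded"; Python 3 raises RecursionError, a subclass of it.
        old_limit = sys.getrecursionlimit()
        sys.setrecursionlimit(max(old_limit, limit))
        try:
            return str(node)
        finally:
            # Restore the previous limit so the rest of the run is unaffected.
            sys.setrecursionlimit(old_limit)
```

Under these assumptions, line 135 would become something like `body = render_with_deeper_stack(body).replace('data-src', 'src')`. If the second attempt still fails, the exception propagates as before, so the caller could catch it and skip that article instead of aborting the whole run.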