dataabc / weibo-crawler

A Sina Weibo crawler written in Python that scrapes Weibo data and downloads Weibo images and videos

Hello, every run stops at page 20, and no CSV file with the scraped content is generated in the folder, only users.csv #91

Open AaronTianzi opened 4 years ago

AaronTianzi commented 4 years ago

------------------------------已获取央视新闻(2656274875)的第20页微博------------------------------
('Error: ', UnicodeDecodeError('ascii', '/home/tz/weibo-crawler(\xe5\x91\xa8\xe7\x8b\x97)/weibo/', 23, 24, 'ordinal not in range(128)'))
Traceback (most recent call last):
  File "weibo.py", line 689, in get_filepath
    )[0] + os.sep + 'weibo' + os.sep + self.user['screen_name']
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe5 in position 23: ordinal not in range(128)
Progress:   0%| | 19/12753 [15:47<176:28:54, 49.89s/it]
('Error: ', TypeError('coercing to Unicode: need string or buffer, NoneType found',))
Traceback (most recent call last):
  File "weibo.py", line 997, in get_pages
    self.write_data(wrote_count)
  File "weibo.py", line 960, in write_data
    self.write_csv(wrote_count)
  File "weibo.py", line 721, in write_csv
    self.csv_helper(result_headers, result_data, file_path)
  File "weibo.py", line 725, in csv_helper
    if not os.path.isfile(file_path):
  File "/usr/lib/python2.7/genericpath.py", line 37, in isfile
    st = os.stat(path)
TypeError: coercing to Unicode: need string or buffer, NoneType found
信息抓取完毕
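The first traceback is a typical Python 2 str/unicode mixing failure: the path produced by os.path.split is a UTF-8 byte string (the directory name here contains non-ASCII characters), while self.user['screen_name'] is a unicode object, so the concatenation in get_filepath forces an implicit ascii decode of the byte string. A minimal sketch of the failure mode (the path and name below are illustrative, not taken from weibo.py):

```python
# -*- coding: utf-8 -*-
import os

# On Python 2, filesystem paths held as plain str are byte strings;
# this one contains non-ASCII UTF-8 bytes in the directory name.
base_dir = '/home/tz/weibo-crawler(\xe5\x91\xa8\xe7\x8b\x97)'
# JSON-decoded fields such as screen_name are unicode objects.
screen_name = u'央视新闻'

# str + unicode makes Python 2 decode base_dir with the ascii codec,
# which fails on the first non-ASCII byte (0xe5) -- the error shown above.
# On Python 3 both operands are already (unicode) str, so this works.
file_dir = base_dir + os.sep + 'weibo' + os.sep + screen_name
```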


dataabc commented 4 years ago

Thanks for the feedback.

Judging by the traceback, you are not running the latest version of the code; you can update to the latest version. If the problem persists, you can switch to Python 3. The error appears at page 20 because the program saves data every 20 pages by default, and when it writes the CSV, the file_path in os.path.isfile(file_path) is empty, which raises the error. Updating to the latest version may solve the problem.
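To illustrate the chain described here: the UnicodeDecodeError inside get_filepath is caught and only printed, so the function falls through and returns None; the CSV-writing path then hands that None to os.path.isfile, producing the second TypeError. A much simplified sketch (the stand-in function below is illustrative, not the real weibo.py code):

```python
import os

def get_filepath():
    # Stand-in for weibo.py's get_filepath: the exception is caught and
    # printed, so the function implicitly returns None.
    try:
        raise UnicodeDecodeError('ascii', b'\xe5', 0, 1, 'ordinal not in range(128)')
    except Exception as e:
        print('Error: ', e)

file_path = get_filepath()   # None, because the except branch returns nothing
os.path.isfile(file_path)    # TypeError: the path is None, not a string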

If you still have problems, feel free to report back.

AaronTianzi commented 4 years ago

Thank you! I had not updated the code to the latest version; switching to Python 3 solved it. Thanks!