-
Hello, I'm back with another question. I used this code to crawl the Weibo posts of several accounts, but the generated CSV contains a large number of duplicates. For example, I crawled @精品购物指南, which has about 28,000 posts, yet the CSV holds 43,400 rows, so many of them are repeated. What could be causing this? Thank you.
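If a post-processing pass is acceptable, a minimal deduplication sketch like the one below can clean the export. It assumes the CSV has an id column that uniquely identifies each post; the file names are placeholders, not the project's actual output names.

```python
import pandas as pd

# Hedged sketch: drop duplicate rows from an exported CSV by the post id.
# 'weibo.csv' and the column name 'id' are assumptions about the export;
# adjust them to match the real file.
df = pd.read_csv('weibo.csv', dtype=str)
before = len(df)
df = df.drop_duplicates(subset=['id'], keep='first')
df.to_csv('weibo_dedup.csv', index=False, encoding='utf-8-sig')
print('removed %d duplicate rows' % (before - len(df)))
```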
-
Selected Weibo content
-
File "C:/Users/23318/PycharmProjects/Crawler/machineLearing/old/Learning2Ask_TypedDecoder-master/STD_code_final/main.py", line 43, in load_data
with open('%s/%s.post' % (path, fname)) as f:
Fil…
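The exception type is cut off in this excerpt, but since the failure happens at the open() call in load_data, a heavily hedged diagnostic stand-in like the following (path, fname, and the .post suffix are taken from the traceback; everything else is assumption) can at least report exactly which file it was trying to read:

```python
import os

def load_data(path, fname):
    # Diagnostic stand-in for the load_data in the traceback above:
    # report the exact file being opened so a missing or misnamed
    # .post file is easy to spot, and read with an explicit encoding.
    post_file = '%s/%s.post' % (path, fname)
    if not os.path.exists(post_file):
        raise FileNotFoundError('expected data file not found: %s' % post_file)
    with open(post_file, encoding='utf-8') as f:
        return f.readlines()
```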
-
Extra data: line 128 column 2 (char 4697)
Traceback (most recent call last):
File "weibo.py", line 766, in get_one_weibo
weibo = self.get_long_weibo(weibo_id)
File "weibo.py", line 351, in…
-
D:\Desktop\爬虫\weibo-search-master>scrapy crawl search
Traceback (most recent call last):
File "C:\Users\lenovo\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_ma…
-
There is no problem with a single ID; the error only appears when the input is read from a txt file.
![image](https://user-images.githubusercontent.com/126164609/236419926-e3520615-4003-462e-8b2b-fdf8f0088e43.png)
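When the ids come from a txt file, invisible characters are a frequent culprit: a BOM at the start, trailing spaces, blank lines, or extra columns after the id. A hedged sketch of a tolerant reader (the one-id-per-line layout, optionally followed by a nickname, is an assumption about the file):

```python
def read_user_ids(path):
    """Hedged sketch: load user ids from a txt file, one per line.

    Opens with utf-8-sig so a BOM does not stick to the first id,
    skips blank lines and comments, and keeps only the first
    whitespace-separated token (the id) of each line.
    """
    ids = []
    with open(path, encoding='utf-8-sig') as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith('#'):
                continue
            ids.append(line.split()[0])
    return ids
```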
-
weibo.py", line 21, in
import requests
ModuleNotFoundError: No module named 'requests'. How do I fix this error?
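This usually means the requests package is not installed for the Python interpreter that runs weibo.py; in most setups `pip install requests` (or `python -m pip install requests` to be sure the matching interpreter is targeted), or installing all of the project's dependencies with `pip install -r requirements.txt`, resolves it.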
-
The weibo-crawler folder is already there, and it does contain a requirements.txt file, but running the pip install command gives ERROR: Could not open requirements file: [Errno 2] No such file or directory: 'requirements.txt'.
I know the problem seems to be that the command is not being run in that directory, but I don't know how to…
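pip resolves a bare requirements.txt against the current working directory, so the command has to be run from inside the weibo-crawler folder (for example, `cd weibo-crawler` first) or be given the file's path, e.g. `pip install -r weibo-crawler/requirements.txt` when the folder sits directly under the current directory.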
-
(rectflow) amax@amax:/data/lu/yyl/weibo-search/weibo-search-master$ scrapy crawl search -s JOBDIR=crawls/search -L INFO
[1]+ Done scrapy crawl search -s JOBDIR=crawls/search
2023-04-08…
-
Crawling worked fine before, but after about 60 accounts it raised: ValueError: time data 'Fri Jan 15 08:36:13 +0800 2021' does not match format '%Y-%m-%d'. How should this be fixed? The content of the all.log file is below:
Progress: 0%| …
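'Fri Jan 15 08:36:13 +0800 2021' is the raw created_at format the Weibo API returns, while the failing code clearly expects an already-normalized '%Y-%m-%d' string. Without presuming how the project itself handles this, a standalone sketch of normalizing such a timestamp before comparison could look like the following (note that %a and %b only parse under an English locale):

```python
from datetime import datetime

RAW_FORMAT = '%a %b %d %H:%M:%S %z %Y'  # e.g. 'Fri Jan 15 08:36:13 +0800 2021'

def normalize_created_at(value):
    """Hedged sketch: accept either the raw API timestamp or a plain
    'YYYY-MM-DD' string and return a 'YYYY-MM-DD' date string."""
    for fmt in (RAW_FORMAT, '%Y-%m-%d'):
        try:
            return datetime.strptime(value, fmt).strftime('%Y-%m-%d')
        except ValueError:
            continue
    raise ValueError('unrecognized created_at value: %r' % value)

print(normalize_created_at('Fri Jan 15 08:36:13 +0800 2021'))  # 2021-01-15
```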