luyijingxiu / douyin_comment_spider

Given a keyword, scrape the related Douyin videos and their first-level comments
43 stars · 5 forks

The code fails to scrape any data #1

Open evanlee2021 opened 1 year ago

evanlee2021 commented 1 year ago

Hello! I ran this on my computer and got the result shown here. Could you tell me what went wrong? [image]

LikeStrangersDo commented 1 year ago

I also ran it today. Everything worked up to the login page, but it failed at the step that stores the video URLs:

`def begin_search(browser: WebDriver, keyword: str, expect_search_result_num: int, publish_time: int, sort_type: int):
    req_url = f"{tik_tok_prefix_url}/search/{keyword}?publish_time={publish_time}&sort_type={sort_type}&source=tab_search&type=video"

    browser.get(req_url)
    time.sleep(2)
    spider_util.dy_login(browser)

    i = 1

    video_ur_list = []
    while i <= expect_search_result_num:

        # XPaths of the i-th search result and of the link inside it
        video_div_xpath = f'//*[@id="douyin-right-container"]/div[2]/div/div[3]/div[2]/ul/li[{i}]'
        video_url_info_xpath = f'//*[@id="douyin-right-container"]/div[2]/div/div[3]/div[2]/ul/li[{i}]/div/a'

        # Wait up to 30 seconds for the i-th result to appear
        WebDriverWait(browser, 30).until(
            lambda driver: spider_util.find_element_silent(driver, video_div_xpath) is not None)

        video = spider_util.find_element_silent(browser, video_div_xpath)
        if video is None:
            print(f"No video found, index: {i}")
            i = i + 1
            continue
        # Scroll the result into view so that later results get loaded
        browser.execute_script("arguments[0].scrollIntoView();", video)

        video_url_info = spider_util.find_element_silent(browser, video_url_info_xpath)
        if video_url_info is None:
            print(f"Failed to get the video link, index: {i}")
            i = i + 1
            continue
        print(video_url_info.get_attribute("href"))
        video_ur_list.append(video_url_info.get_attribute("href"))
        i = i + 1

    file_path = f"{file_save_path}/search/{keyword}"
    if not os.path.exists(file_path):
        os.makedirs(file_path)

    file_name = 'video_url_list.json'

    # Save the collected URLs as a JSON list
    with open(f"{file_path}/{file_name}", 'w', encoding='UTF-8') as file:
        file.write(json.dumps(video_ur_list, indent=3, ensure_ascii=False))
        file.close()
    browser.close()`

Error returned:

Traceback (most recent call last):
  File "/Users/malon/PycharmProjects/pythonProject/dy_comment_spider-master/dyspider.py", line 38, in <module>
    dy_search.save_searched_video_list_data(browser, "伦敦大学")
  File "/Users/malon/PycharmProjects/pythonProject/dy_comment_spider-master/dy_search.py", line 77, in save_searched_video_list_data
    video_list = json.loads(video_list_json)
  File "/Users/malon/opt/anaconda3/lib/python3.9/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/Users/malon/opt/anaconda3/lib/python3.9/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Users/malon/opt/anaconda3/lib/python3.9/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
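
For anyone hitting the same traceback: `Expecting value: line 1 column 1 (char 0)` is what `json.loads` raises when the string it receives contains no JSON value at the very start, for example when the string is empty. The thread does not show what `video_list_json` actually contained, so the following is only a sketch of a defensive loader, assuming the URL list is read back from a file such as `video_url_list.json`. The function name `load_video_urls` is hypothetical and is not part of the repository.

```python
import json
from pathlib import Path


def load_video_urls(path):
    """Return the URL list stored at `path`, or [] if it cannot be read as a JSON list.

    Hypothetical helper for illustration; not taken from douyin_comment_spider.
    """
    file = Path(path)
    if not file.exists():
        print(f"{file} does not exist; the search step may not have run")
        return []

    text = file.read_text(encoding="UTF-8").strip()
    if not text:
        # json.loads("") raises "Expecting value: line 1 column 1 (char 0)"
        print(f"{file} is empty; no video URLs were saved")
        return []

    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        print(f"Could not parse {file}: {err}")
        return []

    # The search step writes a JSON list, so reject anything else
    return data if isinstance(data, list) else []
```

If a loader like this reports an empty or missing file, the likely place to look is the search step, since the list is only written after the loop over the search results has finished.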