yuvenhol / dataharvest

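An AGI extension toolkit that supports AI search, web crawling, and data cleaning, ready to use out of the box. Supported sources: tavily, Tiangong (天工), Baidu Baike, Baijiahao, 360 Baike, Toutiao, WeChat Official Accounts, Sohu Baike, Tencent News, NetEase News, Mafengwo, Xiaohongshu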
35 stars, 4 forks

Timeout Error #2

Closed: AlexZhou1995 closed this issue 2 months ago

AlexZhou1995 commented 2 months ago
Traceback (most recent call last):
  File "/mnt/workspace/playground/AI_hackathon/test.py", line 13, in <module>
    docs = loop.run_until_complete(asyncio.gather(*tasks))
  File "/home/pai/envs/py3/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/mnt/workspace/playground/AI_hackathon/dataharvest/base.py", line 12, in a_crawl_and_purify
    doc = await self.spider.a_crawl(url)
  File "/mnt/workspace/playground/AI_hackathon/dataharvest/spider/spider.py", line 29, in a_crawl
    return await spider.a_crawl(url, config)
  File "/mnt/workspace/playground/AI_hackathon/dataharvest/spider/common_spider.py", line 50, in a_crawl
    await page.goto(url)
  File "/home/pai/envs/py3/lib/python3.10/site-packages/playwright/async_api/_generated.py", line 8657, in goto
    await self._impl_obj.goto(
  File "/home/pai/envs/py3/lib/python3.10/site-packages/playwright/_impl/_page.py", line 519, in goto
    return await self._main_frame.goto(**locals_to_params(locals()))
  File "/home/pai/envs/py3/lib/python3.10/site-packages/playwright/_impl/_frame.py", line 145, in goto
    await self._channel.send("goto", locals_to_params(locals()))
  File "/home/pai/envs/py3/lib/python3.10/site-packages/playwright/_impl/_connection.py", line 59, in send
    return await self._connection.wrap_api_call(
  File "/home/pai/envs/py3/lib/python3.10/site-packages/playwright/_impl/_connection.py", line 514, in wrap_api_call
    raise rewrite_error(error, f"{parsed_st['apiName']}: {error}") from None
playwright._impl._errors.TimeoutError: Page.goto: Timeout 30000ms exceeded.
Call log:
navigating to "https://www.voachinese.com/a/xi-urges-all-out-rescue-after-a-dike-breach-in-dongting-lake-20240706/7687420.html", waiting until "load"

yuvenhol commented 2 months ago

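voachinese is not reachable from inside the Great Firewall (mainland China), so `page.goto` waits for the "load" event until the 30000ms timeout expires. You need to make sure you have configured a working proxy.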
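
For reference, here is a minimal sketch of what routing the browser through a proxy can look like when Playwright is driven directly. It does not use dataharvest's own configuration, which may expose proxy settings differently. The helper names `build_proxy_settings` and `fetch_html`, the `HTTPS_PROXY` environment variable, and the 60 second timeout are assumptions made for illustration; `launch(proxy=...)`, `page.goto(timeout=..., wait_until=...)`, and `page.content()` are standard Playwright calls.

```python
import asyncio
import os
from typing import Optional
from urllib.parse import urlsplit


def build_proxy_settings(proxy_url: Optional[str]) -> Optional[dict]:
    """Convert a proxy URL into the dict shape Playwright accepts for launch(proxy=...).

    Hypothetical helper, not part of dataharvest or Playwright.
    """
    if not proxy_url:
        return None
    parts = urlsplit(proxy_url)
    if not parts.scheme or not parts.hostname:
        raise ValueError(f"not a valid proxy URL: {proxy_url!r}")
    server = f"{parts.scheme}://{parts.hostname}"
    if parts.port:
        server += f":{parts.port}"
    settings = {"server": server}
    # Credentials are optional; only include them when present in the URL.
    if parts.username:
        settings["username"] = parts.username
    if parts.password:
        settings["password"] = parts.password
    return settings


async def fetch_html(url: str, proxy_url: Optional[str] = None,
                     timeout_ms: int = 60_000) -> str:
    """Load a page through an optional proxy and return its HTML (illustrative only)."""
    # Imported lazily so build_proxy_settings works without Playwright installed.
    from playwright.async_api import async_playwright

    launch_kwargs = {}
    proxy = build_proxy_settings(proxy_url)
    if proxy is not None:
        launch_kwargs["proxy"] = proxy

    async with async_playwright() as p:
        browser = await p.chromium.launch(**launch_kwargs)
        try:
            page = await browser.new_page()
            # Waiting for "domcontentloaded" instead of the default "load" event
            # avoids blocking on slow third-party resources.
            await page.goto(url, timeout=timeout_ms, wait_until="domcontentloaded")
            return await page.content()
        finally:
            await browser.close()


if __name__ == "__main__":
    # Assumption: the proxy address is supplied via the HTTPS_PROXY variable,
    # for example http://127.0.0.1:7890 for a local proxy client.
    target = ("https://www.voachinese.com/a/xi-urges-all-out-rescue-after-a-dike-"
              "breach-in-dongting-lake-20240706/7687420.html")
    html = asyncio.run(fetch_html(target, os.environ.get("HTTPS_PROXY")))
    print(len(html))
```

If the proxy itself is slow, raising `timeout_ms` helps, but a longer timeout alone will not fix a site that is blocked on the network path.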