deedy5 / duckduckgo_search

Search for words, documents, images, videos, news, maps and text translation using the DuckDuckGo.com search engine. It can also download files and images to a local hard drive.
MIT License

Timeout error when trying image search from CLI or Python #130

Closed: drscotthawley closed this issue 7 months ago

drscotthawley commented 7 months ago

Describe the bug

A "TimeoutError: timed out" is raised when the library tries to open the socket connection for an HTTP request. This happens whether calling from Python or using the DDG CLI. Running pip install -U duckduckgo_search does not fix it.

Debug log

CLI version, using the suggested search from the README:

$ ddgs images -k "yuri kuklachev cat theatre" -m 500 -s off -d
Traceback (most recent call last):
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpcore/_exceptions.py", line 10, in map_exceptions
    yield
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpcore/_backends/sync.py", line 206, in connect_tcp
    sock = socket.create_connection(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/socket.py", line 851, in create_connection
    raise exceptions[0]
  File "/opt/homebrew/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/socket.py", line 836, in create_connection
    sock.connect(sa)
TimeoutError: timed out

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpx/_transports/default.py", line 66, in map_httpcore_exceptions
    yield
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpx/_transports/default.py", line 228, in handle_request
    resp = self._pool.handle_request(req)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpcore/_sync/connection_pool.py", line 268, in handle_request
    raise exc
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpcore/_sync/connection_pool.py", line 251, in handle_request
    response = connection.handle_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpcore/_sync/connection.py", line 99, in handle_request
    raise exc
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpcore/_sync/connection.py", line 76, in handle_request
    stream = self._connect(request)
             ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpcore/_sync/connection.py", line 124, in _connect
    stream = self._network_backend.connect_tcp(**kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpcore/_backends/sync.py", line 205, in connect_tcp
    with map_exceptions(exc_map):
  File "/opt/homebrew/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/contextlib.py", line 155, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.ConnectTimeout: timed out

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/myusername/envs/blog/bin/ddgs", line 8, in <module>
    sys.exit(cli())
             ^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/click/core.py", line 1719, in invoke
    rv.append(sub_ctx.command.invoke(sub_ctx))
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/duckduckgo_search/cli.py", line 249, in images
    for r in DDGS(proxies=proxy).images(
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/duckduckgo_search/duckduckgo_search.py", line 351, in images
    vqd = self._get_vqd(keywords)
          ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/duckduckgo_search/duckduckgo_search.py", line 59, in _get_vqd
    resp = self._get_url("POST", "https://duckduckgo.com", data={"q": keywords})
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/duckduckgo_search/duckduckgo_search.py", line 54, in _get_url
    raise ex
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/duckduckgo_search/duckduckgo_search.py", line 45, in _get_url
    resp = self._client.request(method, url, follow_redirects=True, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpx/_client.py", line 814, in request
    return self.send(request, auth=auth, follow_redirects=follow_redirects)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpx/_client.py", line 901, in send
    response = self._send_handling_auth(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpx/_client.py", line 929, in _send_handling_auth
    response = self._send_handling_redirects(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpx/_client.py", line 966, in _send_handling_redirects
    response = self._send_single_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpx/_client.py", line 1002, in _send_single_request
    response = transport.handle_request(request)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpx/_transports/default.py", line 227, in handle_request
    with map_httpcore_exceptions():
  File "/opt/homebrew/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/contextlib.py", line 155, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/Users/myusername/envs/blog/lib/python3.11/site-packages/httpx/_transports/default.py", line 83, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ConnectTimeout: timed out

Python version, calling search_results = ddgs.images(keywords="dog images"):

---------------------------------------------------------------------------
TimeoutError                              Traceback (most recent call last)
File ~/envs/blog/lib/python3.11/site-packages/httpcore/_exceptions.py:10, in map_exceptions(map)
      9 try:
---> 10     yield
     11 except Exception as exc:  # noqa: PIE786

File ~/envs/blog/lib/python3.11/site-packages/httpcore/_backends/sync.py:206, in SyncBackend.connect_tcp(self, host, port, timeout, local_address, socket_options)
    205 with map_exceptions(exc_map):
--> 206     sock = socket.create_connection(
    207         address,
    208         timeout,
    209         source_address=source_address,
    210     )
    211     for option in socket_options:

File /opt/homebrew/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/socket.py:851, in create_connection(address, timeout, source_address, all_errors)
    850 if not all_errors:
--> 851     raise exceptions[0]
    852 raise ExceptionGroup("create_connection failed", exceptions)

File /opt/homebrew/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/socket.py:836, in create_connection(address, timeout, source_address, all_errors)
    835     sock.bind(source_address)
--> 836 sock.connect(sa)
    837 # Break explicitly a reference cycle

TimeoutError: timed out

The above exception was the direct cause of the following exception:

ConnectTimeout                            Traceback (most recent call last)
File ~/envs/blog/lib/python3.11/site-packages/httpx/_transports/default.py:66, in map_httpcore_exceptions()
     65 try:
---> 66     yield
     67 except Exception as exc:

File ~/envs/blog/lib/python3.11/site-packages/httpx/_transports/default.py:228, in HTTPTransport.handle_request(self, request)
    227 with map_httpcore_exceptions():
--> 228     resp = self._pool.handle_request(req)
    230 assert isinstance(resp.stream, typing.Iterable)

File ~/envs/blog/lib/python3.11/site-packages/httpcore/_sync/connection_pool.py:268, in ConnectionPool.handle_request(self, request)
    267         self.response_closed(status)
--> 268     raise exc
    269 else:

File ~/envs/blog/lib/python3.11/site-packages/httpcore/_sync/connection_pool.py:251, in ConnectionPool.handle_request(self, request)
    250 try:
--> 251     response = connection.handle_request(request)
    252 except ConnectionNotAvailable:
    253     # The ConnectionNotAvailable exception is a special case, that
    254     # indicates we need to retry the request on a new connection.
   (...)
    258     # might end up as an HTTP/2 connection, but which actually ends
    259     # up as HTTP/1.1.

File ~/envs/blog/lib/python3.11/site-packages/httpcore/_sync/connection.py:99, in HTTPConnection.handle_request(self, request)
     98         self._connect_failed = True
---> 99         raise exc
    100 elif not self._connection.is_available():

File ~/envs/blog/lib/python3.11/site-packages/httpcore/_sync/connection.py:76, in HTTPConnection.handle_request(self, request)
     75 try:
---> 76     stream = self._connect(request)
     78     ssl_object = stream.get_extra_info("ssl_object")

File ~/envs/blog/lib/python3.11/site-packages/httpcore/_sync/connection.py:124, in HTTPConnection._connect(self, request)
    123 with Trace("connect_tcp", logger, request, kwargs) as trace:
--> 124     stream = self._network_backend.connect_tcp(**kwargs)
    125     trace.return_value = stream

File ~/envs/blog/lib/python3.11/site-packages/httpcore/_backends/sync.py:205, in SyncBackend.connect_tcp(self, host, port, timeout, local_address, socket_options)
    200 exc_map: ExceptionMapping = {
    201     socket.timeout: ConnectTimeout,
    202     OSError: ConnectError,
    203 }
--> 205 with map_exceptions(exc_map):
    206     sock = socket.create_connection(
    207         address,
    208         timeout,
    209         source_address=source_address,
    210     )

File /opt/homebrew/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/contextlib.py:155, in _GeneratorContextManager.__exit__(self, typ, value, traceback)
    154 try:
--> 155     self.gen.throw(typ, value, traceback)
    156 except StopIteration as exc:
    157     # Suppress StopIteration *unless* it's the same exception that
    158     # was passed to throw().  This prevents a StopIteration
    159     # raised inside the "with" statement from being suppressed.

File ~/envs/blog/lib/python3.11/site-packages/httpcore/_exceptions.py:14, in map_exceptions(map)
     13     if isinstance(exc, from_exc):
---> 14         raise to_exc(exc) from exc
     15 raise

ConnectTimeout: timed out

The above exception was the direct cause of the following exception:

ConnectTimeout                            Traceback (most recent call last)
Cell In[17], line 15
     12         return image_urls
     14 # example usage:
---> 15 urls = search_images("dog images", max_images=10)
     16 urls

Cell In[17], line 10, in search_images(term, max_images)
      6 with DDGS() as ddgs:
      7     # generator which yields dicts with:
      8     # {'title','image','thumbnail','url','height','width','source'}
      9     search_results = ddgs.images(keywords=term) # returns a generator
---> 10     image_urls = [next(search_results).get("image") for _ in range(max_images)]
     11     # convert to L (functionally extended list class from fastai)
     12     return image_urls

Cell In[17], line 10, in <listcomp>(.0)
      6 with DDGS() as ddgs:
      7     # generator which yields dicts with:
      8     # {'title','image','thumbnail','url','height','width','source'}
      9     search_results = ddgs.images(keywords=term) # returns a generator
---> 10     image_urls = [next(search_results).get("image") for _ in range(max_images)]
     11     # convert to L (functionally extended list class from fastai)
     12     return image_urls

File ~/envs/blog/lib/python3.11/site-packages/duckduckgo_search/duckduckgo_search.py:351, in DDGS.images(self, keywords, region, safesearch, timelimit, size, color, type_image, layout, license_image, max_results)
    326 """DuckDuckGo images search. Query params: https://duckduckgo.com/params
    327 
    328 Args:
   (...)
    347 
    348 """
    349 assert keywords, "keywords is mandatory"
--> 351 vqd = self._get_vqd(keywords)
    352 assert vqd, "error in getting vqd"
    354 safesearch_base = {"on": 1, "moderate": 1, "off": -1}

File ~/envs/blog/lib/python3.11/site-packages/duckduckgo_search/duckduckgo_search.py:59, in DDGS._get_vqd(self, keywords)
     57 def _get_vqd(self, keywords: str) -> Optional[str]:
     58     """Get vqd value for a search query."""
---> 59     resp = self._get_url("POST", "https://duckduckgo.com/", data={"q": keywords})
     60     if resp:
     61         return _extract_vqd(resp.content)

File ~/envs/blog/lib/python3.11/site-packages/duckduckgo_search/duckduckgo_search.py:54, in DDGS._get_url(self, method, url, **kwargs)
     52     logger.warning(f"_get_url() {url} {type(ex).__name__} {ex}")
     53     if i >= 2 or "418" in str(ex):
---> 54         raise ex
     55 sleep(3)

File ~/envs/blog/lib/python3.11/site-packages/duckduckgo_search/duckduckgo_search.py:45, in DDGS._get_url(self, method, url, **kwargs)
     43 for i in range(3):
     44     try:
---> 45         resp = self._client.request(method, url, follow_redirects=True, **kwargs)
     46         if _is_500_in_url(str(resp.url)) or resp.status_code == 202:
     47             raise httpx._exceptions.HTTPError("")

File ~/envs/blog/lib/python3.11/site-packages/httpx/_client.py:814, in Client.request(self, method, url, content, data, files, json, params, headers, cookies, auth, follow_redirects, timeout, extensions)
    799     warnings.warn(message, DeprecationWarning)
    801 request = self.build_request(
    802     method=method,
    803     url=url,
   (...)
    812     extensions=extensions,
    813 )
--> 814 return self.send(request, auth=auth, follow_redirects=follow_redirects)

File ~/envs/blog/lib/python3.11/site-packages/httpx/_client.py:901, in Client.send(self, request, stream, auth, follow_redirects)
    893 follow_redirects = (
    894     self.follow_redirects
    895     if isinstance(follow_redirects, UseClientDefault)
    896     else follow_redirects
    897 )
    899 auth = self._build_request_auth(request, auth)
--> 901 response = self._send_handling_auth(
    902     request,
    903     auth=auth,
    904     follow_redirects=follow_redirects,
    905     history=[],
    906 )
    907 try:
    908     if not stream:

File ~/envs/blog/lib/python3.11/site-packages/httpx/_client.py:929, in Client._send_handling_auth(self, request, auth, follow_redirects, history)
    926 request = next(auth_flow)
    928 while True:
--> 929     response = self._send_handling_redirects(
    930         request,
    931         follow_redirects=follow_redirects,
    932         history=history,
    933     )
    934     try:
    935         try:

File ~/envs/blog/lib/python3.11/site-packages/httpx/_client.py:966, in Client._send_handling_redirects(self, request, follow_redirects, history)
    963 for hook in self._event_hooks["request"]:
    964     hook(request)
--> 966 response = self._send_single_request(request)
    967 try:
    968     for hook in self._event_hooks["response"]:

File ~/envs/blog/lib/python3.11/site-packages/httpx/_client.py:1002, in Client._send_single_request(self, request)
    997     raise RuntimeError(
    998         "Attempted to send an async request with a sync Client instance."
    999     )
   1001 with request_context(request=request):
-> 1002     response = transport.handle_request(request)
   1004 assert isinstance(response.stream, SyncByteStream)
   1006 response.request = request

File ~/envs/blog/lib/python3.11/site-packages/httpx/_transports/default.py:227, in HTTPTransport.handle_request(self, request)
    213 assert isinstance(request.stream, SyncByteStream)
    215 req = httpcore.Request(
    216     method=request.method,
    217     url=httpcore.URL(
   (...)
    225     extensions=request.extensions,
    226 )
--> 227 with map_httpcore_exceptions():
    228     resp = self._pool.handle_request(req)
    230 assert isinstance(resp.stream, typing.Iterable)

File /opt/homebrew/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/contextlib.py:155, in _GeneratorContextManager.__exit__(self, typ, value, traceback)
    153     value = typ()
    154 try:
--> 155     self.gen.throw(typ, value, traceback)
    156 except StopIteration as exc:
    157     # Suppress StopIteration *unless* it's the same exception that
    158     # was passed to throw().  This prevents a StopIteration
    159     # raised inside the "with" statement from being suppressed.
    160     return exc is not value

File ~/envs/blog/lib/python3.11/site-packages/httpx/_transports/default.py:83, in map_httpcore_exceptions()
     80     raise
     82 message = str(exc)
---> 83 raise mapped_exc(message) from exc

ConnectTimeout: timed out
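
The `_get_url` frames above appear to show the library retrying three times with a fixed `sleep(3)` before re-raising. For callers who want their own retry policy around a whole search, here is a minimal sketch with exponential backoff. The helper `call_with_retries` and all of its parameters are hypothetical names for illustration; they are not part of duckduckgo_search or httpx.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (TimeoutError,),
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func(); if it raises one of retry_on, wait and try again.

    Hypothetical helper: the delay doubles after each failed attempt
    (base_delay, 2 * base_delay, 4 * base_delay, ...). The last
    failure is re-raised unchanged so the caller sees the real error.
    """
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

With the notebook code from this report, a call might look like `call_with_retries(lambda: search_images("dog images", max_images=10), retry_on=(httpx.ConnectTimeout,))`. The exception type has to be passed explicitly because `httpx.ConnectTimeout` is not a subclass of the built-in `TimeoutError`. Retrying only helps with transient failures; it will not help while the remote site is unreachable.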

Screenshots

[Screenshot 2023-11-09 at 2:43:03 PM]

Specify this information

$ env
MANPATH=/opt/homebrew/share/man:
TERM_PROGRAM=Apple_Terminal
GEM_HOME=/Users/shawley/gems
SHELL=/bin/bash
TERM=xterm-256color
KMP_DUPLICATE_LIB_OK=TRUE
HOMEBREW_REPOSITORY=/opt/homebrew
TMPDIR=/var/folders/5s/dkk8t0jn5fv6df9f68j9xddr0000gn/T/
LIBRARY_PATH=/opt/homebrew/lib
PYTHONUNBUFFERED=1
TERM_PROGRAM_VERSION=447
OLDPWD=/Users/shawley/github/blog
TERM_SESSION_ID=87E738C7-EE2E-4ADA-8D3A-93A29BF86763
USER=shawley
CPATH=/opt/homebrew/include
SSH_AUTH_SOCK=/private/tmp/com.apple.launchd.J7IfhT2guA/Listeners
PYTORCH_ENABLE_MPS_FALLBACK=1
WINEARCH=win32
BASH_SILENCE_DEPRECATION_WARNING=1
VIRTUAL_ENV=/Users/shawley/envs/blog
LSCOLORS=gxfxcxdxbxegedabagacad
PATH=/Users/shawley/envs/blog/bin:/Users/shawley/.cargo/bin:/Users/shawley/gems/bin:/opt/homebrew/bin:/usr/local/bin:/System/Cryptexes/App/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/TeX/texbin:/opt/X11/bin:/Applications/quarto/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/local/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/bin:/var/run/com.apple.security.cryptexd/codex.system/bootstrap/usr/appleinternal/bin
LaunchInstanceID=21AB5167-758E-4CC1-AC72-202BB19C92C4
__CFBundleIdentifier=com.apple.Terminal
PWD=/Users/shawley/github/blog/posts
LANG=en_US.UTF-8
XPC_FLAGS=0x0
PS1=(blog) \[\e]0;\u@\h: \w\a\]${debian_chroot:+($debian_chroot)}\[\033[01;32m\]\u@\h\[\033[00m\]:\[\033[01;34m\]\w\[\033[00m\]\$ 
CONDA_BASE=/Users/shawley/opt/anaconda3
XPC_SERVICE_NAME=0
SHLVL=1
HOME=/Users/shawley
HOMEBREW_PREFIX=/opt/homebrew
LOGNAME=shawley
INFOPATH=/opt/homebrew/share/info:
HOMEBREW_CELLAR=/opt/homebrew/Cellar
DISPLAY=/private/tmp/com.apple.launchd.00NKVjlcg4/org.xquartz:0
SECURITYSESSIONID=186b3
VIRTUAL_ENV_PROMPT=(blog) 
_=/usr/bin/env

drscotthawley commented 7 months ago

Ah, it seems DDG itself is down right now: I can't connect to the website. (All the other, non-DDG URLs I try work fine, so it's not my network.)

[Screenshot 2023-11-09 at 3:03:33 PM]

Will retry code in a few hours 🤞
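
The manual check described in this comment (DDG unreachable, other sites fine) can be approximated in code before digging through a long traceback. This is a minimal sketch using only the standard library. The function name `can_connect` and its defaults (port 443, 5-second timeout) are assumptions made for illustration.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout.

    Hypothetical helper. It only tests that the TCP handshake succeeds;
    it does not send an HTTP request or validate TLS.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # OSError covers timeouts, refused connections and DNS failures.
        return False
```

Comparing `can_connect("duckduckgo.com")` with the result for some unrelated host gives a rough signal: if only the first is False, the problem is more likely on the remote side than in the local network or in the library.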

deedy5 commented 7 months ago

Thank you for finding the problem! Fixed in v3.9.5