searx / searx

Privacy-respecting metasearch engine
https://searx.github.io/searx/
GNU Affero General Public License v3.0

Google Images and Bing Images no longer work #3528

Open, opened by MatthK 1 year ago

MatthK commented 1 year ago

Version of Searx (commit number if you are on the master branch, and whether you forked Searx): Powered by searx - 1.1.0-69-75b859d2

How did you install Searx? I installed it as a Docker container.

What happened? When I try to run an image search, I get the red warning message on the right:

Engines cannot retrieve results:
google images (unexpected crash), bing images (HTTP error), flickr (unexpected crash)

How To Reproduce: Do an image search.

Expected behavior: I expect to get images from Google, Bing and Flickr as well.

Screenshots & Logs: [screenshot]

ERROR:searx.search.processor.online:engine bing images : requests exception(search duration : 0.005860328674316406 s, timeout: 4.0 s) : HTTPSConnectionPool(host='www.bing.com', port=443): Max retries exceeded with url: /images/search?q=polar+bear&count=28&first=1&tsc=ImageHoverTitle (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f522fc4cc10>: Failed to establish a new connection: [Errno 111] Connection refused'))
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
  File "/usr/lib/python3.9/site-packages/urllib3/util/connection.py", line 96, in create_connection
    raise err
  File "/usr/lib/python3.9/site-packages/urllib3/util/connection.py", line 86, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/urllib3/connectionpool.py", line 699, in urlopen
    httplib_response = self._make_request(
  File "/usr/lib/python3.9/site-packages/urllib3/connectionpool.py", line 382, in _make_request
    self._validate_conn(conn)
  File "/usr/lib/python3.9/site-packages/urllib3/connectionpool.py", line 1010, in _validate_conn
    conn.connect()
  File "/usr/lib/python3.9/site-packages/urllib3/connection.py", line 358, in connect
    conn = self._new_conn()
  File "/usr/lib/python3.9/site-packages/urllib3/connection.py", line 186, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f522fc4cc10>: Failed to establish a new connection: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/requests/adapters.py", line 489, in send
    resp = conn.urlopen(
  File "/usr/lib/python3.9/site-packages/urllib3/connectionpool.py", line 755, in urlopen
    retries = retries.increment(
  File "/usr/lib/python3.9/site-packages/urllib3/util/retry.py", line 574, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='www.bing.com', port=443): Max retries exceeded with url: /images/search?q=polar+bear&count=28&first=1&tsc=ImageHoverTitle (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f522fc4cc10>: Failed to establish a new connection: [Errno 111] Connection refused'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/searx/searx/search/processors/online.py", line 144, in search
    search_results = self._search_basic(query, params)
  File "/usr/local/searx/searx/search/processors/online.py", line 124, in _search_basic
    response = self._send_http_request(params)
  File "/usr/local/searx/searx/search/processors/online.py", line 96, in _send_http_request
    response = req(params['url'], **request_args)
  File "/usr/local/searx/searx/poolrequests.py", line 209, in get
    return request('get', url, **kwargs)
  File "/usr/local/searx/searx/poolrequests.py", line 181, in request
    response = session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python3.9/site-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3.9/site-packages/requests/sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3.9/site-packages/requests/adapters.py", line 565, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='www.bing.com', port=443): Max retries exceeded with url: /images/search?q=polar+bear&count=28&first=1&tsc=ImageHoverTitle (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f522fc4cc10>: Failed to establish a new connection: [Errno 111] Connection refused'))
ERROR:searx.search.processor.online:engine google images : exception : list index out of range
Traceback (most recent call last):
  File "/usr/local/searx/searx/search/processors/online.py", line 144, in search
    search_results = self._search_basic(query, params)
  File "/usr/local/searx/searx/search/processors/online.py", line 128, in _search_basic
    return self.engine.response(response)
  File "/usr/local/searx/searx/engines/google_images.py", line 193, in response
    pub_source = extract_text(pub_nodes[1])
IndexError: list index out of range
ERROR:searx.search.processor.online:engine flickr : exception : invalid literal for int() with base 10: 'data'
Traceback (most recent call last):
  File "/usr/local/searx/searx/search/processors/online.py", line 144, in search
    search_results = self._search_basic(query, params)
  File "/usr/local/searx/searx/search/processors/online.py", line 128, in _search_basic
    return self.engine.response(response)
  File "/usr/local/searx/searx/engines/flickr_noapi.py", line 79, in response
    photo = model_export['main'][index[0]][int(index[1])][index[2]][index[3]][int(index[4])]
ValueError: invalid literal for int() with base 10: 'data'

Additional context: I get similar HTTP errors from Brave, Gigablast and Bing when doing normal searches.
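
The Bing traceback above ends in `ConnectionRefusedError: [Errno 111]` after about 0.006 s, so that request fails at the TCP level before any response is parsed. A minimal sketch for checking whether a host is reachable from the same environment (for example, from inside the container) is below. The helper name `can_connect` is made up for illustration and is not part of searx; it only uses the standard library.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 4.0) -> tuple[bool, str]:
    """Try a plain TCP connection and report whether it succeeded."""
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True, "ok"
    except OSError as exc:
        # ConnectionRefusedError (Errno 111), timeouts and DNS failures
        # are all subclasses of OSError.
        return False, f"{type(exc).__name__}: {exc}"
```

Calling `can_connect("www.bing.com")` from a Python shell in the container should show whether the refusal also happens outside searx. If it does, the cause is more likely the network setup (firewall, DNS, proxy) than the engine code, though that is only a guess from the log.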

MatthK commented 1 year ago

Any ideas? Anyone?

tauceti82 commented 1 year ago

Same issue here:

ERROR:searx.search.processor.online:engine flickr : exception : invalid literal for int() with base 10: 'data'
Traceback (most recent call last):
  File "/usr/local/searx/searx-src/searx/search/processors/online.py", line 144, in search
    search_results = self._search_basic(query, params)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/searx/searx-src/searx/search/processors/online.py", line 128, in _search_basic
    return self.engine.response(response)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/searx/searx-src/searx/engines/flickr_noapi.py", line 79, in response
    photo = model_export['main'][index[0]][int(index[1])][index[2]][index[3]][int(index[4])]
                                                                              ^^^^^^^^^^^^^
ValueError: invalid literal for int() with base 10: 'data'
ERROR:searx.search.processor.online:engine google images : exception : list index out of range
Traceback (most recent call last):
  File "/usr/local/searx/searx-src/searx/search/processors/online.py", line 144, in search
    search_results = self._search_basic(query, params)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/searx/searx-src/searx/search/processors/online.py", line 128, in _search_basic
    return self.engine.response(response)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/searx/searx-src/searx/engines/google_images.py", line 193, in response
    pub_source = extract_text(pub_nodes[1])
                              ~~~~~~~~~^^^
IndexError: list index out of range
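
The flickr traceback above fails on `int(index[1])` because that path element is the string `'data'` rather than a number. The sketch below shows one tolerant way to walk such a path: a key is converted to `int` only when the current container is a list, and the walk returns `None` instead of raising. The function name `walk` and the sample data are invented for illustration; the real structure of Flickr's `model_export` is not shown in this thread, so this is not a fix for `flickr_noapi.py`.

```python
from typing import Any, Sequence


def walk(container: Any, path: Sequence[str]) -> Any:
    """Follow path through nested dicts and lists; return None if it does not resolve."""
    current = container
    for key in path:
        if isinstance(current, list):
            # Lists need a numeric position that is in range.
            if not key.isdigit() or int(key) >= len(current):
                return None
            current = current[int(key)]
        elif isinstance(current, dict):
            # Dicts are looked up by the key as given, e.g. 'data'.
            if key not in current:
                return None
            current = current[key]
        else:
            # Reached a leaf value before the path was used up.
            return None
    return current


# Hypothetical data, only to exercise the helper.
sample = {"photos": {"data": [{"id": "1"}]}}
found = walk(sample, ["photos", "data", "0"])
missing = walk(sample, ["photos", "0", "data"])
```

Here `found` is `{"id": "1"}` and `missing` is `None`, so a caller could skip the result instead of crashing the whole engine.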

cancrizans commented 1 year ago

Identical repro here on 1.1.0-69-75b859d2. For Google Images, I think recent layout changes may have broken the parsing in google_images.py, but the code is a bit too dense for me to understand exactly what's going on.
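
For what it's worth, the google images traceback points at `pub_nodes[1]`, which raises `IndexError` when the list has fewer than two entries. A minimal sketch of guarding that kind of lookup is below, assuming it is acceptable to skip a result that lacks the expected node. `item_or_none` and `collect_sources` are hypothetical names, and plain lists of strings stand in for the parsed nodes; none of this is taken from `google_images.py`.

```python
from typing import List, Optional, Sequence


def item_or_none(nodes: Sequence[str], position: int) -> Optional[str]:
    """Return nodes[position], or None when the sequence is too short."""
    return nodes[position] if 0 <= position < len(nodes) else None


def collect_sources(rows: Sequence[Sequence[str]]) -> List[str]:
    """Collect the second node of each row, skipping rows that lack one."""
    sources = []
    for pub_nodes in rows:
        node = item_or_none(pub_nodes, 1)
        if node is None:
            # One malformed result should not abort the whole engine response.
            continue
        sources.append(node.strip())
    return sources


# Stand-in data: the second and third rows are missing the source node.
sources = collect_sources([["title 1", " example.org "], ["title 2"], []])
```

With the stand-in data, `sources` ends up as `["example.org"]`. This only hides the symptom; if the page layout changed, the selectors that build `pub_nodes` would still need updating.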