goldsmith / Wikipedia

A Pythonic wrapper for the Wikipedia API
https://wikipedia.readthedocs.org/
MIT License

ConnectionError: HTTPConnectionPool(host='en.wikipedia.org', port=80): Max retries exceeded with url: /w/api.php?list=search&srprop=&srlimit=10&limit=10&srsearch=Barack&format=json&action=query (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x0000023606F8AB38>: Failed to establish a new connection... #303

Open YixinNJU opened 2 years ago

YixinNJU commented 2 years ago

I cannot use almost any function of the wikipedia module; it always returns a connection error.

My original code is: `wikipedia.search("Barack")`

It gives me the error:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
        wikipedia.search("Barack")
      File "F:\Anaconda3\anaconda3\lib\site-packages\wikipedia\util.py", line 28, in __call__
        ret = self._cache[key] = self.fn(*args, **kwargs)
      File "F:\Anaconda3\anaconda3\lib\site-packages\wikipedia\wikipedia.py", line 103, in search
        raw_results = _wiki_request(search_params)
      File "F:\Anaconda3\anaconda3\lib\site-packages\wikipedia\wikipedia.py", line 737, in _wiki_request
        r = requests.get(API_URL, params=params, headers=headers)
      File "F:\Anaconda3\anaconda3\lib\site-packages\requests\api.py", line 75, in get
        return request('get', url, params=params, **kwargs)
      File "F:\Anaconda3\anaconda3\lib\site-packages\requests\api.py", line 60, in request
        return session.request(method=method, url=url, **kwargs)
      File "F:\Anaconda3\anaconda3\lib\site-packages\requests\sessions.py", line 533, in request
        resp = self.send(prep, **send_kwargs)
      File "F:\Anaconda3\anaconda3\lib\site-packages\requests\sessions.py", line 646, in send
        r = adapter.send(request, **kwargs)
      File "F:\Anaconda3\anaconda3\lib\site-packages\requests\adapters.py", line 516, in send
        raise ConnectionError(e, request=request)
    ConnectionError: HTTPConnectionPool(host='en.wikipedia.org', port=80): Max retries exceeded with url: /w/api.php?list=search&srprop=&srlimit=10&limit=10&srsearch=Barack&format=json&action=query (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x0000023606F8AB38>: Failed to establish a new connection: [WinError 10060] 由于连接方在一段时间后没有正确答复或连接的主机没有反应,连接尝试失败。'))

(The WinError 10060 message translates as: a connection attempt failed because the connected party did not properly respond after a period of time, or the connected host failed to respond.)

I am new to scraping data, and I did not find any direct answer to this problem. Do I need to modify any of the scripts in the traceback? I appreciate your help!

hawbox commented 2 years ago

Hi YixinNJU: I'm from China too and have encountered this problem. It is caused by the Chinese government's block of the Wikipedia website, so please use a proxy to access data from Wikipedia. Second, if you are using a proxy, downgrade urllib3 to version 1.25.11 (just run pip install urllib3==1.25.11 to reinstall the module automatically) to avoid a ProxyError. The urllib3 bundled with the latest Python packages added support for HTTPS proxies, but it needs the system preferences to be configured correctly in order to work. Unfortunately I still can't figure out how to make the preferences work well with this module. I also found an article related to the problem, in Chinese, for your reference (https://www.cnblogs.com/davyyy/p/14388623.html). I need somebody to suggest a solution. Lastly, sorry for my poor English; I'm trying to improve it.
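In the meantime, a minimal sketch of the proxy route, using the standard proxy environment variables that requests reads (the address 127.0.0.1:7890 is only an example; substitute your own proxy):

    # Sketch: route the wikipedia module's traffic through a local proxy by
    # setting the proxy environment variables that requests honors.
    # 127.0.0.1:7890 is an example address -- substitute your own proxy.
    import os

    os.environ['HTTP_PROXY'] = 'http://127.0.0.1:7890'
    os.environ['HTTPS_PROXY'] = 'http://127.0.0.1:7890'

    import wikipedia

    print(wikipedia.search('Barack'))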

jxrloveyou commented 2 years ago

@hawbox You can change the URL in the library to 'https://en.volupedia.org/w/api.php' and then the calls work, though the responses are rather slow.
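If you'd rather not edit the installed wikipedia.py, one sketch is to override the URL at runtime; this assumes API_URL is the module-level global that `_wiki_request` reads (as the traceback above suggests), and note that `wikipedia.set_lang()` would reset it:

    # Sketch: point the library at the mirror without editing site-packages.
    # Assumes API_URL is the module-level global used by _wiki_request.
    import wikipedia

    wikipedia.wikipedia.API_URL = 'https://en.volupedia.org/w/api.php'
    print(wikipedia.search('Barack'))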

sevenzard commented 2 years ago

> Hi YixinNJU: I'm from China too and have encountered this problem. […]

Hello, I'm also from China and have met this problem too. Thank you very much for your warm reply. I used a global HTTPS proxy and downgraded urllib3 to 1.25.11 as you said, but I still can't solve the problem. Have you solved it?

sevenzard commented 2 years ago

> Hi YixinNJU: I'm from China too and have encountered this problem. […]

Hello, I have solved this error with your help. I added a `proxies` dict to the `_wiki_request` function in wikipedia.py, just before the request is sent, and passed it to the call, which becomes `r = requests.get(API_URL, params=params, headers=headers, proxies=proxies)` (sketched below). Thank you for your warm answer again!
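In context, the patched lines at the end of `_wiki_request` look like this (a sketch; 127.0.0.1:7890 is my local proxy address, so substitute your own):

    # Added inside _wiki_request in wikipedia.py, before the request is sent.
    # 127.0.0.1:7890 is a local proxy address from this thread -- use your own.
    proxies = {
        'http': 'http://127.0.0.1:7890',
        'https': 'http://127.0.0.1:7890',  # https traffic also goes through the http proxy
    }
    r = requests.get(API_URL, params=params, headers=headers, proxies=proxies)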

catbears commented 2 years ago

Hello, I'm behind a company proxy and worked around the same problem via the request method in requests' api.py. You can add

    verify=False

like here:

    with sessions.Session() as session:
        # verify=False turns off TLS certificate verification (insecure,
        # but gets past a proxy that intercepts HTTPS)
        return session.request(method=method, url=url, verify=False, **kwargs)

or, what I do, is add the proxy's certificate so that it lets me out (the page is not blocked, but we need to go through the proxy):

    with sessions.Session() as session:
        # point verify at the proxy's CA certificate file instead
        return session.request(method=method, url=url, verify='/path/to/certfile', **kwargs)

You could place the arguments at other points, but I chose this one; that way the change covers my whole virtual environment.
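A less invasive alternative, if the proxy's CA certificate is available: point requests at the certificate through the REQUESTS_CA_BUNDLE environment variable instead of editing api.py (a sketch; the path is a placeholder for your cert file):

    # Sketch: trust the company proxy's CA certificate without touching
    # site-packages. '/path/to/certfile' is a placeholder.
    import os

    os.environ['REQUESTS_CA_BUNDLE'] = '/path/to/certfile'

    import wikipedia

    print(wikipedia.search('Barack'))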

neo-dqy commented 1 year ago

> Hi YixinNJU: I'm from China too and have encountered this problem. […]

> Hello, I have solved this error with your help. […]

Hello, I followed the steps above but still cannot connect to the wiki site. I am working on a lab server and have exported a global http proxy; I also tried downgrading urllib3 and modifying the '_wiki_request' function, but none of these worked. Any advice would be appreciated!

YANGCHEN205 commented 2 months ago

> Hi YixinNJU: I'm from China too and have encountered this problem. […]

> Hello, I have solved this error with your help. […]

> Hello, I followed the steps above but still cannot connect to the wiki site. […]

Hello, did you ever solve this?