miyakogi / pyppeteer

Headless chrome/chromium automation library (unofficial port of puppeteer)

SSL error when downloading chromium #258

Open fjdksl546 opened 4 years ago

fjdksl546 commented 4 years ago

Hi, I have a problem. I'm using an Ubuntu server on AWS EC2, and on this server I want to run a program that crawls a JavaScript-rendered website.

How can I fix this error?

[W:pyppeteer.chromium_downloader] start chromium download. Download may take a few minutes.
Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/contrib/pyopenssl.py", line 485, in wrap_socket
    cnx.do_handshake()
  File "/usr/lib/python3/dist-packages/OpenSSL/SSL.py", line 1808, in do_handshake
    self._raise_ssl_error(self._ssl, result)
  File "/usr/lib/python3/dist-packages/OpenSSL/SSL.py", line 1548, in _raise_ssl_error
    _raise_current_error()
  File "/usr/lib/python3/dist-packages/OpenSSL/_util.py", line 54, in exception_from_error_queue
    raise exception_type(errors)
OpenSSL.SSL.Error: [('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 672, in urlopen
    chunked=chunked,
  File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 376, in _make_request
    self._validate_conn(conn)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 994, in _validate_conn
    conn.connect()
  File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/connection.py", line 394, in connect
    ssl_context=context,
  File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/util/ssl_.py", line 370, in ssl_wrap_socket
    return context.wrap_socket(sock, server_hostname=server_hostname)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/contrib/pyopenssl.py", line 491, in wrap_socket
    raise ssl.SSLError("bad handshake: %r" % e)
ssl.SSLError: ("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')],)",)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "chatbot_dormitary_menu2_crawl.py", line 10, in <module>
    resp.html.render()
  File "/home/ubuntu/.local/lib/python3.6/site-packages/requests_html.py", line 586, in render
    self.browser = self.session.browser # Automatically create a event loop and browser
  File "/home/ubuntu/.local/lib/python3.6/site-packages/requests_html.py", line 730, in browser
    self._browser = self.loop.run_until_complete(super().browser)
  File "/usr/lib/python3.6/asyncio/base_events.py", line 484, in run_until_complete
    return future.result()
  File "/home/ubuntu/.local/lib/python3.6/site-packages/requests_html.py", line 714, in browser
    self._browser = await pyppeteer.launch(ignoreHTTPSErrors=not(self.verify), headless=True, args=self.__browser_args)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/pyppeteer/launcher.py", line 311, in launch
    return await Launcher(options, **kwargs).launch()
  File "/home/ubuntu/.local/lib/python3.6/site-packages/pyppeteer/launcher.py", line 125, in __init__
    download_chromium()
  File "/home/ubuntu/.local/lib/python3.6/site-packages/pyppeteer/chromium_downloader.py", line 136, in download_chromium
    extract_zip(download_zip(get_url()), DOWNLOADS_FOLDER / REVISION)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/pyppeteer/chromium_downloader.py", line 78, in download_zip
    data = http.request('GET', url, preload_content=False)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/request.py", line 76, in request
    method, url, fields=fields, headers=headers, **urlopen_kw
  File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/request.py", line 97, in request_encode_url
    return self.urlopen(method, url, **extra_kw)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/poolmanager.py", line 330, in urlopen
    response = conn.urlopen(method, u.request_uri, **kw)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 762, in urlopen
    **response_kw
  File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 762, in urlopen
    **response_kw
  File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 762, in urlopen
    **response_kw
  File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 720, in urlopen
    method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
  File "/home/ubuntu/.local/lib/python3.6/site-packages/urllib3/util/retry.py", line 436, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='storage.googleapis.com', port=443): Max retries exceeded with url: /chromium-browser-snapshots/Linux_x64/575458/chrome-linux.zip (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')],)",),))

enquora commented 4 years ago

See https://github.com/miyakogi/pyppeteer/issues/219

pip install -U "urllib3<1.25"

This problem and fix should be highlighted in installation instructions.
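To see whether an environment falls on the side of the pin suggested above, the installed urllib3 version can be compared against 1.25. The following is only a rough sketch: the helper names `parse_version` and `is_affected` are made up for illustration, and the 1.25 threshold is taken from the `pip install` command in this comment, not from any pyppeteer documentation.

```python
def parse_version(version: str) -> tuple:
    """Turn a string like '1.25.3' into (1, 25, 3).

    Parsing stops at the first component that has no leading digits,
    so suffixes such as 'rc1' or 'a1' are ignored.
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_affected(version: str) -> bool:
    """Hypothetical check: True if the version is 1.25 or newer.

    The threshold is an assumption based on the "urllib3<1.25" pin
    suggested in this thread.
    """
    return parse_version(version)[:2] >= (1, 25)


if __name__ == "__main__":
    try:
        import urllib3
    except ImportError:
        print("urllib3 is not installed")
    else:
        print(urllib3.__version__, "affected:", is_affected(urllib3.__version__))
```

Running the script on the server prints the installed version and whether the pin would change anything.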

kiwi0fruit commented 4 years ago

I think it's a bad idea to use an insecure download. I recommend trying:

pip install pyppdf

Then, in your code, first use:

import pyppdf.patch_pyppeteer

That applies the patch, and nothing more is required (see this for details).
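For anyone wondering what a download with certificate verification left on looks like in general, here is a small standard-library sketch. It is not what the pyppdf patch does internally (that code isn't shown in this thread); the function names `make_verified_context` and `download` are hypothetical, and the URL and destination path are whatever the caller passes in.

```python
import shutil
import ssl
import urllib.request


def make_verified_context() -> ssl.SSLContext:
    """Build a TLS context that verifies the server certificate and hostname.

    ssl.create_default_context() loads the system's trusted CA certificates
    and enables both checks by default.
    """
    return ssl.create_default_context()


def download(url: str, dest: str, timeout: float = 60.0) -> None:
    """Stream `url` to the file `dest` over a verified TLS connection.

    If the certificate cannot be verified, urlopen raises an error
    instead of silently continuing.
    """
    ctx = make_verified_context()
    with urllib.request.urlopen(url, timeout=timeout, context=ctx) as resp:
        with open(dest, "wb") as out:
            shutil.copyfileobj(resp, out)
```

If a call like `download(url, "chrome-linux.zip")` fails with a certificate error on the same server, the problem is probably the machine's CA certificates rather than pyppeteer itself.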

k-osi commented 4 years ago

thanks @kiwi0fruit

m-ocean-it commented 4 years ago

Solved it for me. Thanks!

dnagl commented 4 years ago

thanks @kiwi0fruit it worked for me ;)