PDF build: SSL errors, 'certificate verify failed'

drscotthawley commented 4 years ago

Hi. I'm just working through the documentation. Followed all the installation instructions. Trying to build the example mini_book as a PDF on Ubuntu:

(qe-mini-example) ~/quantecon-mini-example$ jupyter-book build ./mini_book --builder pdfhtml
Running Sphinx v2.4.4
loading pickled environment... done
building [mo]: targets for 0 po files that are out of date
building [singlehtml]: all documents
updating environment: 0 added, 0 changed, 0 removed
looking for now-outdated files... none found
preparing documents... done
assembling single document... docs/about_py docs/getting_started docs/python_by_example docs/learn_more done
writing... done
writing additional files... done
copying images... [100%] _static/lecture_specific/about_py/career_vf.png                                                                      
copying static files... ... done
copying extra files... done
dumping object inventory... done
build succeeded.

The HTML page is in mini_book/_build/html.
Finished generating HTML for book...
Converting book HTML into PDF...
[W:pyppeteer.chromium_downloader] start chromium download.
Download may take a few minutes.
Traceback (most recent call last):
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/urllib3/contrib/pyopenssl.py", line 488, in wrap_socket
    cnx.do_handshake()
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/OpenSSL/SSL.py", line 1934, in do_handshake
    self._raise_ssl_error(self._ssl, result)
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/OpenSSL/SSL.py", line 1671, in _raise_ssl_error
    _raise_current_error()
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/OpenSSL/_util.py", line 54, in exception_from_error_queue
    raise exception_type(errors)
OpenSSL.SSL.Error: [('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/urllib3/connectionpool.py", line 677, in urlopen
    chunked=chunked,
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/urllib3/connectionpool.py", line 381, in _make_request
    self._validate_conn(conn)
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/urllib3/connectionpool.py", line 976, in _validate_conn
    conn.connect()
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/urllib3/connection.py", line 370, in connect
    ssl_context=context,
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/urllib3/util/ssl_.py", line 377, in ssl_wrap_socket
    return context.wrap_socket(sock, server_hostname=server_hostname)
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/urllib3/contrib/pyopenssl.py", line 494, in wrap_socket
    raise ssl.SSLError("bad handshake: %r" % e)
ssl.SSLError: ("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')])",)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/shawley/anaconda3/envs/qe-mini-example/bin/jupyter-book", line 8, in <module>
    sys.exit(main())
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/jupyter_book/commands/__init__.py", line 163, in build
    html_to_pdf(OUTPUT_PATH.joinpath("index.html"), path_pdf_output)
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/jupyter_book/pdf.py", line 19, in html_to_pdf
    asyncio.get_event_loop().run_until_complete(_html_to_pdf(html_file, pdf_file))
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/asyncio/base_events.py", line 583, in run_until_complete
    return future.result()
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/jupyter_book/pdf.py", line 31, in _html_to_pdf
    browser = await launch(args=["--no-sandbox"])
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/pyppeteer/launcher.py", line 305, in launch
    return await Launcher(options, **kwargs).launch()
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/pyppeteer/launcher.py", line 119, in __init__
    download_chromium()
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/pyppeteer/chromium_downloader.py", line 146, in download_chromium
    extract_zip(download_zip(get_url()), DOWNLOADS_FOLDER / REVISION)
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/pyppeteer/chromium_downloader.py", line 85, in download_zip
    data = http.request('GET', url, preload_content=False)
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/urllib3/request.py", line 76, in request
    method, url, fields=fields, headers=headers, **urlopen_kw
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/urllib3/request.py", line 97, in request_encode_url
    return self.urlopen(method, url, **extra_kw)
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/urllib3/poolmanager.py", line 336, in urlopen
    response = conn.urlopen(method, u.request_uri, **kw)
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/urllib3/connectionpool.py", line 765, in urlopen
    **response_kw
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/urllib3/connectionpool.py", line 765, in urlopen
    **response_kw
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/urllib3/connectionpool.py", line 765, in urlopen
    **response_kw
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/urllib3/connectionpool.py", line 725, in urlopen
    method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
  File "/home/shawley/anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/urllib3/util/retry.py", line 439, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='storage.googleapis.com', port=443): Max retries exceeded with url: /chromium-browser-snapshots/Linux_x64/588429/chrome-linux.zip (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')])")))

choldgraf commented 4 years ago

hmmm - it seems from the error message that the problem here is during the chromium install: [W:pyppeteer.chromium_downloader] start chromium download.
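
To narrow this down before touching jupyter-book, one option is to check whether the machine can verify the download host's certificate at all. The sketch below is only a diagnostic under stated assumptions: the host and path are copied from the MaxRetryError in the traceback, the helper names are hypothetical, and it uses Python's standard ssl module rather than the urllib3/pyOpenSSL stack shown in the traceback, so its result may differ from what pyppeteer sees.

```python
import socket
import ssl

# Host and path copied from the MaxRetryError in the traceback.
CHROMIUM_HOST = "storage.googleapis.com"
CHROMIUM_PATH = "/chromium-browser-snapshots/Linux_x64/588429/chrome-linux.zip"


def chromium_url(host=CHROMIUM_HOST, path=CHROMIUM_PATH):
    """Rebuild the URL that the failing download was requesting."""
    return f"https://{host}{path}"


def can_verify_certificate(host=CHROMIUM_HOST, port=443, timeout=10.0):
    """Try a TLS handshake with the default CA bundle; return (ok, detail).

    Hypothetical helper, not part of pyppeteer or jupyter-book.
    """
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return True, "certificate verified"
    except ssl.SSLError as exc:
        # Includes certificate verification failures.
        return False, f"TLS failure: {exc}"
    except OSError as exc:
        # Connection refused, DNS failure, timeout, and similar.
        return False, f"network failure: {exc}"
```

Calling can_verify_certificate() with no arguments checks the host from the traceback. A "TLS failure" result would point at the local CA bundle rather than at jupyter-book itself.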

Searching around for it led me to this issue, with a fix that recommends downloading pyppdf: https://github.com/miyakogi/pyppeteer/issues/258#issuecomment-563075764

So I wonder if we should be recommending that approach in the docs instead? https://pypi.org/project/pyppdf/

drscotthawley commented 4 years ago

Thanks for the fast reply! I just tried pip install pyppdf and re-ran the build, but got the same error.
Where should the import pyppdf.patch_pyppeteer be added?

choldgraf commented 4 years ago

I got the impression that it would only need to be run once, but maybe I'm wrong? I can try running down the bug over the weekend

drscotthawley commented 4 years ago

I just now inserted it near the top of anaconda3/envs/qe-mini-example/lib/python3.7/site-packages/jupyter_book/commands/__init__.py, re-ran the build, and everything worked!

(probably not where you want to put it, but I'm brand-new to this project.)
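
A way to get the same effect without editing files inside site-packages might be a small wrapper that imports the patch and then hands control to the jupyter-book CLI. This is a minimal sketch, assuming that importing pyppdf.patch_pyppeteer is what applies the fix (as reported above) and that jupyter_book.commands.main is the click entry point seen in the traceback; run_patched_cli is a hypothetical name.

```python
def run_patched_cli(argv):
    """Apply the pyppdf patch, then pass argv to the jupyter-book CLI."""
    # Assumption: the import alone applies the patch, as reported in this thread.
    import pyppdf.patch_pyppeteer  # noqa: F401

    # Assumption: this is the click entry point that the traceback shows being invoked.
    from jupyter_book.commands import main as jb_main

    # click exits the interpreter itself once the command finishes.
    jb_main(args=list(argv))
```

For the build in this issue, the call would be run_patched_cli(["build", "./mini_book", "--builder", "pdfhtml"]). Both imports are deferred into the function so that the module can be loaded even where pyppdf is not installed.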

choldgraf commented 4 years ago

wow nice! can you try it again without having the patch code at the top, and see if that works? I think downloading chromium only needs to happen once, so maybe we can add this as a one-off script

drscotthawley commented 4 years ago

Ok, I deleted the final .pdf just in case the script checks for that before building. Then I commented out the new import statement, re-ran the build script, ...and got the original error again.

choldgraf commented 4 years ago

OK that is helpful to know, thanks for reporting back!

DavidPowell commented 4 years ago

I have experienced this same error, with jupyter-book 0.7.0b2, under Windows Subsystem for Linux. Since the problem seems to be related to the installation of chrome, I tried running the command pyppeteer-install, which seemed to run fine.

Now I no longer see the certificate error. Instead I see the output now freezes at Converting book HTML into PDF..., apparently indefinitely.

DavidPowell commented 4 years ago

Today I upgraded jupyter-book to 0.7.0b4. Now I get a new error http.client.BadStatusLine: GET /json/version HTTP/1.1. So the saga continues!

choldgraf commented 4 years ago

yeah, headless chrome printing is definitely an unstable feature here :-/ I'm not sure if it is just the nature of running headless chrome (or maybe specifically on WSL, though I've done this and haven't had a problem) or if we just haven't figured out the right installation pathway to do it.

I wonder if there's a conda-forge recipe for pyppeteer...

drscotthawley commented 4 years ago

Confirming that (in a new environment, without editing the python source) as @DavidPowell suggests, running pyppeteer-install fixed things.
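
For anyone scripting this, the two steps could be chained so that the Chromium download always happens before the build. This is a sketch only: the function names are hypothetical, and the two commands are the ones quoted in this thread.

```python
import shutil
import subprocess


def pdf_build_commands(book_dir):
    """Return the two commands from this thread, in the order they should run."""
    return [
        ["pyppeteer-install"],
        ["jupyter-book", "build", book_dir, "--builder", "pdfhtml"],
    ]


def run_pdf_build(book_dir):
    """Run each command in turn, stopping at the first failure."""
    for command in pdf_build_commands(book_dir):
        if shutil.which(command[0]) is None:
            raise FileNotFoundError(f"{command[0]} is not on PATH")
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(command, check=True)
```

run_pdf_build("./mini_book") would then reproduce the manual sequence described above.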

...although the resulting PDF doesn't have any TOC, book title, logo image, etc. It just starts immediately with a rendering of the first Markdown file. :-(

This is for jupyter-book 0.7.0 which pip says is the latest available. Haven't tried pulling jupyter-book from github.

choldgraf commented 4 years ago

hmmm - well for the HTML PDF output, I think that it strips away much of the "interactive" parts of the book (e.g. the left navbar). the right TOC should still make it in though. Can you post a screenshot of the PDF?

palmoreck commented 3 years ago

@drscotthawley thank you for your messages. I built a docker image with some instructions I found here and on other sites:

https://github.com/palmoreck/dockerfiles-for-binder/tree/jupyterlab_optimizacion

And here's an example of a book that I've been writing with this amazing project of executablebooks/jupyter-book 😃

https://github.com/ITAM-DS/analisis-numerico-computo-cientifico#para-convertir-notas-a-pdf

Instructions are in Spanish ... but basically I'm using binder to build the book because I just have the first chapter and its size is approx 6.5 MB (the book will have 4 chapters approx). After clicking the binder button I execute in a terminal:

cd analisis-numerico-computo-cientifico/libro_optimizacion/temas/
jb build . --builder pdfhtml

After some time, the PDF will appear in the directory analisis-numerico-computo-cientifico/libro_optimizacion/temas/_build/pdf.

Hope this will be useful to someone.

Cheers 👋

dafriedman97 commented 3 years ago

After running pip install pyppdf and adding import pyppdf.patch_pyppeteer to the file @drscotthawley suggested, I got the same original error. After running pyppeteer-install I get around that error but get a new one, copied below. Any idea what I should change? Thanks!

Traceback (most recent call last):
  File "/opt/anaconda3/bin/jb", line 8, in <module>
    sys.exit(main())
  File "/opt/anaconda3/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/opt/anaconda3/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/opt/anaconda3/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/anaconda3/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/anaconda3/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/opt/anaconda3/lib/python3.7/site-packages/jupyter_book/commands/__init__.py", line 260, in build
    result, builder, OUTPUT_PATH, build_type, PAGE_NAME, click.echo
  File "/opt/anaconda3/lib/python3.7/site-packages/jupyter_book/commands/__init__.py", line 535, in builder_specific_actions
    html_to_pdf(output_path.joinpath("index.html"), path_pdf_output)
  File "/opt/anaconda3/lib/python3.7/site-packages/jupyter_book/pdf.py", line 20, in html_to_pdf
    asyncio.get_event_loop().run_until_complete(_html_to_pdf(html_file, pdf_file))
  File "/opt/anaconda3/lib/python3.7/asyncio/base_events.py", line 583, in run_until_complete
    return future.result()
  File "/opt/anaconda3/lib/python3.7/site-packages/jupyter_book/pdf.py", line 39, in _html_to_pdf
    await page.goto(f"file:///{html_file}", {"waitUntil": ["networkidle0"]})
  File "/opt/anaconda3/lib/python3.7/site-packages/pyppeteer/page.py", line 885, in goto
    raise error
pyppeteer.errors.TimeoutError: Navigation Timeout Exceeded: 30000 ms exceeded.

palmoreck commented 3 years ago

@dafriedman97 Maybe try this:

https://github.com/palmoreck/dockerfiles-for-binder/tree/jupyterlab_optimizacion#changes-for-palmoreckjupyterlab_optimizacion_binder_test_for_pdf214

I solved something related to that one by manually editing pyppeteer/page.py, changing the line that sets 30000 # milliseconds to self._defaultNavigationTimeout = 30000000000 # milliseconds

dafriedman97 commented 3 years ago

That worked! Thanks so much. (For me, the file was at /opt/anaconda3/lib/python3.7/site-packages/pyppeteer/page.py). Still no images in the PDF output though :(
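
The manual edit to page.py can also be scripted so that it is safe to repeat. The sketch below is one way to do it, assuming the line in pyppeteer/page.py still reads self._defaultNavigationTimeout = 30000 as described above; the function names are hypothetical and nothing here is part of pyppeteer's API.

```python
import re
from pathlib import Path

# Matches only the original default; \b stops it from matching an already-patched value.
_TIMEOUT_LINE = re.compile(r"(self\._defaultNavigationTimeout\s*=\s*)30000\b")


def raise_navigation_timeout(source, new_timeout_ms=30000000000):
    """Return the source text with the 30000 ms default replaced."""
    return _TIMEOUT_LINE.sub(lambda m: f"{m.group(1)}{new_timeout_ms}", source)


def patch_file(path):
    """Rewrite the file at path in place with the larger timeout."""
    target = Path(path)
    target.write_text(raise_navigation_timeout(target.read_text()))
```

patch_file would be pointed at the page.py path for the environment in use, for example the one mentioned above. Running it a second time leaves the file unchanged.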

paddyroddy commented 3 years ago

My book is very long in PDF form, and increasing the timeout as @palmoreck suggested did the trick. Anyone wanting to do this in an automated build could run the following before building the PDF:

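# Find site-packages from pip's location (dirname instead of rstrip('/pip'), which strips characters, not a suffix)
SITE_PACKAGES=$(python -c "import os, pip; print(os.path.dirname(pip.__path__[0]))")
# \b (GNU sed) keeps a second run from matching the already-patched value
sed -i "s/self._defaultNavigationTimeout = 30000\b/self._defaultNavigationTimeout = 30000000000/" "$SITE_PACKAGES/pyppeteer/page.py"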

executablebooks / jupyter-book

PDF build: SSL errors, 'certificate verify failed' #593