palewire / savepagenow

A simple Python wrapper and command-line interface for archive.org’s "Save Page Now" capturing service
https://palewi.re/docs/savepagenow/
MIT License

sun.security.validator.ValidatorException: PKIX path building failed #21

Closed: yookoala closed this issue 3 years ago

yookoala commented 4 years ago

Got this error when saving a specific page:

Traceback (most recent call last):
  File "/home/user/my-project/.venv/bin/savepagenow", line 10, in <module>
    sys.exit(cli())
  File "/home/user/my-project/.venv/lib64/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/user/my-project/.venv/lib64/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/user/my-project/.venv/lib64/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/user/my-project/.venv/lib64/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/user/my-project/.venv/lib64/python3.7/site-packages/savepagenow/api.py", line 127, in cli
    archive_url = capture(url, **kwargs)
  File "/home/user/my-project/.venv/lib64/python3.7/site-packages/savepagenow/api.py", line 42, in capture
    raise WaybackRuntimeError(error_header)
LiveDocumentNotAvailableException: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable

The code I'm using:

import savepagenow

archive_url, captured = savepagenow.capture_or_cache("https://www.chp.gov.hk/files/pdf/local_situation_covid19_tc.pdf")
print(archive_url)

The same code works normally for other URLs.

I presume this is a server-side error, because I don't think this library calls Java on my workstation. Strangely, if I submit the same URL through the Save Page Now web form, it does create a snapshot. What is going on?

The URL I'm trying to archive: https://www.chp.gov.hk/files/pdf/local_situation_covid19_tc.pdf

palewire commented 4 years ago

To me this reads like some kind of SSL issue with the web request made from your script.

yookoala commented 4 years ago

Is there any option to keep the SSL error from stopping the program?
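
For what it's worth, here is a minimal sketch of one way to stop a failed capture from ending the script. It does not bypass the SSL failure itself: if, as presumed above, the handshake fails on the archive's side, a client-side option may not change that. The helper name `capture_without_crashing` and its parameters are hypothetical and not part of savepagenow; the capture function is passed in so the sketch does not assume anything about the library beyond the call shown earlier.

```python
from typing import Callable, Optional, Tuple


def capture_without_crashing(
    url: str,
    capture_func: Callable[[str], object],
    error_types: Tuple[type, ...] = (Exception,),
) -> Tuple[Optional[object], Optional[str]]:
    """Call capture_func(url) and report failures instead of raising.

    Hypothetical helper, not part of savepagenow.
    Returns (result, None) on success or (None, error_message) on failure.
    """
    try:
        return capture_func(url), None
    except error_types as exc:
        # Turn the exception into a message so the caller can log it
        # and carry on with the next URL.
        return None, f"{type(exc).__name__}: {exc}"
```

With savepagenow installed, passing `savepagenow.capture_or_cache` as `capture_func` should give back the same `(archive_url, captured)` tuple as the snippet above on success, and `None` plus the error text when the capture fails. Catching every `Exception` is deliberately broad for a sketch; narrowing `error_types` to the `WaybackRuntimeError` class named in the traceback would be tidier, though its import path is not shown there.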