palewire / savepagenow

A simple Python wrapper and command-line interface for archive.org’s "Save Page Now" capturing service
https://palewi.re/docs/savepagenow/
MIT License

cli always returns errors for newly archived pages #36

Closed SuperSandro2000 closed 1 year ago

SuperSandro2000 commented 3 years ago

When I archive a page for the first time I encounter the following error:

Traceback (most recent call last):
  File "/nix/store/1k494cir39cjbbz01d62qlc5rxhf9x9y-savepagenow-1.1.1/bin/.savepagenow-wrapped", line 9, in <module>
    sys.exit(cli())
  File "/nix/store/aph3vafycfbj9325vgcm7y1zlx5v88j6-python3.8-click-7.1.2/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/nix/store/aph3vafycfbj9325vgcm7y1zlx5v88j6-python3.8-click-7.1.2/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/nix/store/aph3vafycfbj9325vgcm7y1zlx5v88j6-python3.8-click-7.1.2/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/nix/store/aph3vafycfbj9325vgcm7y1zlx5v88j6-python3.8-click-7.1.2/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/nix/store/1k494cir39cjbbz01d62qlc5rxhf9x9y-savepagenow-1.1.1/lib/python3.8/site-packages/savepagenow/api.py", line 121, in cli
    archive_url = capture(url, **kwargs)
  File "/nix/store/1k494cir39cjbbz01d62qlc5rxhf9x9y-savepagenow-1.1.1/lib/python3.8/site-packages/savepagenow/api.py", line 39, in capture
    response = requests.get(request_url, headers=headers)
  File "/nix/store/bl3w59fnzcmb76f2r9q8nc8wk6jnycfa-python3.8-requests-2.25.0/lib/python3.8/site-packages/requests/api.py", line 76, in get
    return request('get', url, params=params, **kwargs)
  File "/nix/store/bl3w59fnzcmb76f2r9q8nc8wk6jnycfa-python3.8-requests-2.25.0/lib/python3.8/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/nix/store/bl3w59fnzcmb76f2r9q8nc8wk6jnycfa-python3.8-requests-2.25.0/lib/python3.8/site-packages/requests/sessions.py", line 542, in request
    resp = self.send(prep, **send_kwargs)
  File "/nix/store/bl3w59fnzcmb76f2r9q8nc8wk6jnycfa-python3.8-requests-2.25.0/lib/python3.8/site-packages/requests/sessions.py", line 677, in send
    history = [resp for resp in gen]
  File "/nix/store/bl3w59fnzcmb76f2r9q8nc8wk6jnycfa-python3.8-requests-2.25.0/lib/python3.8/site-packages/requests/sessions.py", line 677, in <listcomp>
    history = [resp for resp in gen]
  File "/nix/store/bl3w59fnzcmb76f2r9q8nc8wk6jnycfa-python3.8-requests-2.25.0/lib/python3.8/site-packages/requests/sessions.py", line 166, in resolve_redirects
    raise TooManyRedirects('Exceeded {} redirects.'.format(self.max_redirects), response=resp)
requests.exceptions.TooManyRedirects: Exceeded 30 redirects.

I tried it with https://github.com/NixOS/nixpkgs/pull/113143

P3lUZa commented 3 years ago

I'm working with a list of links, and eventually I hit the same problem:

    raise TooManyRedirects('Exceeded {} redirects.'.format(self.max_redirects), response=resp)
    requests.exceptions.TooManyRedirects: Exceeded 30 redirects.

Obviously we can handle it with try: and except TooManyRedirects:
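
For anyone processing a list of links, the try/except idea might look roughly like the sketch below. It is only an illustration: `archive_all`, `capture`, and `skip_on` are hypothetical names, and the capture function and exception types are passed in as arguments so the snippet runs without savepagenow or requests installed.

```python
from typing import Callable, Iterable, List, Tuple, Type


def archive_all(
    urls: Iterable[str],
    capture: Callable[[str], str],
    skip_on: Tuple[Type[BaseException], ...],
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Capture each URL, skipping (and recording) any that raise `skip_on`."""
    archived: List[str] = []
    failed: List[Tuple[str, str]] = []
    for url in urls:
        try:
            archived.append(capture(url))
        except skip_on as error:
            # Record the failure and move on to the next link.
            failed.append((url, type(error).__name__))
    return archived, failed
```

With the real libraries, this could presumably be called as `archive_all(links, savepagenow.capture, (requests.exceptions.TooManyRedirects,))`. Exceptions not listed in `skip_on` still propagate, so unexpected failures are not silently swallowed.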

palewire commented 2 years ago

Weird. Is it possible to recreate this?

P3lUZa commented 2 years ago

Not anymore, at least for the application I used it for (backing up Yahoo Answers links), but my solution was to put:

    # Fragment from inside a loop (hence the break statements); the
    # exception classes are assumed to be imported already.
    try:
        k = savepagenow.capture_or_cache(y_url)
        print(k[0])
        r = k[0]
        wayback.write(r + '\n')
        break

    # This will handle wayback errors:
    except WaybackRuntimeError as error:
        print(error)
        wayback_errors.write(line)
        wayback_errors.write('\n')
        print("error in: " + line + '\n')
        break
    except ConnectionError:
        wayback_errors.write("connection error in: " + line + '\n')
        print("connection error in: " + line + '\n')
        x = x + 1
    except TooManyRedirects:
        wayback_errors.write("too_many_redirects error in: " + line + '\n')
        print("too_many_red error in: " + line + '\n')
        x = x + 1

Here is my code: https://github.com/P3lUZa/Yahoo-answers-to-Wayback-Machine/blob/main/main.py
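
The fragment above appears to count failed attempts in `x` and try again. A bounded version of that retry idea could be sketched as follows. This is not the code from the linked repository: `capture_with_retries`, `retry_on`, `max_attempts`, and `delay_seconds` are hypothetical names, and the capture function is injected so the sketch does not depend on savepagenow itself.

```python
import time
from typing import Callable, Optional, Tuple, Type


def capture_with_retries(
    url: str,
    capture: Callable[[str], str],
    retry_on: Tuple[Type[BaseException], ...],
    max_attempts: int = 3,
    delay_seconds: float = 0.0,
) -> Optional[str]:
    """Return the archive URL, or None if every attempt raised `retry_on`."""
    for attempt in range(1, max_attempts + 1):
        try:
            return capture(url)
        except retry_on:
            # Give up after the last attempt; otherwise wait and try again.
            if attempt == max_attempts:
                return None
            time.sleep(delay_seconds)
    return None
```

Since `capture_or_cache` is used above as returning a sequence whose first item is the URL, it would need a small wrapper here, for example `lambda u: savepagenow.capture_or_cache(u)[0]`. Errors outside `retry_on` are raised immediately, mirroring how the fragment treats `WaybackRuntimeError` as final.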

palewire commented 1 year ago

I'm going to close this as stale. If you are still having issues please speak up.