jsvine / waybackpack

Download the entire Wayback Machine archive for a given URL.
MIT License

INFO:waybackpack.session: HTTP status code: 302 #42

Open vzro opened 3 years ago

vzro commented 3 years ago

INFO:waybackpack.pack: Fetching acasaredonda.com.br @ 20200812061150
INFO:waybackpack.session: HTTP status code: 302
INFO:waybackpack.pack: Writing to /home/vz/acr_wbm_snapshot/20200812061150/acasaredonda.com.br/index.html

waybackpack only creates a 0-byte index.html file in the directory for each snapshot, and logs HTTP status code 302 for every fetch.

jsvine commented 3 years ago

Hello. Try using the --follow-redirects command-line option. Does that resolve your issue? (For all options, see this project’s README.md and/or run waybackpack -h.)
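For readers wondering why a 302 response would end up as a 0-byte index.html: a redirect response usually carries a Location header and little or no body, so a client that saves the response without following the redirect has nothing to write. The sketch below illustrates this with a toy local server built from the Python standard library. It is not the Wayback Machine and not waybackpack's own code; the handler class, the "/old" and "/new" paths, and the helper function names are all made up for the illustration.

```python
import http.client
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectingHandler(BaseHTTPRequestHandler):
    """Toy server (hypothetical): "/old" answers 302 with an empty body,
    any other path returns the actual page content."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(302)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"<html>archived page</html>"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_without_redirects(host, port, path):
    # http.client never follows redirects on its own.
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.read()
    finally:
        conn.close()


def fetch_following_redirects(host, port, path):
    # urllib follows 302 by default; the empty ProxyHandler avoids any
    # proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(f"http://{host}:{port}{path}", timeout=5) as resp:
        return resp.status, resp.read()


def run_demo():
    server = HTTPServer(("127.0.0.1", 0), RedirectingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        return (
            fetch_without_redirects(host, port, "/old"),
            fetch_following_redirects(host, port, "/old"),
        )
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    plain, followed = run_demo()
    print("without following redirects:", plain)
    print("following redirects:", followed)
```

The first fetch returns status 302 with an empty body, which matches the symptom reported above; the second reaches the real content because the redirect is followed.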

BEaXt7f97 commented 3 years ago

Hi,

I should add a disclaimer that I am not an experienced programmer, but I do my best to learn. I am running Python 3.9 and trying to set this up, but I am experiencing the following issue:

python@3.9/bin/python3.9: can't find '__main__' module in '/Users XXXXXXXXX

Any ideas?

rajat-np commented 3 years ago

I had a similar issue. Adding the --follow-redirects option solved it.

rajat-np commented 3 years ago

@BEaXt7f97 Can you provide steps to reproduce this error? Did you make any changes to the source code?

jwilk commented 1 year ago

This works:

waybackpack http://acasaredonda.com.br/ -d ...

But this (note the lack of slash after the domain name) causes spurious redirects:

waybackpack http://acasaredonda.com.br -d ...

From the browser's perspective these two URLs are equivalent, so they should both work the same way in waybackpack too.
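One possible way to treat the two forms the same is to normalize an empty path to "/" before making the request, which is roughly what browsers do. The sketch below uses only Python's urllib.parse; the function name with_root_path is hypothetical and is not part of waybackpack. Whether the Wayback Machine stores the same snapshots under both forms is not established in this thread, so this is an illustration of the idea rather than a verified fix.

```python
from urllib.parse import urlsplit, urlunsplit


def with_root_path(url):
    """Return url with an empty path replaced by "/" (hypothetical helper).

    URLs that already have a path, or that have no host part, are
    returned unchanged.
    """
    parts = urlsplit(url)
    if parts.netloc and parts.path == "":
        parts = parts._replace(path="/")
    return urlunsplit(parts)


if __name__ == "__main__":
    for url in ("http://acasaredonda.com.br", "http://acasaredonda.com.br/"):
        print(url, "->", with_root_path(url))
```

Both inputs come out as http://acasaredonda.com.br/, so a caller applying this before the request would send the form that is reported to work above.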

jsvine commented 1 year ago

Thanks for flagging, @jwilk; very interesting. It seems that, from the perspective of the Wayback Machine, these are different resources. It is a bit frustrating that they don't do any internal resolution, but I'd be wary of blindly choosing one URL over the other. I'd be interested in your perspective on ways to handle this, as well as the perspectives of anyone else with deep experience of or familiarity with the Wayback Machine's logic.