Open sscvbnm opened 6 years ago
Experiencing a similar error.
Downloading http://www.darkpersonalities.com to websites/www.darkpersonalities.com/ from Wayback Machine archives.
Getting snapshot pages/usr/lib/ruby/2.4.0/open-uri.rb:363:in `open_http': 403 Forbidden (OpenURI::HTTPError)
from /usr/lib/ruby/2.4.0/open-uri.rb:741:in `buffer_open'
from /usr/lib/ruby/2.4.0/open-uri.rb:212:in `block in open_loop'
from /usr/lib/ruby/2.4.0/open-uri.rb:210:in `catch'
from /usr/lib/ruby/2.4.0/open-uri.rb:210:in `open_loop'
from /usr/lib/ruby/2.4.0/open-uri.rb:151:in `open_uri'
from /usr/lib/ruby/2.4.0/open-uri.rb:721:in `open'
from /usr/lib/ruby/2.4.0/open-uri.rb:35:in `open'
from /home/ndevereaux/.gem/ruby/2.4.0/gems/wayback_machine_downloader-2.1.1/lib/wayback_machine_downloader/archive_api.rb:8:in `get_raw_list_from_api'
from /home/ndevereaux/.gem/ruby/2.4.0/gems/wayback_machine_downloader-2.1.1/lib/wayback_machine_downloader.rb:87:in `get_all_snapshots_to_consider'
from /home/ndevereaux/.gem/ruby/2.4.0/gems/wayback_machine_downloader-2.1.1/lib/wayback_machine_downloader.rb:104:in `get_file_list_curated'
from /home/ndevereaux/.gem/ruby/2.4.0/gems/wayback_machine_downloader-2.1.1/lib/wayback_machine_downloader.rb:131:in `get_file_list_by_timestamp'
from /home/ndevereaux/.gem/ruby/2.4.0/gems/wayback_machine_downloader-2.1.1/lib/wayback_machine_downloader.rb:270:in `file_list_by_timestamp'
from /home/ndevereaux/.gem/ruby/2.4.0/gems/wayback_machine_downloader-2.1.1/lib/wayback_machine_downloader.rb:154:in `download_files'
from /home/ndevereaux/.gem/ruby/2.4.0/gems/wayback_machine_downloader-2.1.1/bin/wayback_machine_downloader:68:in `<top (required)>'
from /home/ndevereaux/.gem/ruby/2.4.0/bin/wayback_machine_downloader:23:in `load'
from /home/ndevereaux/.gem/ruby/2.4.0/bin/wayback_machine_downloader:23:in `<main>'
Ditto.
Getting snapshot pages/usr/lib/ruby/2.3.0/open-uri.rb:359:in `open_http': 403 Forbidden (OpenURI::HTTPError)
from /usr/lib/ruby/2.3.0/open-uri.rb:737:in `buffer_open'
from /usr/lib/ruby/2.3.0/open-uri.rb:212:in `block in open_loop'
from /usr/lib/ruby/2.3.0/open-uri.rb:210:in `catch'
from /usr/lib/ruby/2.3.0/open-uri.rb:210:in `open_loop'
from /usr/lib/ruby/2.3.0/open-uri.rb:151:in `open_uri'
from /usr/lib/ruby/2.3.0/open-uri.rb:717:in `open'
from /usr/lib/ruby/2.3.0/open-uri.rb:35:in `open'
from /var/lib/gems/2.3.0/gems/wayback_machine_downloader-2.1.1/lib/wayback_machine_downloader/archive_api.rb:8:in `get_raw_list_from_api'
from /var/lib/gems/2.3.0/gems/wayback_machine_downloader-2.1.1/lib/wayback_machine_downloader.rb:87:in `get_all_snapshots_to_consider'
from /var/lib/gems/2.3.0/gems/wayback_machine_downloader-2.1.1/lib/wayback_machine_downloader.rb:104:in `get_file_list_curated'
from /var/lib/gems/2.3.0/gems/wayback_machine_downloader-2.1.1/lib/wayback_machine_downloader.rb:131:in `get_file_list_by_timestamp'
from /var/lib/gems/2.3.0/gems/wayback_machine_downloader-2.1.1/lib/wayback_machine_downloader.rb:141:in `list_files'
from /var/lib/gems/2.3.0/gems/wayback_machine_downloader-2.1.1/bin/wayback_machine_downloader:66:in `<top (required)>'
from /usr/local/bin/wayback_machine_downloader:22:in `load'
from /usr/local/bin/wayback_machine_downloader:22:in `<main>'
I'm experiencing the same issue for one site; I've successfully downloaded others with no problem.
Getting snapshot pagesC:/Ruby22-x64/lib/ruby/2.2.0/open-uri.rb:358:in `open_http': 403 Forbidden (OpenURI::HTTPError)
from C:/Ruby22-x64/lib/ruby/2.2.0/open-uri.rb:736:in `buffer_open'
from C:/Ruby22-x64/lib/ruby/2.2.0/open-uri.rb:211:in `block in open_loop'
from C:/Ruby22-x64/lib/ruby/2.2.0/open-uri.rb:209:in `catch'
from C:/Ruby22-x64/lib/ruby/2.2.0/open-uri.rb:209:in `open_loop'
from C:/Ruby22-x64/lib/ruby/2.2.0/open-uri.rb:150:in `open_uri'
from C:/Ruby22-x64/lib/ruby/2.2.0/open-uri.rb:716:in `open'
from C:/Ruby22-x64/lib/ruby/2.2.0/open-uri.rb:34:in `open'
from C:/Ruby22-x64/lib/ruby/gems/2.2.0/gems/wayback_machine_downloader-2.1.1/lib/wayback_machine_downloader/archive_api.rb:8:in `get_raw_list_from_api'
from C:/Ruby22-x64/lib/ruby/gems/2.2.0/gems/wayback_machine_downloader-2.1.1/lib/wayback_machine_downloader.rb:87:in `get_all_snapshots_to_consider'
from C:/Ruby22-x64/lib/ruby/gems/2.2.0/gems/wayback_machine_downloader-2.1.1/lib/wayback_machine_downloader.rb:104:in `get_file_list_curated'
from C:/Ruby22-x64/lib/ruby/gems/2.2.0/gems/wayback_machine_downloader-2.1.1/lib/wayback_machine_downloader.rb:131:in `get_file_list_by_timestamp'
from C:/Ruby22-x64/lib/ruby/gems/2.2.0/gems/wayback_machine_downloader-2.1.1/lib/wayback_machine_downloader.rb:141:in `list_files'
from C:/Ruby22-x64/lib/ruby/gems/2.2.0/gems/wayback_machine_downloader-2.1.1/bin/wayback_machine_downloader:66:in `<top (required)>'
from C:/Ruby22-x64/bin/wayback_machine_downloader:23:in `load'
from C:/Ruby22-x64/bin/wayback_machine_downloader:23:in
I got the same error for particular sites.
from C:/Ruby24/lib/ruby/2.4.0/open-uri.rb:741:in `buffer_open'
from C:/Ruby24/lib/ruby/2.4.0/open-uri.rb:212:in `block in open_loop'
from C:/Ruby24/lib/ruby/2.4.0/open-uri.rb:210:in `catch'
from C:/Ruby24/lib/ruby/2.4.0/open-uri.rb:210:in `open_loop'
from C:/Ruby24/lib/ruby/2.4.0/open-uri.rb:151:in `open_uri'
from C:/Ruby24/lib/ruby/2.4.0/open-uri.rb:721:in `open'
from C:/Ruby24/lib/ruby/2.4.0/open-uri.rb:35:in `open'
from C:/Ruby24/lib/ruby/gems/2.4.0/gems/wayback_machine_downloader-2.1.1/lib/wayback_machine_downloader/archive_api.rb:8:in `get_raw_list_from_api'
from C:/Ruby24/lib/ruby/gems/2.4.0/gems/wayback_machine_downloader-2.1.1/lib/wayback_machine_downloader.rb:87:in `get_all_snapshots_to_consider'
from C:/Ruby24/lib/ruby/gems/2.4.0/gems/wayback_machine_downloader-2.1.1/lib/wayback_machine_downloader.rb:104:in `get_file_list_curated'
from C:/Ruby24/lib/ruby/gems/2.4.0/gems/wayback_machine_downloader-2.1.1/lib/wayback_machine_downloader.rb:131:in `get_file_list_by_timestamp'
from C:/Ruby24/lib/ruby/gems/2.4.0/gems/wayback_machine_downloader-2.1.1/lib/wayback_machine_downloader.rb:270:in `file_list_by_timestamp'
from C:/Ruby24/lib/ruby/gems/2.4.0/gems/wayback_machine_downloader-2.1.1/lib/wayback_machine_downloader.rb:154:in `download_files'
from C:/Ruby24/lib/ruby/gems/2.4.0/gems/wayback_machine_downloader-2.1.1/bin/wayback_machine_downloader:68:in `<top (required)>'
from C:/Ruby24/bin/wayback_machine_downloader:23:in `load'
from C:/Ruby24/bin/wayback_machine_downloader:23:in `<main>'
@hartator is this a Ruby version issue? I tried with 2.3.1 (YARV/MRI) and it did not work, same error. I'll try w/an older version of Ruby as well today.
Okay, this has nothing to do with the gem or Ruby versions, and has everything to do with the web archive itself. This thread appears relevant:
https://archive.org/post/406632/why-does-the-wayback-machine-pay-attention-to-robotstxt
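If the 403 really does come from the archive refusing the snapshot list for certain sites, the gem can't do much beyond reporting it clearly. Here is a minimal sketch of what catching the error could look like. It is not the gem's actual code: `fetch_snapshot_list` and its `fetcher:` parameter are hypothetical names made up for illustration, and it does not work around the block, it only replaces the stack trace with a readable message.

```ruby
require 'open-uri'

# Hypothetical helper for illustration; NOT part of wayback_machine_downloader.
# `fetcher` is an assumed injection point so the sketch can be exercised
# without hitting the network. By default it reads the URL via open-uri.
def fetch_snapshot_list(url, fetcher: ->(u) { URI.parse(u).read })
  fetcher.call(url)
rescue OpenURI::HTTPError => e
  # open-uri puts the status line in the message, e.g. "403 Forbidden".
  # Anything other than a 403 is re-raised unchanged.
  raise unless e.message.start_with?('403')

  warn "The archive refused the snapshot list for #{url} (#{e.message})."
  warn 'This does not appear to depend on the gem or Ruby version.'
  nil
end
```

A caller would check for `nil` and skip that site instead of crashing, e.g. `list = fetch_snapshot_list(url) or next` inside a loop over several sites.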
OK, so what can we do now? How do we solve this issue?
I am getting "Getting snapshot pages/System/Library/Frameworks/Ruby.framework/Versions/2.0/usr/lib/ruby/2.0.0/open-uri.rb:353:in `open_http': 403 Forbidden (OpenURI::HTTPError)" as well for some sites, but not all. Has anyone found a workaround?
Hi. I want to download this site: https://web.archive.org/web/20071213053236/http://www.qquran.com:80/qu.php?goto=main
I get this error every time I run the request. I also tried changing the parameters and the way the main URL is typed, with the same result:
wayback_machine_downloader http://www.qquran.com:80/qu.php?goto=main -f 20071213053236
Downloading http://www.qquran.com:80/qu.php?goto=main to websites/www.qquran.com:80/ from Wayback Machine archives.
Getting snapshot pagesC:/Ruby24-x64/lib/ruby/2.4.0/open-uri.rb:363:in `open_http': 403 Forbidden (OpenURI::HTTPError)
from C:/Ruby24-x64/lib/ruby/2.4.0/open-uri.rb:741:in `buffer_open'
from C:/Ruby24-x64/lib/ruby/2.4.0/open-uri.rb:212:in `block in open_loop'
from C:/Ruby24-x64/lib/ruby/2.4.0/open-uri.rb:210:in `catch'
from C:/Ruby24-x64/lib/ruby/2.4.0/open-uri.rb:210:in `open_loop'
from C:/Ruby24-x64/lib/ruby/2.4.0/open-uri.rb:151:in `open_uri'
from C:/Ruby24-x64/lib/ruby/2.4.0/open-uri.rb:721:in `open'
from C:/Ruby24-x64/lib/ruby/2.4.0/open-uri.rb:35:in `open'
from C:/Ruby24-x64/lib/ruby/gems/2.4.0/gems/wayback_machine_downloader-2.1.1/lib/wayback_machine_downloader/archive_api.rb:8:in `get_raw_list_from_api'
from C:/Ruby24-x64/lib/ruby/gems/2.4.0/gems/wayback_machine_downloader-2.1.1/lib/wayback_machine_downloader.rb:87:in `get_all_snapshots_to_consider'
from C:/Ruby24-x64/lib/ruby/gems/2.4.0/gems/wayback_machine_downloader-2.1.1/lib/wayback_machine_downloader.rb:104:in `get_file_list_curated'
from C:/Ruby24-x64/lib/ruby/gems/2.4.0/gems/wayback_machine_downloader-2.1.1/lib/wayback_machine_downloader.rb:131:in `get_file_list_by_timestamp'
from C:/Ruby24-x64/lib/ruby/gems/2.4.0/gems/wayback_machine_downloader-2.1.1/lib/wayback_machine_downloader.rb:270:in `file_list_by_timestamp'
from C:/Ruby24-x64/lib/ruby/gems/2.4.0/gems/wayback_machine_downloader-2.1.1/lib/wayback_machine_downloader.rb:154:in `download_files'
from C:/Ruby24-x64/lib/ruby/gems/2.4.0/gems/wayback_machine_downloader-2.1.1/bin/wayback_machine_downloader:68:in `<top (required)>'
from C:/Ruby24-x64/bin/wayback_machine_downloader:22:in `load'
from C:/Ruby24-x64/bin/wayback_machine_downloader:22:in `<main>'
I tried downloading two other sites with NO problem.