Open Irenik515 opened 6 months ago
Same error
`Getting snapshot pages................../usr/lib/ruby/3.1.0/socket.rb:1214:in __connect_nonblock': Failed to open TCP connection to web.archive.org:443 (Connection refused - connect(2) for 207.241.237.3:443) (Errno::ECONNREFUSED) from /usr/lib/ruby/3.1.0/socket.rb:1214:in connect_nonblock'
from /usr/lib/ruby/3.1.0/socket.rb:56:in connect_internal' from /usr/lib/ruby/3.1.0/socket.rb:137:in connect'
from /usr/lib/ruby/3.1.0/socket.rb:642:in block in tcp' from /usr/lib/ruby/3.1.0/socket.rb:227:in each'
from /usr/lib/ruby/3.1.0/socket.rb:227:in foreach' from /usr/lib/ruby/3.1.0/socket.rb:632:in tcp'
from /usr/lib/ruby/3.1.0/net/http.rb:998:in connect' from /usr/lib/ruby/3.1.0/net/http.rb:976:in do_start'
from /usr/lib/ruby/3.1.0/net/http.rb:965:in start' from /usr/lib/ruby/3.1.0/open-uri.rb:323:in open_http'
from /usr/lib/ruby/3.1.0/open-uri.rb:741:in buffer_open' from /usr/lib/ruby/3.1.0/open-uri.rb:212:in block in open_loop'
from /usr/lib/ruby/3.1.0/open-uri.rb:210:in catch' from /usr/lib/ruby/3.1.0/open-uri.rb:210:in open_loop'
from /usr/lib/ruby/3.1.0/open-uri.rb:151:in open_uri' from /usr/lib/ruby/3.1.0/open-uri.rb:721:in open'
from /var/lib/gems/3.1.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader/archive_api.rb:13:in get_raw_list_from_api' from /var/lib/gems/3.1.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:92:in block in get_all_snapshots_to_consider'
from /var/lib/gems/3.1.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:91:in times' from /var/lib/gems/3.1.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:91:in get_all_snapshots_to_consider'
from /var/lib/gems/3.1.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:105:in get_file_list_curated' from /var/lib/gems/3.1.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:164:in get_file_list_by_timestamp'
from /var/lib/gems/3.1.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:309:in file_list_by_timestamp' from /var/lib/gems/3.1.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:192:in download_files'
from /var/lib/gems/3.1.0/gems/wayback_machine_downloader-2.3.1/bin/wayback_machine_downloader:72:in <top (required)>' from /usr/local/bin/wayback_machine_downloader:25:in load'
from /usr/local/bin/wayback_machine_downloader:25:in <main>'`
Same error.
.C:/Ruby31-x64/lib/ruby/3.1.0/net/http.rb:1001:in rescue in connect': Failed to open TCP connection to web.archive.org:443 (A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. - user specified timeout) (Net::OpenTimeout) from C:/Ruby31-x64/lib/ruby/3.1.0/net/http.rb:997:in
connect'
from C:/Ruby31-x64/lib/ruby/3.1.0/net/http.rb:976:in do_start' from C:/Ruby31-x64/lib/ruby/3.1.0/net/http.rb:965:in
start'
from C:/Ruby31-x64/lib/ruby/3.1.0/open-uri.rb:323:in open_http' from C:/Ruby31-x64/lib/ruby/3.1.0/open-uri.rb:741:in
buffer_open'
from C:/Ruby31-x64/lib/ruby/3.1.0/open-uri.rb:212:in block in open_loop' from C:/Ruby31-x64/lib/ruby/3.1.0/open-uri.rb:210:in
catch'
from C:/Ruby31-x64/lib/ruby/3.1.0/open-uri.rb:210:in open_loop' from C:/Ruby31-x64/lib/ruby/3.1.0/open-uri.rb:151:in
open_uri'
from C:/Ruby31-x64/lib/ruby/3.1.0/open-uri.rb:721:in open' from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader/archive_api.rb:13:in
get_raw_list_from_api'
from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:92:in block in get_all_snapshots_to_consider' from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:91:in
times'
from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:91:in get_all_snapshots_to_consider' from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:105:in
get_file_list_curated'
from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:164:in get_file_list_by_timestamp' from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:309:in
file_list_by_timestamp'
from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:192:in download_files' from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/wayback_machine_downloader-2.3.1/bin/wayback_machine_downloader:72:in
<top (required)>'
from C:/Ruby31-x64/bin/wayback_machine_downloader:32:in load' from C:/Ruby31-x64/bin/wayback_machine_downloader:32:in
connect_internal': A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. - user specified timeout (Errno::ETIMEDOUT) from C:/Ruby31-x64/lib/ruby/3.1.0/socket.rb:137:in
connect'
from C:/Ruby31-x64/lib/ruby/3.1.0/socket.rb:642:in block in tcp' from C:/Ruby31-x64/lib/ruby/3.1.0/socket.rb:227:in
each'
from C:/Ruby31-x64/lib/ruby/3.1.0/socket.rb:227:in foreach' from C:/Ruby31-x64/lib/ruby/3.1.0/socket.rb:632:in
tcp'
from C:/Ruby31-x64/lib/ruby/3.1.0/net/http.rb:998:in connect' from C:/Ruby31-x64/lib/ruby/3.1.0/net/http.rb:976:in
do_start'
from C:/Ruby31-x64/lib/ruby/3.1.0/net/http.rb:965:in start' from C:/Ruby31-x64/lib/ruby/3.1.0/open-uri.rb:323:in
open_http'
from C:/Ruby31-x64/lib/ruby/3.1.0/open-uri.rb:741:in buffer_open' from C:/Ruby31-x64/lib/ruby/3.1.0/open-uri.rb:212:in
block in open_loop'
from C:/Ruby31-x64/lib/ruby/3.1.0/open-uri.rb:210:in catch' from C:/Ruby31-x64/lib/ruby/3.1.0/open-uri.rb:210:in
open_loop'
from C:/Ruby31-x64/lib/ruby/3.1.0/open-uri.rb:151:in open_uri' from C:/Ruby31-x64/lib/ruby/3.1.0/open-uri.rb:721:in
open'
from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader/archive_api.rb:13:in get_raw_list_from_api' from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:92:in
block in get_all_snapshots_to_consider'
from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:91:in times' from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:91:in
get_all_snapshots_to_consider'
from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:105:in get_file_list_curated' from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:164:in
get_file_list_by_timestamp'
from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:309:in file_list_by_timestamp' from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:192:in
download_files'
from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/wayback_machine_downloader-2.3.1/bin/wayback_machine_downloader:72:in <top (required)>' from C:/Ruby31-x64/bin/wayback_machine_downloader:32:in
load'
from C:/Ruby31-x64/bin/wayback_machine_downloader:32:in `
I was having this problem. I fought with it for several days before getting to a combo of things that worked.
I added the sleep line that's mentioned in an earlier thread, but I had to make it 10 instead of 3. I also couldn't use the switch that downloads multiple pages at the same time. Finally, I had to wait about 12 hours for whatever block archive.org had applied to reset and let me connect again.
Since doing those things, I've been pulling pages from a large site for more than 12 hours with no errors.
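For anyone wondering what "add a sleep and don't download in parallel" amounts to, here is a minimal sketch of the idea. It is not the gem's actual code: `run_throttled`, `pause_seconds` and `sleeper` are hypothetical names made up for illustration, and the jobs are stand-ins for whatever fetches one page.

```ruby
# Sketch only: run jobs one at a time with a fixed pause between them.
# `run_throttled`, `pause_seconds` and `sleeper` are hypothetical names,
# not part of wayback_machine_downloader.
def run_throttled(jobs, pause_seconds: 10, sleeper: ->(s) { sleep(s) })
  results = []
  jobs.each_with_index do |job, index|
    results << job.call
    # No need to wait after the final job.
    sleeper.call(pause_seconds) unless index == jobs.size - 1
  end
  results
end

# Stand-in jobs; a real job would fetch one page from the archive.
# The sleeper is swapped for a recorder so this example finishes instantly.
pauses = []
jobs = [-> { :page_one }, -> { :page_two }, -> { :page_three }]
results = run_throttled(jobs, pause_seconds: 10, sleeper: ->(s) { pauses << s })
```

Leaving out the `sleeper:` argument makes it really sleep for 10 seconds between jobs, which is the value that worked in the comment above.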
Looks like I've solved the problem. The solution from here worked for me:
See #280
Thankyou thankyou thankyou to everyone who's worked on this.
I finally got the downloader working again, so I'll try to explain it for non-ruby people like me who were totally lost.
There are possibly other ways to do this but I have no idea. (see my username).
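My method of repair is perhaps much easier as it only requires easy file download/copy/paste:
- install `wayback_machine_downloader` the recommended way (follow the Installation instructions in the README)
- replace the local versions of the following two files with updated copies from #280
  - wayback_machine_downloader.rb at `~/.gem/ruby/3.3.0/gems/wayback_machine_downloader-2.3.1/lib/`
  - archive_api.rb at `~/.gem/ruby/3.3.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader/`
- don't update `wayback_machine_downloader` using gem or it will break again

Notes:
- the version of ruby you have installed might be 3.3.0, or 3.3.1, etc.
- these instructions are for macOS, you'll need to adapt for Linux/Windows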
My method, perhaps much easier as it only requires easy file download/copy/paste:
- install `wayback_machine_downloader` the recommended way
- replace the local versions of the following two files at `~/.gem/ruby/3.3.0/gems/wayback_machine_downloader-2.3.1/lib/` with these updated copies, from #280
- don't update `wayback_machine_downloader` using gem
I've followed these instructions and now I receive the following error:
Getting snapshot pagesC:/Ruby33-x64/lib/ruby/gems/3.3.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader/archive_api.rb:6:in `get_raw_list_from_api': wrong number of arguments (given 3, expected 2) (ArgumentError)
I'm new to Ruby as well, dunno how to proceed.
Simply download and copy the files again, there's no need to edit them at all.
For the error you have posted, it seems the downloaded files have been changed or reformatted in some way.
Right click the i. and ii. links above, they are direct links to the files that need saving.
can confirm having the same issue after downloading i. and ii. as of 2024-07-04
The issue is that the ii. file has to be placed in `gems\wayback_machine_downloader-2.3.1\lib\wayback_machine_downloader\` rather than `gems\wayback_machine_downloader-2.3.1\lib\`. After placing the file there it now works.
The answer was in front of my eyes all the time. The path where "archive_api.rb" must be replaced is in the error message itself.
Getting snapshot pages ----> C:/Ruby33-x64/lib/ruby/gems/3.3.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader/ <----- archive_api.rb:6:in `get_raw_list_from_api': wrong number of arguments (given 3, expected 2) (ArgumentError)
So I just went to "C:/Ruby33-x64/lib/ruby/gems/3.3.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader" and replaced the "archive_api.rb" with the one gingerbeardman provided above.
Problem solved. Thanks for the help.
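Since the mix-up was about which folder each file belongs in, here is a small sketch that prints where the two replacement files are expected to live and whether they are present. `replacement_targets` is a hypothetical helper written for this thread, and the `gem_dir` value is just the Windows path from the error message above, so adjust it to your own install.

```ruby
# Sketch: given the gem's install directory, map each replacement file
# to the folder it belongs in. `replacement_targets` is a made-up helper.
def replacement_targets(gem_dir)
  lib = File.join(gem_dir, "lib")
  {
    "wayback_machine_downloader.rb" => lib,
    "archive_api.rb" => File.join(lib, "wayback_machine_downloader")
  }
end

# Assumed install location, taken from the error message in this thread.
gem_dir = "C:/Ruby33-x64/lib/ruby/gems/3.3.0/gems/wayback_machine_downloader-2.3.1"

replacement_targets(gem_dir).each do |file, dir|
  path = File.join(dir, file)
  status = File.exist?(path) ? "present" : "missing"
  puts "#{file}: #{path} (#{status})"
end
```

This only checks that a file exists at each path. It cannot tell whether the file there is the old copy or the updated one from #280.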
Hi everyone, I'll start by saying that I'm new and that I don't speak Ruby's language but I was constantly using it to download information from now dead sites via wayback machine until a few months ago. I learned about this method through the following guide: https://pianoweb.eu/scaricare-copie-siti-scaduti-wayback-machine/.
For a few months now I haven't been able to make it work anymore and my limited knowledge doesn't help me. Obviously I tried to implement your instructions above but without success.
Can I kindly ask you to detail the steps to be taken so that I can go back to working with Ruby and the Wayback Machine? Many thanks.
I've clarified my instructions in this comment: https://github.com/hartator/wayback-machine-downloader/issues/291#issuecomment-2127830210
It involves downloading and replacing two files, very easy. If you get stuck you need to tell us where you are getting stuck or we cannot help. But at this point it is as easy as can be @Bissnet
@gingerbeardman I installed Ruby+Devkit 3.3.3-1 (x64) from the site shown in the image. I then tried to replace your files but I can't find the last folder you indicate... I only have this situation C:\Ruby33-x64\lib\ruby\gems\3.3.0\gems....
If you haven't got the two files to replace then you haven't installed wayback-machine-downloader!
@gingerbeardman
Thank you. Now I have followed all the steps but it still doesn't work as shown in the image below...
@Bissnet did your system ask you to confirm that you were replacing/overwriting each of the two files?
It seems you have two copies of each file side by side? You should replace the existing files.
@gingerbeardman I followed all the steps (overwriting the files) and now I have single files but it still doesn't work.
@Bissnet these still seem to be the errors for the old files?
new files should be as follows:
I just tried it as follows
% wayback_machine_downloader www.photonsolar.be
Downloading www.photonsolar.be to websites/www.photonsolar.be/ from Wayback Machine archives.
Getting snapshot pages..................................................................................................... found 88999 snaphots to consider.
508 files to download:
https://www.photonsolar.be/img/favicon.ico?1643288163 # websites/www.photonsolar.be/img/favicon.ico?1643288163 already exists. (1/508)
...
http://www.photonsolar.be:80/ondul.html -> websites/www.photonsolar.be/ondul.html (508/508)
Download completed in 740.43s, saved in websites/www.photonsolar.be/ (508 files)
sorry I can't really be of more help here
Ruby works... thanks @gingerbeardman! It's slow to start and sometimes it doesn't start on the first try, but after trying again it seems to work for the moment!
Now, I would like to move the download folder to the desktop to avoid problems with files with long names. I read the description "-d, --directory PATH Directory to save the downloaded files into Default is ./websites/ plus the domain name" but it didn't work and in fact I lost the previous destination folder: 1) how should I write the string? 2) do I have to write it every time, or once set do I go back to writing the classic string to be able to download the material?
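A rough sketch of how the option could be passed, going only by the help text quoted above (`-d, --directory PATH`). The `download_command` helper and the Desktop target folder are made up for illustration. Nothing in the quoted help text suggests the setting is remembered between runs, so this assumes the option has to be given on every run.

```ruby
# Sketch: build the command line with an explicit download directory.
# `download_command` is a hypothetical helper, not part of the gem.
def download_command(url, directory)
  ["wayback_machine_downloader", url, "--directory", directory]
end

# Assumed target: a folder on the current user's Desktop.
target = File.join(Dir.home, "Desktop", "websites", "www.photonsolar.be")
command = download_command("www.photonsolar.be", target)

puts command.join(" ")
# system(*command)  # uncomment to actually run the download
```

Passing the arguments to `system` as separate items avoids quoting problems when the path contains spaces.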
My method of repair is perhaps much easier as it only requires easy file download/copy/paste:
- install `wayback_machine_downloader` the recommended way (follow the Installation instructions in the README)
- replace the local versions of the following two files with updated copies from #280: right click the links below and save as, then replace the existing files at the paths shown
  - wayback_machine_downloader.rb at `~/.gem/ruby/3.3.0/gems/wayback_machine_downloader-2.3.1/lib/`
  - archive_api.rb at `~/.gem/ruby/3.3.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader/`
- don't update `wayback_machine_downloader` using gem or it will break again

Notes:
- the version of ruby you have installed might be 3.3.0, or 3.3.1, etc.
- these instructions are for macOS, you'll need to adapt for Linux/Windows
I get the same error after 10 minutes of parsing. Can I increase the time?
IA is throttling indiscriminately, including use of their own tool.
The only thing you can do is retry, already downloaded files should be skipped.
Or you can add delays into the script but this will mean the download process takes much much longer
Nothing was downloaded. But can you point out where in the code I can increase the delays? Thanks.
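As a rough idea of what "retry with delays" could look like, here is a sketch of a wrapper that retries on the connection errors seen in this thread and waits longer after each failure. `with_retries`, `max_attempts`, `base_delay` and `sleeper` are hypothetical names, not the gem's API. The traces above show the failing request coming from `get_raw_list_from_api` in archive_api.rb, so a wrapper like this would go around a call of that kind, but exactly where to put it is an assumption.

```ruby
require "net/http"

# Errors reported in this thread when web.archive.org refuses or drops connections.
RETRYABLE_ERRORS = [Errno::ECONNREFUSED, Errno::ETIMEDOUT, Net::OpenTimeout].freeze

# Sketch only: retry the block, waiting base_delay * attempt seconds
# between tries. All names here are made up for illustration.
def with_retries(max_attempts: 5, base_delay: 10, sleeper: ->(s) { sleep(s) })
  attempt = 0
  begin
    attempt += 1
    yield
  rescue *RETRYABLE_ERRORS
    # Give up and re-raise once the attempt budget is used.
    raise if attempt >= max_attempts
    sleeper.call(base_delay * attempt)
    retry
  end
end

# Simulated request that is refused twice and then succeeds.
# The sleeper is swapped for a recorder so nothing actually waits.
waits = []
calls = 0
value = with_retries(base_delay: 10, sleeper: ->(s) { waits << s }) do
  calls += 1
  raise Errno::ECONNREFUSED if calls < 3
  :ok
end
```

As noted above, longer delays mean the whole download takes much longer, and they will not help while archive.org is blocking the connection outright.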
Help me please. I looked through it but I can’t understand anything. I downloaded it two months ago, everything worked. Now after the request it shows the following error: What should I do to download the site? I've been trying for a week now.
Getting snapshot pages.C:/Ruby33-x64/lib/ruby/3.3.0/net/http.rb:1603:in'
initialize': Failed to open TCP connection to web.archive.org:443 (No connection could be made because the target machine actively refused it. - connect(2) for "web.archive.org" port 443) (Errno::ECONNREFUSED) from C:/Ruby33-x64/lib/ruby/3.3.0/net/http.rb:1603:in
open' from C:/Ruby33-x64/lib/ruby/3.3.0/net/http.rb:1603:inblock in connect' from C:/Ruby33-x64/lib/ruby/3.3.0/timeout.rb:186:in
block in timeout' from C:/Ruby33-x64/lib/ruby/3.3.0/timeout.rb:193:intimeout' from C:/Ruby33-x64/lib/ruby/3.3.0/net/http.rb:1601:in
connect' from C:/Ruby33-x64/lib/ruby/3.3.0/net/http.rb:1580:indo_start' from C:/Ruby33-x64/lib/ruby/3.3.0/net/http.rb:1569:in
start' from C:/Ruby33-x64/lib/ruby/3.3.0/open-uri.rb:334:inopen_http' from C:/Ruby33-x64/lib/ruby/3.3.0/open-uri.rb:770:in
buffer_open' from C:/Ruby33-x64/lib/ruby/3.3.0/open-uri.rb:220:inblock in open_loop' from C:/Ruby33-x64/lib/ruby/3.3.0/open-uri.rb:218:in
catch' from C:/Ruby33-x64/lib/ruby/3.3.0/open-uri.rb:218:inopen_loop' from C:/Ruby33-x64/lib/ruby/3.3.0/open-uri.rb:158:in
open_uri' from C:/Ruby33-x64/lib/ruby/3.3.0/open-uri.rb:750:inopen' from C:/Users/User/.local/share/gem/ruby/3.3.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader/archive_api.rb:13:in
get_raw_list_from_api' from C:/Users/User/.local/share/gem/ruby/3.3.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:92:inblock in get_all_snapshots_to_consider' from <internal:numeric>:237:in
times' from C:/Users/User/.local/share/gem/ruby/3.3.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:91:inget_all_snapshots_to_consider' from C:/Users/User/.local/share/gem/ruby/3.3.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:105:in
get_file_list_curated' from C:/Users/User/.local/share/gem/ruby/3.3.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:164:inget_file_list_by_timestamp' from C:/Users/User/.local/share/gem/ruby/3.3.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:309:in
file_list_by_timestamp' from C:/Users/User/.local/share/gem/ruby/3.3.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:192:indownload_files' from C:/Users/User/.local/share/gem/ruby/3.3.0/gems/wayback_machine_downloader-2.3.1/bin/wayback_machine_downloader:72:in
<top (required)>' from C:/Ruby33-x64/bin/wayback_machine_downloader:32:inload' from C:/Ruby33-x64/bin/wayback_machine_downloader:32:in