Closed SGudbrandsson closed 8 years ago
That's a bit odd. It should actually rewrite the duplicate file into a directory and name it index.html.
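A minimal Ruby sketch of that intended behavior, assuming the conflict is a regular file sitting where a directory is needed. The helper name `promote_file_to_directory` and the `.temp` suffix are hypothetical and not taken from the gem's source:

```ruby
require 'fileutils'

# Hypothetical helper (not the gem's actual code): if `path` is a regular
# file, turn it into a directory and keep the old contents as index.html.
def promote_file_to_directory(path)
  return unless File.file?(path)

  temporary = "#{path}.temp"
  FileUtils.mv(path, temporary)                          # move the file out of the way
  FileUtils.mkdir_p(path)                                # create the directory in its place
  FileUtils.mv(temporary, File.join(path, 'index.html')) # old contents become index.html
end
```

Calling it on a path that is already a directory, or that does not exist, does nothing.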
Do you mind sharing the problematic website with me? That way I can try to replicate the issue on my own computer.
My email: replace_by_github_username@gmail.com
Do you mind sending me the non-redacted backtrace as well?
Just copy and paste it without any editing; I am not able to reproduce the problem.
http://REDACTED.com/uncategorized/this-weeks-goals/ # websites/REDACTED.com/uncategorized/this-weeks-goals/index.html already exists. (4048/48177)
http://www.REDACTED.com/79/affiliate-internet-marketing-campaign-kicks-off-great-bonuses/ # websites/REDACTED.com/79/affiliate-internet-marketing-campaign-kicks-off-great-bonuses/index.html already exists. (4049/48177)
http://www.REDACTED.com/78/at-last-a-bloggers-path-to-making-internet-marketing-money/ # websites/REDACTED.com/78/at-last-a-bloggers-path-to-making-internet-marketing-money/index.html already exists. (4050/48177)
http://www.REDACTED.com/72/ewen-chia-the-internet-marketing-and-affiliate-marketing-guru/ # websites/REDACTED.com/72/ewen-chia-the-internet-marketing-and-affiliate-marketing-guru/index.html already exists. (4051/48177)
http://REDACTED.com/uncategorized/reflective-thoughts-on-marriage/ # websites/REDACTED.com/uncategorized/reflective-thoughts-on-marriage/index.html already exists. (4052/48177)
http://REDACTED.com/uncategorized/doing-the-important-stuff/ # websites/REDACTED.com/uncategorized/doing-the-important-stuff/index.html already exists. (4053/48177)
File exists - websites/REDACTED.com/www.REDACTED2.com
/usr/lib/ruby/1.9.1/fileutils.rb:1515:in `stat': No such file or directory - File exists - websites/REDACTED.com/www.REDACTED2.com (Errno::ENOENT)
from /usr/lib/ruby/1.9.1/fileutils.rb:1515:in `block in fu_each_src_dest'
from /usr/lib/ruby/1.9.1/fileutils.rb:1531:in `fu_each_src_dest0'
from /usr/lib/ruby/1.9.1/fileutils.rb:1513:in `fu_each_src_dest'
from /usr/lib/ruby/1.9.1/fileutils.rb:508:in `mv'
from /var/lib/gems/1.9.1/gems/wayback_machine_downloader-0.1.15/lib/wayback_machine_downloader.rb:116:in `rescue in structure_dir_path'
from /var/lib/gems/1.9.1/gems/wayback_machine_downloader-0.1.15/lib/wayback_machine_downloader.rb:109:in `structure_dir_path'
from /var/lib/gems/1.9.1/gems/wayback_machine_downloader-0.1.15/lib/wayback_machine_downloader.rb:83:in `block in download_files'
from /var/lib/gems/1.9.1/gems/wayback_machine_downloader-0.1.15/lib/wayback_machine_downloader.rb:66:in `each'
from /var/lib/gems/1.9.1/gems/wayback_machine_downloader-0.1.15/lib/wayback_machine_downloader.rb:66:in `download_files'
from /var/lib/gems/1.9.1/gems/wayback_machine_downloader-0.1.15/bin/wayback_machine_downloader:27:in `<top (required)>'
from /usr/local/bin/wayback_machine_downloader:23:in `load'
from /usr/local/bin/wayback_machine_downloader:23:in `<main>'
I tried restarting the process as you mentioned in another thread, but I got the same output as before.
The server and software information:
ubuntu@ip-172-30-0-198:~$ wayback_machine_downloader -v
0.1.15
ubuntu@ip-172-30-0-198:~$ ruby -v
ruby 1.9.3p484 (2013-11-22 revision 43786) [x86_64-linux]
ubuntu@ip-172-30-0-198:~$ uname -a
Linux ip-172-30-0-198 3.13.0-48-generic #80-Ubuntu SMP Thu Mar 12 11:16:15 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
ubuntu@ip-172-30-0-198:~$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=14.04
DISTRIB_CODENAME=trusty
DISTRIB_DESCRIPTION="Ubuntu 14.04.2 LTS"
All the best, Siggy
I managed to fix the previous error by creating the folder by hand; however, I hit a bug when I continued on to robots.txt:
http://www.REDACTED.com/tag/barack-obama/ # websites/REDACTED.com/tag/barack-obama/index.html already exists. (9743/48177)
/usr/lib/ruby/1.9.1/fileutils.rb:1515:in `stat': No such file or directory
from /usr/lib/ruby/1.9.1/fileutils.rb:1515:in `block in fu_each_src_dest'
from /usr/lib/ruby/1.9.1/fileutils.rb:1531:in `fu_each_src_dest0'
from /usr/lib/ruby/1.9.1/fileutils.rb:1513:in `fu_each_src_dest'
from /usr/lib/ruby/1.9.1/fileutils.rb:508:in `mv'
from /var/lib/gems/1.9.1/gems/wayback_machine_downloader-0.1.15/lib/wayback_machine_downloader.rb:116:in `rescue in structure_dir_path'
from /var/lib/gems/1.9.1/gems/wayback_machine_downloader-0.1.15/lib/wayback_machine_downloader.rb:109:in `structure_dir_path'
from /var/lib/gems/1.9.1/gems/wayback_machine_downloader-0.1.15/lib/wayback_machine_downloader.rb:83:in `block in download_files'
from /var/lib/gems/1.9.1/gems/wayback_machine_downloader-0.1.15/lib/wayback_machine_downloader.rb:66:in `each'
from /var/lib/gems/1.9.1/gems/wayback_machine_downloader-0.1.15/lib/wayback_machine_downloader.rb:66:in `download_files'
from /var/lib/gems/1.9.1/gems/wayback_machine_downloader-0.1.15/bin/wayback_machine_downloader:27:in `<top (required)>'
from /usr/local/bin/wayback_machine_downloader:23:in `load'
from /usr/local/bin/wayback_machine_downloader:23:in `<main>'
Found the offending code and fixed it, at least in my case (you might have to add some if/then statements to parse the input string correctly): https://github.com/hartator/wayback-machine-downloader/pull/11
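For illustration only, and not the code from the pull request: the backtraces in this thread show `FileUtils.mv` being handed the whole string "File exists - websites/..." as a path, which suggests the path was being extracted from the exception message. Here is a hedged Ruby sketch of more tolerant parsing. The helper name is hypothetical, and the second message format ("File exists @ dir_s_mkdir - PATH") is an assumption about newer Ruby versions:

```ruby
# Hypothetical helper: pull the path out of an Errno::EEXIST message.
# Assumed formats:
#   "File exists - PATH"                (as in the Ruby 1.9 traces in this thread)
#   "File exists @ dir_s_mkdir - PATH"  (assumed format on newer Rubies)
# Returns nil when the message matches neither format.
def path_from_eexist_message(message)
  match = message.match(/\AFile exists(?: @ \S+)? - (.+)\z/m)
  match && match[1]
end
```

Parsing exception messages is fragile by nature; checking the filesystem directly for the conflicting file avoids depending on the message format at all.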
Hey there,
Nice piece of software!! :)
I found a bug though. When downloading item 4053, the file already existed as a single file, thus a folder could not be created.
Here's the error:
http://REDACTED.com/uncategorized/reflective-thoughts-on-marriage/ -> websites/REDACTED.com/uncategorized/reflective-thoughts-on-marriage/index.html (4052/48177)
http://REDACTED.com/uncategorized/doing-the-important-stuff/ -> websites/REDACTED.com/uncategorized/doing-the-important-stuff/index.html (4053/48177)
File exists - websites/REDACTED.com/www.REDACTED2.com
/usr/lib/ruby/1.9.1/fileutils.rb:1515:in `stat': No such file or directory - File exists - websites/REDACTED.com/www.REDACTED2.com (Errno::ENOENT)
from /usr/lib/ruby/1.9.1/fileutils.rb:1515:in `block in fu_each_src_dest'
from /usr/lib/ruby/1.9.1/fileutils.rb:1531:in `fu_each_src_dest0'
from /usr/lib/ruby/1.9.1/fileutils.rb:1513:in `fu_each_src_dest'
from /usr/lib/ruby/1.9.1/fileutils.rb:508:in `mv'
from /var/lib/gems/1.9.1/gems/wayback_machine_downloader-0.1.15/lib/wayback_machine_downloader.rb:116:in `rescue in structure_dir_path'
from /var/lib/gems/1.9.1/gems/wayback_machine_downloader-0.1.15/lib/wayback_machine_downloader.rb:109:in `structure_dir_path'
from /var/lib/gems/1.9.1/gems/wayback_machine_downloader-0.1.15/lib/wayback_machine_downloader.rb:83:in `block in download_files'
from /var/lib/gems/1.9.1/gems/wayback_machine_downloader-0.1.15/lib/wayback_machine_downloader.rb:66:in `each'
from /var/lib/gems/1.9.1/gems/wayback_machine_downloader-0.1.15/lib/wayback_machine_downloader.rb:66:in `download_files'
from /var/lib/gems/1.9.1/gems/wayback_machine_downloader-0.1.15/bin/wayback_machine_downloader:27:in `<top (required)>'
from /usr/local/bin/wayback_machine_downloader:23:in `load'
from /usr/local/bin/wayback_machine_downloader:23:in `<main>'
Here's an ls of the file:
ubuntu@ip-172-30-0-198:~$ ll websites/REDACTED.com/www.REDACTED2.com
-rw-rw-r-- 1 ubuntu ubuntu 27286 Sep 4 07:44 websites/REDACTED.com/www.REDACTED2.com
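The listing shows a regular file where the downloader wants a directory. A small Ruby sketch of how one might locate such a blocking file before creating directories, without relying on error messages; the helper name `blocking_file` is hypothetical and this is not part of the gem:

```ruby
require 'pathname'

# Hypothetical helper: return the first component of `dir_path` (walking
# from the top down, including `dir_path` itself) that exists as a regular
# file and would therefore block `mkdir -p`. Returns nil if nothing blocks.
def blocking_file(dir_path)
  Pathname.new(dir_path).descend do |component|
    return component.to_s if component.file?
  end
  nil
end
```

For the case above, `blocking_file` would be expected to return `websites/REDACTED.com/www.REDACTED2.com` when asked about any directory below that path.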