hartator / wayback-machine-downloader

Download an entire website from the Wayback Machine.

Invalid characters under Windows #78

Open MSSnacks opened 7 years ago

MSSnacks commented 7 years ago

Files with question marks and similar characters in their URLs (for example, website.com/index.php?/directory/etc.) don't download on Windows. From a cursory inspection, that appears to be because the code only escapes invalid characters in file_path and not in dir_path.

polepoe commented 7 years ago

Thanks for your reply. It might solve my problem.

sergeyganago commented 7 years ago

I have the same problem. It makes the tool useless under Windows :(

sergeyganago commented 7 years ago

Seems like this edit of wayback_machine_downloader.rb fixed the problem:

After this line:

    file_path = file_path.gsub(/[:*?&=<>\\|]/) {|s| '%' + s.ord.to_s(16) }

add:

    dir_path = dir_path.gsub(/[:*?&=<>\\|]/) {|s| '%' + s.ord.to_s(16) }

iagovar commented 3 years ago

Not working. I'm using Windows with Ruby 2.7.0.

Here's an example of output:

https://www.domain.com/newreply.php?tid=66077&replyto=1694445 -> D%3a%5cdomain/newreply.php%3ftid%3d66077%26replyto%3d1694445 (316/1729168)

Current code

    if Gem.win_platform?
      dir_path = dir_path.gsub(/[:*?&=<>\\|]/) {|s| '%' + s.ord.to_s(16) }
      file_path = file_path.gsub(/[:*?&=<>\\|]/) {|s| '%' + s.ord.to_s(16) }
    end

Any suggestions? I really need to point the --directory argument at another drive; that's the most pressing issue for me.
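
Judging from the output above, the gsub on dir_path also escapes the drive prefix: "D:\" becomes "D%3a%5c", so the target is no longer on drive D. One possible workaround, offered only as a rough sketch and not as the gem's actual code, is to leave a leading drive prefix alone and escape each remaining path component separately. The names WINDOWS_INVALID, escape_component, and escape_windows_path are hypothetical helpers made up for this illustration. The sketch also assumes that backslashes after the drive prefix are directory separators rather than characters to escape, which differs from the current behavior.

```ruby
# Characters the existing code escapes on Windows (same character class as above).
WINDOWS_INVALID = /[:*?&=<>\\|]/

# Hypothetical helper: percent-encode the invalid characters in one path component.
def escape_component(part)
  part.gsub(WINDOWS_INVALID) { |s| '%' + s.ord.to_s(16) }
end

# Hypothetical helper: escape a path but keep a leading drive prefix
# such as "D:\" or "D:/" untouched.
def escape_windows_path(path)
  # Match an optional drive letter, colon, and one separator at the start.
  drive = path[/\A[A-Za-z]:[\\\/]?/] || ''
  rest = path[drive.length..-1]
  # Assumption: both "/" and "\" separate components in the remainder.
  parts = rest.split(/[\\\/]/, -1)
  drive + parts.map { |part| escape_component(part) }.join('/')
end

puts escape_windows_path('D:\\domain/newreply.php?tid=66077&replyto=1694445')
# prints D:\domain/newreply.php%3ftid%3d66077%26replyto%3d1694445
```

If something like this were used, it would stand in for the two gsub calls inside the Gem.win_platform? branch. Whether the rest of the downloader copes with an unescaped drive prefix is untested here.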