up209d / ResourcesSaverExt

Chrome Extension for one click downloading all resources files and keeping folder structures.
GNU General Public License v3.0
1.65k stars, 337 forks

Html file empty #26

Closed: david-littlefield closed this issue 5 years ago

david-littlefield commented 5 years ago

Wow, I've been searching thoroughly for a complete html website downloader, and your repo is the closest thing I've seen. The simplicity, depth of files, and folder structure are incredible! Nice work!

Problem: The downloaded html file is practically empty. I tried downloading it several times - same outcome. And there didn't seem to be a way to download only the html file. I saved the outer html from the developer tools, but the links are divorced from the downloaded resources. =(

[Screenshot: Screen Shot 2019-06-11 at 6 49 51 PM]

Ideal outcome: The html file downloaded would include the complete outer html, and updated links connected to the downloaded files and folder structure. That way, the local resources could load instantly.
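
A rough sketch of what the link rewriting described above could look like. It assumes, hypothetically, that each resource was saved under a `<host>/<path>` folder layout and that the rewritten html file sits at the root of the download folder; the extension's real layout may differ. The function names `urlToLocalPath` and `rewriteLinks` are made up for illustration, and the regex approach is deliberately naive (it drops query strings and does not distinguish page links from resource links).

```javascript
// Convert a resource URL into a relative local path such as
// "pbs.twimg.com/media/a.jpg", or return null when the value should
// be left untouched (in-page anchors, data: URLs, unparsable values).
// ASSUMPTION: files were saved as <host>/<path> under one folder.
function urlToLocalPath(resourceUrl, pageUrl) {
  if (resourceUrl.startsWith("#")) return null;
  let url;
  try {
    url = new URL(resourceUrl, pageUrl);
  } catch (err) {
    return null;
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return null;
  let pathname = url.pathname;
  if (pathname.endsWith("/")) pathname += "index.html";
  return url.hostname + pathname;
}

// Rewrite the values of src="..." and href="..." attributes in an
// HTML string so they point at the local copies.
function rewriteLinks(html, pageUrl) {
  return html.replace(
    /\b(src|href)=(["'])(.*?)\2/gi,
    (match, attr, quote, value) => {
      const localPath = urlToLocalPath(value, pageUrl);
      return localPath ? `${attr}=${quote}${localPath}${quote}` : match;
    }
  );
}
```

For example, calling `rewriteLinks(outerHtml, "https://twitter.com/user")` on the outer html saved from the developer tools would turn `/css/main.css` into `twitter.com/css/main.css`. A real implementation would more likely parse the DOM than use a regex.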

Extra Information: Awesome work! Also, I'm not sure if that ideal outcome matches your existing use case, but I would love to know more about your intended use, as well as the future direction of this repo! =]

up209d commented 5 years ago

@captaindavepdx Hey mate, thanks for using the extension. I thought that I had fixed this issue; could you provide the version you are using? 0.1.8? It is a little bit hard to debug without a real test case. Here is what I downloaded from Donald Trump's Twitter page; I think the HTML file is pretty fine.

[Screenshot: Screen Shot 2019-06-12 at 8 25 09 pm]

david-littlefield commented 5 years ago

@up209d, Hey man, thanks for the quick reply. Yes, I'm using version 0.1.8. I forgot to mention that I'm calling a custom script to scroll to the absolute bottom of the page. Could that make the difference?

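var start = false;      // guards against starting the scroll loop twice
var running = false;    // true while a scroll step is pending
var lastScrollY = 0;    // scroll position recorded on the previous step

// Scroll down in steps until the position stops changing for
// 10 consecutive attempts, which is treated as the bottom of the page.
function scroll(retryAttempt) {
    if (running === false && retryAttempt < 10) {
        running = true;

        window.scrollBy(0, 500);

        if (window.scrollY !== lastScrollY) {
            // The page moved, so more content may still load: reset the retry count.
            lastScrollY = window.scrollY;
            window.scrollBy(0, 1000);
            setTimeout(function() {
                running = false;
                scroll(0);
            }, 1000);
        } else {
            // No movement: wait a second for lazy content, then retry.
            setTimeout(function() {
                running = false;
                scroll(retryAttempt + 1);
            }, 1000);
        }
    } else {
        // Signal completion, as in the original script. Note that
        // window.postMessage normally expects a message argument.
        postMessage();
    }
}

if (start === false) {
    start = true;
    scroll(0);
}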

Update: I tested UP without any custom scripts, and the html file downloaded successfully. Again, great work!

Question: Is it possible to link the html file to your downloaded resources? Including videos?

up209d commented 5 years ago

@captaindavepdx Ah, I get what you mean now. That is more complicated to do, which is why I describe the extension as a resource-downloading tool and not a website-downloading tool. Its purpose is naive and simple: greedily get everything from the source 1 to 1, without any modification. Rewriting the html content to serve everything locally is much harder than what the extension is capable of at the moment. Maybe you can try running an http-server on the local folders and mapping localhost to the desired domain, e.g. twitter.com. But again, it is not quite a sweet solution.
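
To make the local-server idea concrete, here is a minimal sketch of a static file server using only Node.js built-in modules. The folder path `./downloaded/twitter.com`, the port, and the helper names `resolveLocalPath` and `createStaticServer` are assumptions for illustration, not part of the extension. Mapping the domain to localhost would be a separate step (typically a hosts-file entry), and a site that is normally loaded over HTTPS would need more than this plain HTTP sketch.

```javascript
const http = require("http");
const fs = require("fs");
const path = require("path");

// Hypothetical folder holding the files saved by the extension.
const ROOT_DIR = path.resolve("./downloaded/twitter.com");

// Small, incomplete MIME table; unknown types fall back to binary.
const MIME_TYPES = {
  ".html": "text/html; charset=utf-8",
  ".css": "text/css",
  ".js": "application/javascript",
  ".json": "application/json",
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".svg": "image/svg+xml",
  ".mp4": "video/mp4",
};

// Map a request URL to a file under rootDir. Returns null when the
// URL is malformed or the path would escape rootDir (e.g. via "..").
function resolveLocalPath(rootDir, requestUrl) {
  let pathname;
  try {
    pathname = decodeURIComponent(new URL(requestUrl, "http://localhost").pathname);
  } catch (err) {
    return null;
  }
  if (pathname.endsWith("/")) pathname += "index.html";
  const resolved = path.resolve(rootDir, "." + pathname);
  const inside = resolved === rootDir || resolved.startsWith(rootDir + path.sep);
  return inside ? resolved : null;
}

// Build (but do not start) a server that serves files from rootDir.
function createStaticServer(rootDir) {
  return http.createServer((req, res) => {
    const filePath = resolveLocalPath(rootDir, req.url);
    if (!filePath) {
      res.writeHead(403);
      res.end("Forbidden");
      return;
    }
    fs.readFile(filePath, (err, data) => {
      if (err) {
        res.writeHead(404);
        res.end("Not found");
        return;
      }
      const ext = path.extname(filePath).toLowerCase();
      res.writeHead(200, { "Content-Type": MIME_TYPES[ext] || "application/octet-stream" });
      res.end(data);
    });
  });
}

// Usage (not started automatically):
// createStaticServer(ROOT_DIR).listen(8080);
```

Query strings are ignored when looking up files, so `/css/a.css?v=1` would be served from `css/a.css`; whether that matches how the downloaded files are named is something to check against the actual folder.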

david-littlefield commented 5 years ago

Right on, good to know, thanks @up209d!

Would an html file with a large scrollHeight still be within the scope of your extension?

Update: I've started to piece together a website downloader. If I figure out a simple way to connect the links, I'll post the solution. I think it'd be awesome if your extension could do all of that!

up209d commented 5 years ago

@captaindavepdx In terms of a long scrollHeight, I think you are doing it correctly: all asset downloads are triggered by the browser itself, so we have to scroll down to make sure the browser loads those hidden assets.

david-littlefield commented 5 years ago

@up209d Right on, I'll post progress updates regarding the connected links. If you have any suggestions, let me know.