gildas-lormeau / single-file-cli

CLI tool for saving a faithful copy of a complete web page in a single HTML file (based on SingleFile)
GNU Affero General Public License v3.0

Saved page missing images using CLI, while fine with browser extension #39

Open beefdrifter opened 3 years ago

beefdrifter commented 3 years ago

Describe the bug

Page: https://blogs.cisco.com/datacenter/the-napkins-dialogues-life-of-a-packet-walk-part-1

When the page above is saved with the browser extension, the saved page is accurate. However, when it is saved with the CLI, the images are missing.

To Reproduce

Steps to reproduce the behavior:

  1. Save the page with the browser extension
  2. The saved page is around 2 MB and displays the images accurately
  3. Save the page with the CLI
  4. The saved page is around 350 KB and is missing the images

Expected behavior

CLI and browser extension results should be the same (at least I hope).

gildas-lormeau commented 3 years ago

I cannot reproduce the issue. On my end, the page saved with SingleFile CLI weighs about 2.3 MB. Did you do the test with the default settings? It looks like your issue is related to --load-deferred- options or maybe --save-raw-page.

beefdrifter commented 3 years ago

I think I have narrowed down the possibilities, but I'll answer your questions first.

> Did you do the test with the default settings?

Yes. With the default settings, the page is saved properly.

> It looks like your issue is related to --load-deferred- options or maybe --save-raw-page.

In my settings, --load-deferred- is set to True and --save-raw-page is set to False. I'll include a screenshot below.

[Screenshot: 2021_0203_0316]

I added my settings to the default args.js one parameter at a time. It turns out the issue reappears when I use my Chrome profile. At first I thought it had to do with my extensions, but after removing all of them, the problem persists. That's as far as I managed to test.

gildas-lormeau commented 3 years ago

Can you try to set load-deferred-images-keep-zoom-level to true and check if it helps to fix this issue?

beefdrifter commented 3 years ago

Unfortunately no. Same results.

takingurstuff commented 2 months ago

I'll leave a tip here. Newer versions have an option called browser-wait-until. It defaults to DOMContentLoaded, which does not wait for images or stylesheets to load. Changing the value to load makes the CLI wait for the load event, so any images and stylesheets are loaded before the page is saved.
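
For anyone who wants to try this, here is a minimal sketch of what the command might look like. The option name and the load value are taken from the comment above. The output file name is made up for illustration, and the --option=value spelling is an assumption, so check single-file --help for your installed version. The script only prints the command so it can be reviewed before it is run.

```shell
#!/bin/sh
# Page from the original report
url="https://blogs.cisco.com/datacenter/the-napkins-dialogues-life-of-a-packet-walk-part-1"

# Hypothetical output file name, chosen only for this example
out="napkins-part-1.html"

# Assumption: option name and value as described in the tip above;
# "load" should make the CLI wait for the page's load event before saving
cmd="single-file $url $out --browser-wait-until=load"

# Print the command instead of running it, so it can be checked first
echo "$cmd"
```

If the resulting file is close to the size the browser extension produces (around 2 MB in the original report) rather than around 350 KB, the images were most likely included.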