Y2Z / monolith

⬛️ CLI tool for saving complete web pages as a single HTML file
https://crates.io/crates/monolith
Creative Commons Zero v1.0 Universal

Image not saved correctly #339

Open argium opened 1 year ago

argium commented 1 year ago

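Monolith does not correctly save some images from this URL. The image's URL still contains the literal placeholder {width} (percent-encoded as %7Bwidth%7D) instead of a concrete width such as 900, so the request returns a 404. Does the page need to be piped from headless Chromium?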

URL: https://meatsmith.com.au/blogs/recipes/a-comforting-beef-stroganoff

Command

monolith.exe -I -o "Meatsmith - A COMFORTING BEEF STROGANOFF.html" https://meatsmith.com.au/blogs/recipes/a-comforting-beef-stroganoff

Expected

https://cdn.shopify.com/s/files/1/1075/3026/articles/2023_Meatsmith_Parker_Blain_STROGANOFF_LR18_900x.jpg?v=1687835702


Actual

https://cdn.shopify.com/s/files/1/1075/3026/articles/2023_Meatsmith_Parker_Blain_STROGANOFF_LR18_%7Bwidth%7Dx.jpg?v=1687835702 (404 Not Found)
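
To make the difference between the two URLs concrete, here is a minimal Python sketch of the substitution that apparently never happens when the page is saved without running its scripts. The function name `fill_width_placeholder` is hypothetical and not part of monolith, and the width 900 is simply taken from the expected URL above.

```python
def fill_width_placeholder(url: str, width: int) -> str:
    """Replace a literal or percent-encoded {width} placeholder with a number."""
    # Normalize the percent-encoded form (%7Bwidth%7D) to the literal form first.
    normalized = url.replace("%7Bwidth%7D", "{width}").replace("%7bwidth%7d", "{width}")
    # Substitute the concrete width, as the page's own script would presumably do.
    return normalized.replace("{width}", str(width))


actual = (
    "https://cdn.shopify.com/s/files/1/1075/3026/articles/"
    "2023_Meatsmith_Parker_Blain_STROGANOFF_LR18_%7Bwidth%7Dx.jpg?v=1687835702"
)
print(fill_width_placeholder(actual, 900))
```

With width 900 this prints the URL listed under Expected. It is only an illustration of the mismatch, not a proposed fix for monolith itself.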


Output:

 https://cdn.shopify.com/s/files/1/1075/3026/t/91/assets/jquery-1.11.0.min.js?v=74721525869110791951682340175
 https://cdn.shopify.com/s/files/1/1075/3026/t/91/assets/jquery-migrate-1.2.1.min.js?v=163044760040938828711682340175
 https://cdn.shopify.com/s/files/1/1075/3026/t/91/assets/slick.min.js?v=71779134894361685811682340175
 https://cdn.shopify.com/s/files/1/1075/3026/files/meatsmith-favicon_9440056a-6ef4-4bb3-a3bf-6f364ba2d3b5_32x.png?v=1656473257
 https://cdn.shopify.com/s/files/1/1075/3026/t/91/assets/theme--async.css?v=78274724801891975391682340175
 https://cdn.shopify.com/s/files/1/1075/3026/t/91/assets/theme--critical.css?v=25008363659742577551682340175
 https://cdn.shopify.com/s/files/1/1075/3026/t/91/assets/theme--async.css?v=78274724801891975391682340175 (from cache)
 https://cdn.shopify.com/s/files/1/1075/3026/t/91/assets/CalibreWeb-Regular.eot?v=136609429431039942021682340175
 https://cdn.shopify.com/s/files/1/1075/3026/t/91/assets/CalibreWeb-Regular.woff2?v=73836756305408748181682340175
 https://cdn.shopify.com/s/files/1/1075/3026/t/91/assets/CalibreWeb-Regular.woff?v=141770682351173391201682340175
 https://cdn.shopify.com/s/files/1/1075/3026/t/91/assets/CalibreWeb-Semibold.eot?v=9448648378516762051682340175
 https://cdn.shopify.com/s/files/1/1075/3026/t/91/assets/CalibreWeb-Semibold.woff?v=166662711615466392821682340175
 https://cdn.shopify.com/s/files/1/1075/3026/t/91/assets/PitchWeb-Bold.eot?v=91764299517701957261682340175
 https://cdn.shopify.com/s/files/1/1075/3026/t/91/assets/PitchWeb-Bold.woff?v=80838060911856203141682340175
 https://cdn.shopify.com/s/files/1/1075/3026/t/91/assets/ClearfaceStdRegular.eot?v=142477754856511909971682340175
 https://cdn.shopify.com/s/files/1/1075/3026/t/91/assets/ClearfaceStdRegular.woff2?v=80156754253652419181682340175
 https://cdn.shopify.com/s/files/1/1075/3026/t/91/assets/ClearfaceStdRegular.woff?v=80733345188957752861682340175
 https://cdn.shopify.com/s/files/1/1075/3026/t/91/assets/lazysizes.min.js?v=153528224177489928921682340175
 https://cdn.shopify.com/s/files/1/1075/3026/t/91/assets/theme.js?v=12385560275846379891682340175
 https://cdn.shopify.com/s/files/1/1075/3026/t/91/assets/product-price.js?v=19449638858902942161682340175
 https://cdn.shopify.com/s/files/1/1075/3026/t/91/assets/product-options.js?v=18580892987082915751682340175
 https://cdn.shopify.com/s/files/1/1075/3026/t/91/assets/product-buy-buttons.js?v=164148370001968371021682340175
 https://cdn.shopify.com/shopifycloud/shopify/assets/storefront/load_feature-3f13ad638dda6342084642726e80965205d5b82d761805d0f0b2850313bc1fdf.js
 https://cdn.shopify.com/shopifycloud/shopify/assets/shopify_pay/storefront-c31d2fa4962d2ef90b673e945ee33f4f87302b97d0882cd8e83a629b84b30dab.js?v=20220906
 https://cdn.shopify.com/shopifycloud/shopify/assets/storefront/features-87e8399988880142f2c62771b9d8f2ff6c290b3ff745dd426eb0dfe0db9d1dae.js
 https://script.crazyegg.com/pages/scripts/0079/5344.js
 https://cdn.shopify.com/extensions/d37deec6-fe06-4f9d-9bf7-90bcc632ebf8/1.54.0/assets/storepickup.js
 https://cdn.shopify.com/extensions/d37deec6-fe06-4f9d-9bf7-90bcc632ebf8/1.54.0/assets/pickup.js
 https://cdn.shopify.com/extensions/d37deec6-fe06-4f9d-9bf7-90bcc632ebf8/1.54.0/assets/delivery.js
 https://cdn.shopify.com/s/files/1/1075/3026/files/meatsmith_logo_600x200.png?v=1649653467
 https://cdn.shopify.com/s/files/1/1075/3026/files/Tbone.jpg
 https://cdn.shopify.com/s/files/1/1075/3026/files/Chefs_press.jpg
 https://cdn.shopify.com/s/files/1/1075/3026/files/Massaman.jpg
 https://cdn.shopify.com/s/files/1/1075/3026/files/GOSSET.jpg
 https://cdn.shopify.com/s/files/1/1075/3026/files/GOSSET.jpg (from cache)
 https://cdn.shopify.com/s/files/1/1075/3026/articles/2023_Meatsmith_Parker_Blain_STROGANOFF_LR18_%7Bwidth%7Dx.jpg?v=1687835702 (404 Not Found)
 https://cdn.shopify.com/s/files/1/1075/3026/articles/2023_Meatsmith_Parker_Blain_STROGANOFF_LR18_1024x1024.jpg?v=1687835702
 https://code.jquery.com/jquery-3.2.1.slim.min.js
 https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.12.9/umd/popper.min.js
 https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0/js/bootstrap.min.js
 https://cdn.shopify.com/s/files/1/1075/3026/files/2023_Meatsmith_Parker_Blain_STROGANOFF_LR1.jpg?v=1687834054
 https://cdn.shopify.com/s/files/1/1075/3026/files/2023_Meatsmith_Parker_Blain_STROGANOFF_LR4.jpg?v=1687834054
 https://cdn.shopify.com/s/files/1/1075/3026/files/2023_Meatsmith_Parker_Blain_STROGANOFF_LR6.jpg?v=1687834054
 https://cdn.shopify.com/s/files/1/1075/3026/files/2023_Meatsmith_Parker_Blain_STROGANOFF_LR7.jpg?v=1687834054
 https://cdn.shopify.com/s/files/1/1075/3026/files/2023_Meatsmith_Parker_Blain_STROGANOFF_LR8.jpg?v=1687834054
 https://cdn.shopify.com/s/files/1/1075/3026/files/2023_Meatsmith_Parker_Blain_STROGANOFF_LR10.jpg?v=1687834054
 https://cdn.shopify.com/s/files/1/1075/3026/files/2023_Meatsmith_Parker_Blain_STROGANOFF_LR11.jpg?v=1687834054
 https://cdn.shopify.com/s/files/1/1075/3026/files/2023_Meatsmith_Parker_Blain_STROGANOFF_LR21.jpg?v=1687834054
 https://cdn.shopify.com/s/files/1/1075/3026/articles/2023_Meatsmith_Parker_Blain_REUBEN_LR30_%7Bwidth%7Dx.jpg?v=1686120355 (404 Not Found)
 https://cdn.shopify.com/s/files/1/1075/3026/articles/2023_Meatsmith_Parker_Blain_REUBEN_LR30_1024x1024.jpg?v=1686120355
 https://cdn.shopify.com/s/files/1/1075/3026/articles/Meatsmith_March_lr_203_%7Bwidth%7Dx.jpg?v=1684819869 (404 Not Found)
 https://cdn.shopify.com/s/files/1/1075/3026/articles/Meatsmith_March_lr_203_1024x1024.jpg?v=1684819869
 https://cdn.shopify.com/s/files/1/1075/3026/files/10861_MEATSMITH_ESTMARK_SECONDARYWORDMARK_WHITE_250x.png?v=1633992057
 https://cdn.shopify.com/s/files/1/1075/3026/files/marion-wine.png?v=1610960460
 https://cdn.shopify.com/s/files/1/1075/3026/files/cutler-logo.png?v=1610960460
 https://cdn.shopify.com/s/files/1/1075/3026/files/supernormal.png?v=1610960460
 https://cdn.shopify.com/s/files/1/1075/3026/files/Gimlet-Website_Logo.png?v=1610960460
 https://cdn.shopify.com/s/files/1/1075/3026/files/cumulus-inc.png?v=1610960459
 https://cdn.shopify.com/s/files/1/1075/3026/files/handmade.png?v=1614362810
 https://cdn.shopify.com/s/files/1/1075/3026/files/builders-arms-hotel.png?v=1610960459
 https://cdn.shopify.com/s/files/1/1075/3026/files/Morning-Market-Logo.png?v=1610960460

snshn commented 3 months ago

That first image seems to be lazy-loaded. This worked for me on Ubuntu using Chromium:

chromium --headless --window-size=1920,1080 --incognito --dump-dom https://meatsmith.com.au/blogs/recipes/a-comforting-beef-stroganoff | monolith - -I -b https://meatsmith.com.au/blogs/recipes/a-comforting-beef-stroganoff -o "Meatsmith - A COMFORTING BEEF STROGANOFF.html"
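
For anyone who wants to reuse that pipeline for other pages, here is a rough Python sketch that builds the same two commands and connects them with a pipe. It assumes `chromium` and `monolith` are both on PATH, uses only the flags from the command above, and the helper names `build_commands` and `save_rendered_page` are made up for this example.

```python
import subprocess


def build_commands(url: str, output: str):
    """Return the (chromium, monolith) argument lists for one page."""
    # Chromium renders the page headlessly and dumps the resulting DOM to stdout.
    chromium_cmd = [
        "chromium", "--headless", "--window-size=1920,1080",
        "--incognito", "--dump-dom", url,
    ]
    # monolith reads that DOM from stdin ("-") and resolves relative
    # links against the original URL passed with -b.
    monolith_cmd = ["monolith", "-", "-I", "-b", url, "-o", output]
    return chromium_cmd, monolith_cmd


def save_rendered_page(url: str, output: str) -> None:
    """Pipe the rendered DOM from Chromium into monolith."""
    chromium_cmd, monolith_cmd = build_commands(url, output)
    dump = subprocess.Popen(chromium_cmd, stdout=subprocess.PIPE)
    try:
        subprocess.run(monolith_cmd, stdin=dump.stdout, check=True)
    finally:
        # Close our copy of the pipe and reap the Chromium process.
        dump.stdout.close()
        dump.wait()
```

Calling `save_rendered_page("https://meatsmith.com.au/blogs/recipes/a-comforting-beef-stroganoff", "Meatsmith - A COMFORTING BEEF STROGANOFF.html")` should be equivalent to the one-liner above. Passing the arguments as lists avoids shell quoting problems with spaces in the output file name.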