Open ctschach opened 4 years ago
Somewhat related to this: some sites put a static base64-encoded image in the srcset attribute as a loading placeholder, which keeps the browser from loading the actual image inside miniflux.
Can be seen "in the wild" here: https://bahnblogstelle.net/2021/04/04/hamburger-entwickelt-suchmaschine-fuer-nachtzugreisen/ (the big image below the heading)
I have managed to rewrite the srcset usage with the following rule:
replace("srcset"|"")
After adding this rewrite rule all the images are finally shown.
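To make the effect of this rule concrete, here is a minimal Python sketch. It assumes the rule behaves like a plain substring substitution on the fetched HTML, which may not match how miniflux implements it. The function name `apply_replace` and the sample markup are invented for illustration.

```python
def apply_replace(html: str, search: str, replacement: str) -> str:
    """Hypothetical stand-in for a replace("search"|"replacement") rule."""
    return html.replace(search, replacement)


# Invented sample: the real image is in src, a base64 placeholder is in srcset.
sample = '<img src="photo.jpg" srcset="data:image/gif;base64,R0lGODlhAQABAAAAACw=">'

rewritten = apply_replace(sample, "srcset", "")
print(rewritten)
```

Under that assumption the attribute name disappears while its value is left behind as a stray fragment. The markup no longer contains a srcset, so the browser presumably falls back to the image in src.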
Thanks for the hint! That rewrite rule did the trick for me:
replace("<img "|"<ignore "),replace("a-img"|"img")
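The order of the two rules matters: the existing img tags are renamed first, and only then is a-img turned into img. A rough Python sketch of that chaining follows, again assuming plain substring substitution applied left to right. The helper `apply_rules` and the sample markup are hypothetical and not taken from miniflux or from the site in question.

```python
def apply_rules(html: str, rules: list[tuple[str, str]]) -> str:
    """Apply (search, replacement) pairs one after another, in order."""
    for search, replacement in rules:
        html = html.replace(search, replacement)
    return html


# The two rules from the comment above, as (search, replacement) pairs.
rules = [("<img ", "<ignore "), ("a-img", "img")]

# Invented sample: a custom a-img element wrapping a placeholder img.
markup = '<a-img src="full.jpg"><img src="placeholder.gif"></a-img>'

result = apply_rules(markup, rules)
print(result)
```

In this sketch the placeholder ends up as an unknown `ignore` element, and the former a-img element becomes the only img tag left.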
For me, this loads the full-sized shutterstock images, which are sometimes as large as 10 MB.
I instead used use_noscript_figure_images to pick up the smaller images from the noscript part. Moreover, removing .branding and footer led to a very clean article.
My complete rewrite rules for heise:
use_noscript_figure_images,remove(".branding,footer")
Again, more a feature request:
The "Fetch original content" option together with the scraper is a life-saving tool. However, some pages use lazy-load functions to include images, or put the images inside noscript tags. This means the images are not shown in the reader.
Having a search-and-replace function (probably based on regex) would allow us to adjust the tags so that images are included properly in the article.
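As an illustration of the kind of regex-based rewrite meant here, the following Python sketch unwraps noscript blocks so that the images inside them become ordinary content. It is only a sketch of the idea, not how miniflux works. The name `unwrap_noscript` and the sample page are made up, and regexes are a fragile way to edit HTML in general.

```python
import re

# Non-greedy match so each noscript block is unwrapped on its own.
NOSCRIPT = re.compile(r"<noscript>(.*?)</noscript>", re.DOTALL | re.IGNORECASE)


def unwrap_noscript(html: str) -> str:
    """Replace every <noscript>...</noscript> block with its inner markup."""
    return NOSCRIPT.sub(r"\1", html)


# Invented sample: the real image only exists inside the noscript fallback.
page = '<p>Intro</p><noscript><img src="real.jpg"></noscript>'

unwrapped = unwrap_noscript(page)
print(unwrapped)
```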
Due to the "a-img" tag instead of a plain "img" tag, the image is not shown.