Closed — ljq29 closed this issue 1 month ago
So this is definitely not a general problem, as it works just fine and you are the first one reporting this. Considering that you had issues setting up your environment before (https://github.com/hoarder-app/hoarder/issues/487), did you check everything is correct?
I found an example at this link, where the original image is:
And the screenshot in Hoarder is:
Additionally, I just noticed that Hoarder uses the original image link, rather than caching the image to its own server like Cubox does. Would it be possible to improve this aspect?
Please check out the config flags in the documentation. You can already configure archiving by setting CRAWLER_FULL_PAGE_ARCHIVE to true.
From what I can tell, those requests are first blocked by the browser due to Opaque Request Blocking. Even if we were to prevent that with some changes, Baidu simply does not want you to embed their images in other webpages, so this does not work. For the preview you'll have to live with that. If you configure the archiving, everything is downloaded correctly, because it is no longer constrained by the browser rules.
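For reference, this is roughly what enabling the flag might look like in an env file. This is a minimal sketch, assuming you pass an env file to your containers; adapt it to however your deployment injects environment variables:

```shell
# .env (illustrative — exact placement depends on your deployment)
# Tell the crawler to store a full-page archive of each bookmark,
# so images are downloaded server-side instead of hotlinked.
CRAWLER_FULL_PAGE_ARCHIVE=true
```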
As for the env configuration, where exactly should it be placed within the containers? I tried deploying it in the worker container, but it did not work. When I placed the env configuration in the web container, several web pages repeatedly failed to fetch:
Yes, in the worker's environment variables. (Btw: you are using the old setup, where web and worker are separate Docker containers.) You probably did not look at the "Archive" tab in the preview, but at the same screen as above.
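For the old two-container setup, a sketch of where the variable would go in a Compose file. Service and image names here are assumptions for illustration; keep whatever names your existing compose file uses:

```yaml
services:
  web:
    image: ghcr.io/hoarder-app/hoarder-web:release
    # ... existing web settings unchanged ...
  workers:
    image: ghcr.io/hoarder-app/hoarder-workers:release
    environment:
      # Crawler settings are read by the worker container,
      # not the web container.
      CRAWLER_FULL_PAGE_ARCHIVE: "true"
```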
Bug: Images Are Not Loaded Correctly - All Images Appear as "Broken" in the Webpage
Description
When using Hoarder to crawl a webpage, none of the images load correctly: they all appear as "broken" or missing. It seems that Hoarder fails to fetch the images from the page.
Steps to Reproduce
Expected Behavior
The images should be properly fetched and displayed in the output without being broken.
Actual Behavior
All images on the webpage show up as broken, indicating that the image URLs or fetching process might not be working as expected.
Possible Causes
Environment
Additional Context
Any webpage with images will have this issue, making it impossible to scrape or view images correctly.