Open cmillet2127 opened 6 months ago
We are using Brave, and the accept cookies modals are removed by the browser before our custom behaviors are run, so the screenshot should already reflect that, I believe. But your point stands that it could be interesting to take a screenshot after autoscroll, etc.
In my situation, I'm encountering an unusual behavior while running Docker on Windows with the most recent image release. When utilizing the browsertrix-crawler, the 'accept cookies' modal persists. However, when navigating manually with Brave browser, the modal does not appear. Initially, I suspected that my image was still employing Chrome, but your confirmation of its use of Brave has led me to reconsider.
Example: docker run -v c:\tmp\crawls\:/crawls/ -it webrecorder/browsertrix-crawler crawl --url https://www.abarth.fr --generateWACZ final-to-warc --text --wait-until domcontentloaded --screenshot thumbnail,view,fullPage --scopeType page --blockAds
Indeed, an additional suggestion might involve capturing a screenshot through a custom behavior using a 'utils' method. This approach would allow us to incorporate it into the WARC file, aligning with the methodology used for other screenshots.
If you run webrecorder/browsertrix-crawler it will use webrecorder/browsertrix-crawler:latest, which currently still points to the non-Brave version, unless you check out the repo and build it locally. You can try the latest beta release with webrecorder/browsertrix-crawler:1.0.0-beta.7.
We hope to release the 1.0.0 version soon and then it will be latest.
@ikreymer is this still of interest? It would be extremely useful for us, as some websites load images dynamically during scrolling, so those images are missing when the full-page screenshot is taken before custom behaviors run. I am quite lost in the code as I am unfamiliar with JS, but if pointed to the right place in the screenshot logic I can try something out and provide a PR.
Currently it seems screenshots are taken before custom behaviors run.
It would be very useful to be able to take a screenshot after custom behaviors have run, for example to capture the page after the "accept cookies" modals have been removed.
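To make the request concrete, here is a rough sketch of the ordering I have in mind. It is not browsertrix-crawler's actual code: `PageLike`, `Behavior`, and `captureAroundBehaviors` are hypothetical names made up for illustration, and the `screenshot` method is only loosely modeled on Puppeteer's `page.screenshot({ fullPage: true })`.

```typescript
// Hypothetical minimal page abstraction (an assumption for this sketch);
// in a real crawler this would be the browser page object.
interface PageLike {
  screenshot(opts: { fullPage: boolean }): Promise<Uint8Array>;
}

// A custom behavior is modeled as any async step that changes the page,
// e.g. autoscroll or dismissing a cookie modal.
type Behavior = (page: PageLike) => Promise<void>;

interface Captures {
  before: Uint8Array; // screenshot taken before any behavior runs
  after: Uint8Array;  // screenshot taken once all behaviors have finished
}

// Takes one screenshot up front, runs the behaviors in order,
// then takes a second screenshot of the resulting page state.
async function captureAroundBehaviors(
  page: PageLike,
  behaviors: Behavior[],
): Promise<Captures> {
  const before = await page.screenshot({ fullPage: true });
  for (const behavior of behaviors) {
    await behavior(page);
  }
  const after = await page.screenshot({ fullPage: true });
  return { before, after };
}
```

Keeping the "before" capture means existing output would not change; the "after" capture would be an extra record, which could then be written to the WARC alongside the other screenshots.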