Open RangerMauve opened 2 months ago
I was going to ask this:
"However, if the target website requires JavaScript to display the content, you might need to use PuppeteerCrawler or PlaywrightCrawler instead, because it loads the pages using full-featured headless Chrome browser."
But also, what about using WARC so we can integrate with WebRecorder tools? We've been archiving sites and we don't have anywhere to upload them, so DP would be great :D
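For anyone not familiar with WARC: each capture is stored as a record made of a few `Name: value` header lines, a blank line, the raw HTTP message, and two trailing CRLFs. Here's a rough sketch of what writing one `response` record could look like in Node. The helper name `buildWarcResponseRecord` is made up for illustration and isn't from any WebRecorder tool, and a real archive would also want `warcinfo` and `request` records, payload digests, and per-record gzip.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical helper (not part of any existing library): wraps one raw
// HTTP response (status line + headers + body) in a WARC/1.1 "response" record.
function buildWarcResponseRecord(
  targetUri: string,
  httpMessage: Buffer,
  date: Date = new Date(),
): Buffer {
  // Trim milliseconds so the timestamp looks like 2024-01-02T03:04:05Z.
  const warcDate = date.toISOString().replace(/\.\d{3}Z$/, "Z");

  const headers = [
    "WARC/1.1",
    "WARC-Type: response",
    `WARC-Record-ID: <urn:uuid:${randomUUID()}>`,
    `WARC-Date: ${warcDate}`,
    `WARC-Target-URI: ${targetUri}`,
    "Content-Type: application/http; msgtype=response",
    // Content-Length counts the bytes of the record block only.
    `Content-Length: ${httpMessage.length}`,
  ].join("\r\n");

  return Buffer.concat([
    Buffer.from(headers + "\r\n\r\n", "utf8"), // header section + blank line
    httpMessage,                               // record block
    Buffer.from("\r\n\r\n", "utf8"),           // record terminator
  ]);
}

// Usage: wrap a captured response; records can be appended to a .warc file.
const captured = Buffer.from(
  "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<h1>hi</h1>",
  "utf8",
);
const record = buildWarcResponseRecord("https://example.com/", captured);
console.log(record.toString("utf8"));
```

Since records are just concatenated, a crawler could append one per fetched page and the resulting file should be readable by WARC-aware tooling, though I haven't verified this exact output against any specific replay tool.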
Maybe use something like this JSDOM-based crawler to download all the files? https://crawlee.dev/api/jsdom-crawler