-
When logged in to Facebook, I get a popup when choosing to see comments (not on all Facebook pages, but on some; see below). This new behaviour is not supported in archiveweb.page and the AutoPilot-funct…
-
Chrome [recently added in v101](https://developer.chrome.com/blog/new-in-devtools-101/#recorder) a new framework-agnostic [JSON user script export](https://developer.chrome.com/docs/devtools/recorder/…
-
It would be useful to be able to see which elements are considered links or clickable elements, to make it easier to debug when a resource is not indexed.
Perhaps having some kind of overlay …
-
Thanks for this elegant example of how to do RAG with WARC data! I also very much appreciated how the [blog post](https://lil.law.harvard.edu/blog/2024/02/12/warc-gpt-an-open-source-tool-for-exploring…
-
This is a special request for the Zimit 2.0 project. Devs will handle this first to test the new scraper, and only once it's working will it be transferred to the content team.
- Website URL: https://www.bb…
-
I have a request regarding the documentation.
There are three topics that are underdocumented. It would be useful for people (like me) if docs were available for these:
1. recrawls, how to do them …
-
### ZIM(s) location
https://library.kiwix.org/viewer#theworldfactbook_en_all_2023-12/A/www.cia.gov/the-world-factbook/
### Recipe(s) URL
https://farm.openzim.org/recipes/CIAworldfactbook_en_all/edi…
-
I've tested it with Instagram and Facebook, but whenever I start autopilot, it just doesn't record anything. It says it's done after capturing 0 posts. In Instagram autopilot is even done before…
-
By default (when the `autofetch` behavior is activated, if I'm not mistaken), the crawler automatically fetches images from the `srcset` of `<img>` tags so that all resolutions are available in the WARC.
How…
-
The recipe for Radiopedia.org has succeeded in hidden/dev; the file size is 2.3 GB.
However, the internal links of the website inside the file are not working.
https://farm.openzim.org/pipeline/a1f80241-f…