-
When making a Google search, Google adds random gibberish to the end of the URL. So when you webrecord a Google search, you won't be able to replay the search - hitting the button will generate a new …
-
As we discussed in #3, we need some kind of archiving solution that can be trusted and that is good enough for archiving modern, JS-infested websites with potentially hidden content. We decided to use…
-
Via @ikreymer, Web Archive Collection Zipped (WACZ) Format, https://github.com/webrecorder/wacz-format (MIT, potentially reusable)
Example of MDN WACZ at https://twitter.com/webrecorder_io/status/1…
-
This change breaks our `archive_paths: "webhdfs://server/"` because `os.path.join` just discards the prefix when the suffix is an absolute path.
https://github.com/webrecorder/pywb/blob/92e459bda52a…
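
A minimal sketch of the `os.path.join` behaviour described above, assuming POSIX path semantics (hence `posixpath`, so it behaves the same on any platform). The WARC file name and the `prefix_join` helper are hypothetical and are not part of pywb; the helper is just one possible way to keep the prefix.

```python
import posixpath

ARCHIVE_PREFIX = "webhdfs://server/"   # prefix from the report above
WARC_PATH = "/warcs/example.warc.gz"   # hypothetical absolute suffix


def naive_join(prefix, suffix):
    # posixpath.join drops everything before an absolute component,
    # so the "webhdfs://server/" prefix is lost here.
    return posixpath.join(prefix, suffix)


def prefix_join(prefix, suffix):
    # Hypothetical workaround: plain string concatenation with exactly
    # one slash between prefix and suffix.
    return prefix.rstrip("/") + "/" + suffix.lstrip("/")


if __name__ == "__main__":
    print(naive_join(ARCHIVE_PREFIX, WARC_PATH))
    print(prefix_join(ARCHIVE_PREFIX, WARC_PATH))
```

With these inputs the first call returns only the absolute suffix, while the second keeps the `webhdfs://server/` prefix.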
-
I am aware that browsertrix uses pywb in the background. I tried the website `https://www.kugou.com/` and noticed that some notable elements are missing with browsertrix.
I started pywb using
`docker run -e I…
-
Images are missing both from the page itself and from the reconstructive logo. WARC created with a local webrecorder (built, run, and recorded using Docker and the webrecorder web interface): [temp-2018082…
-
I would prefer to have a settings menu where I can specify where the `_warc_cache` folder is stored, e.g. `~`, `/Volumes/Website\ Archives`, `G:\Data\Websites`.
Additionally, I would like…
-
Research and develop solutions for capturing, preserving, describing and providing access to archived websites. Investigate use of WebRecorder, ReplayWeb, integration of WARC view with AtoM.
-
I would like to know the position on this.
JWAT used the algorithm specified in the digest header directly.
So JWAT expects "SHA-256" since that seems to be the official name and the name supported …
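
To illustrate the naming question, here is a rough sketch of a tolerant reader that accepts both the hyphenated spelling (`SHA-256`) and the compact one (`sha256`) and maps them to a Python `hashlib` name. This is an assumption about how a reader could behave, not how JWAT or any other tool actually works; `normalize_digest_algorithm` is a hypothetical helper and only handles simple labels.

```python
import hashlib


def normalize_digest_algorithm(label):
    # Lower-case the label and drop hyphens: "SHA-256" -> "sha256".
    name = label.strip().lower().replace("-", "")
    # Only accept algorithms hashlib guarantees on every platform.
    if name not in hashlib.algorithms_guaranteed:
        raise ValueError("unsupported digest algorithm: " + repr(label))
    return name


def compute_digest(label, payload):
    # Hex digest of the payload using the normalized algorithm name.
    return hashlib.new(normalize_digest_algorithm(label), payload).hexdigest()


if __name__ == "__main__":
    # Both spellings select the same algorithm.
    print(compute_digest("SHA-256", b"example") == compute_digest("sha256", b"example"))
```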
-
For Hypothesis's proxy service we are looking to ensure that all requests to the proxy from a browser are either top-level page fetches or requests from a proxied page. In other words, we want to prev…
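
One possible way to express that check, as a sketch only: it assumes the browser sends Fetch Metadata headers (`Sec-Fetch-Mode`, `Sec-Fetch-Dest`) and a `Referer`, and that the proxy lives at the hypothetical origin `https://proxy.example`. `is_allowed_request` is an illustrative helper, not part of pywb or of the Hypothesis proxy, and non-browser clients can forge all of these headers.

```python
from urllib.parse import urlsplit

PROXY_ORIGIN = "https://proxy.example"  # hypothetical proxy origin


def is_allowed_request(headers):
    # Header names are case-insensitive, so normalize them first.
    h = {key.lower(): value for key, value in headers.items()}

    # Top-level page fetch: a navigation whose destination is a document.
    if h.get("sec-fetch-mode") == "navigate" and h.get("sec-fetch-dest") == "document":
        return True

    # Otherwise the request must come from a page served by the proxy.
    parts = urlsplit(h.get("referer", ""))
    return parts.scheme + "://" + parts.netloc == PROXY_ORIGIN


if __name__ == "__main__":
    print(is_allowed_request({"Referer": "https://proxy.example/https://example.com/"}))
    print(is_allowed_request({"Referer": "https://other.example/page"}))
```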