-
With browsertrix-crawler, a user can use `combineWARC` to write contextual information defined in the `warcinfo` property into the destination WARC. When the WARC is read, the fields defined in the pr…
-
Please help, dear team:
my-web-archive.wacz (size: 8 GB) loads extremely slowly and takes forever to index the page.
Is there a way to make it faster so that pages inside the recorded archive " m…
-
Besides the general metadata that could be contained in **webarchive.yaml**, I would suggest considering the use of [Namaste](https://confluence.ucop.edu/display/Curation/Namaste) tags to have some met…
-
### Browsertrix Version
v1.10.2-dc9069d
### What did you expect to happen? What happened instead?
The tv2.dk front page was crawled with the ArchiveWeb.page Chromium extension (in Brave) on Thursday at 11:16 where …
-
I'm not sure if this is a feature request or just a request for clarification, but I'm looking for a canonical way to generate a WACZ file from multiple WARC files.
I am dealing with some web collection…
-
I wanted to suggest the idea of providing an embedded viewer for web archives that are stored in SDR as discrete objects.
Using this viewer, it should be possible to support embedded replay of web …
-
### ReplayWeb.page Version
v2.0.2
### What did you expect to happen? What happened instead?
I recorded a specific website but it couldn't replay properly.
Normally, after the web progress ba…
-
### Browsertrix Version
v1.11.0-4aca107
### What did you expect to happen? What happened instead?
Ran out of space. Expected jobs to pause gracefully and resume on free space. Jobs halted and was u…
-
For a remote WARC or WACZ, we need to detect if it has changed and do a cache purge and full reload before trying to load anything, as all previously cached data should be considered invalid.
This is a…
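
One way the change check described above could work is to compare HTTP validator headers (`ETag`, `Last-Modified`, `Content-Length`) from a fresh `HEAD` response against the values stored when the archive was first cached. The sketch below is only an illustration of that idea, not how ReplayWeb.page does it: the helper names `fingerprint` and `has_changed` are hypothetical, and it assumes the remote server actually sends at least one of these headers.

```python
from typing import Mapping, Optional

# Response headers that can serve as change validators (assumed to be
# available from the remote server; not every host sends all of them).
VALIDATOR_HEADERS = ("etag", "last-modified", "content-length")


def fingerprint(headers: Mapping[str, str]) -> dict:
    """Reduce response headers to the validator fields, case-insensitively."""
    lowered = {key.lower(): value.strip() for key, value in headers.items()}
    return {name: lowered[name] for name in VALIDATOR_HEADERS if name in lowered}


def has_changed(cached: Optional[dict], current_headers: Mapping[str, str]) -> bool:
    """Return True when cached data for the archive should be purged.

    Hypothetical helper: `cached` is the fingerprint stored at first load.
    If there is nothing to compare, the archive is treated as changed,
    since stale cached data is worse than an extra reload.
    """
    current = fingerprint(current_headers)
    if cached is None or not current:
        return True
    return cached != current
```

A caller would store `fingerprint(response_headers)` alongside the cached index and run `has_changed` before serving anything from the cache; erring toward `True` matches the requirement that all previously cached data be considered invalid once the file changes.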
-
I was looking at WACZ files generated from ReplayWeb.page and noticed that the WARC file under the `archive/` folder is always named `data.warc.gz` (or `data.warc`). It looks like the file name is har…