-
The current API we're using restricts each WACZ file to 100MB. We would need to use a different API to be able to upload files >100MB.
We should either: document this limit and perhaps add a check in t…
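A minimal sketch of what such a pre-upload check could look like (the constant and the `check_wacz_size` helper are hypothetical, not existing Browsertrix code):

```python
import os

# Hypothetical limit imposed by the single-request upload API described above.
MAX_SINGLE_UPLOAD_BYTES = 100 * 1024 * 1024  # 100 MB


def check_wacz_size(path: str) -> None:
    """Fail fast with a clear error instead of letting the upload API reject the file."""
    size = os.path.getsize(path)
    if size > MAX_SINGLE_UPLOAD_BYTES:
        raise ValueError(
            f"{path} is {size / (1024 * 1024):.1f} MB, which exceeds the 100 MB "
            "limit of the current upload API; files this large would need a "
            "chunked/multipart upload API instead."
        )
```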
-
### What change would you like to see?
Screenshotting was enabled by default for crawlers in https://github.com/webrecorder/browsertrix/pull/1518.
To reduce storage consumption, a configurable fl…
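One possible shape for such a flag, sketched with a stdlib dataclass; the field name `screenshots` and the CLI argument mapping are assumptions, not the actual Browsertrix schema:

```python
from dataclasses import dataclass, field


@dataclass
class CrawlConfig:
    """Hypothetical subset of a crawl workflow config (field names are assumptions)."""

    seeds: list[str] = field(default_factory=list)
    # Screenshots stay on by default, matching the behaviour linked above;
    # setting this to False would skip them to reduce storage use.
    screenshots: bool = True


def crawler_args(config: CrawlConfig) -> list[str]:
    """Map the flag onto crawler CLI arguments (argument shape is illustrative)."""
    args: list[str] = []
    for seed in config.seeds:
        args += ["--url", seed]
    if config.screenshots:
        # Only request screenshots when the workflow asks for them.
        args += ["--screenshot", "view,thumbnail"]
    return args
```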
-
### Context
Our service generates data, but right now we require users to click a _lot_ of stuff to download it all. Given the ease of mass data _creation_, we should facilitate a similar level of e…
-
A web archive (captured using ArchiveWeb.page) we are hosting in a repository contains 4 videos, which are all relatively large (around 1GB each). They play consistently after a relatively short wait …
-
One thing I'd like to work on for Browsertrix is some sort of metrics implementation. There have been past attempts at getting metrics working (specifically k8s metrics server for cpu / mem utilizatio…
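One possible direction, sketched with the Prometheus Python client rather than the k8s metrics server mentioned above; the metric names and labels are assumptions, not an agreed schema:

```python
from prometheus_client import Gauge, start_http_server

# Hypothetical crawl-level metrics the backend could expose for scraping.
RUNNING_CRAWLS = Gauge(
    "btrix_running_crawls", "Number of crawls currently running", ["org"]
)
CRAWLER_MEMORY_BYTES = Gauge(
    "btrix_crawler_memory_bytes", "Memory used by a crawler pod", ["crawl_id"]
)


def start_metrics_endpoint(port: int = 9090) -> None:
    """Serve /metrics so an in-cluster Prometheus can scrape it."""
    start_http_server(port)


# Example only: real values would come from the operator / k8s API.
start_metrics_endpoint()
RUNNING_CRAWLS.labels(org="default-org").set(2)
CRAWLER_MEMORY_BYTES.labels(crawl_id="example-crawl").set(1.5e9)
```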
-
We have been deploying Browsertrix Cloud under [K3S](https://k3s.io/). Making this work requires some changes to the base Browsertrix Cloud service so we can deploy locally.
- Must be possible to s…
-
We need deduplication to save storage in repeated crawls of the same job, based on a dynamically created index of the previous crawl.
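A rough sketch of the idea, assuming an in-memory mapping of URL to payload; in practice the index would be built from the previous crawl's stored records, and duplicates could be written as revisit records rather than skipped outright:

```python
import hashlib

# Function names here are illustrative, not existing Browsertrix code.


def build_index(previous_crawl: dict[str, bytes]) -> dict[str, str]:
    """Map URL -> SHA-256 digest of the payload captured in the previous crawl."""
    return {
        url: hashlib.sha256(body).hexdigest() for url, body in previous_crawl.items()
    }


def is_duplicate(index: dict[str, str], url: str, body: bytes) -> bool:
    """True if this URL's payload is byte-identical to the previous crawl's capture."""
    return index.get(url) == hashlib.sha256(body).hexdigest()
```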
-
I have a request regarding the documentation.
There are three topics that are underdocumented. It would be useful for people (like me) if docs were available for these:
1. recrawls, how to do them …
-
Hi, I've been looking to run some crawls of my organisation's SharePoint/intranet site, but I'm having some issues getting through Microsoft 2FA authentication.
Using --interactive successfully crea…