-
Apply https://github.com/psf/black to the `py-wacz` project (`black .`).
Also add a Travis check, `black . --check`, to verify the formatting.
Let's do it as a separate PR after the validatio…
-
Write a blog post soliciting use cases for WACZ. Use cases can be submitted via email or directly here on GitHub. Provide a template for what the use cases should look like.
-
Hi.
I've previously been using Webrecorder and Webrecorder Player to record and replay web collections on my Mac (Big Sur), but Webrecorder stopped working there, and I discovered that you had…
-
## Is your feature request related to a problem? Please describe.
The main problem with WARCs is that every time you want to run them from a cold boot, you have to extract the file, which takes time.
…
-
Combine the text index and the page index. This will give a simpler structure.
-
Hello! The Replayweb.page app (v1.4.0 to 1.5.2 on Windows 10 20H2) stops at 30-40% when loading any .warc file larger than 1 GB from this save of gamerankings.com: https://archive.fart.website/archivebot/viewer/job/9ux…
-
Convert existing draft spec text to ReSpec!
-
Screenshot:
![image](https://user-images.githubusercontent.com/2303841/136691878-e1a1e781-d2b4-4cee-ac6c-bf997f84589d.png)
It seems to think it needs to crawl another page, but there aren't a…
-
Support increasing/decreasing the number of pods running on a crawl.
Requires:
- [ ] Generate Crawl ID separately, not based on job/docker container id
- [x] Use Shared Redis for Crawl, instead of lo…
-
warc2zim now has a set of fuzzy matching rules (https://github.com/openzim/warc2zim/blob/master/src/warc2zim/main.py#L75)
which are a subset of the larger ruleset in wabac.js (https://github.com/web…