-
Currently we create the clients for fetching files from cloud providers ourselves (in `utils.py`/`wacz.py`). Ideally, we want to re-use the functionality that Scrapy has for this to reduce the complex…
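For illustration, one way to lean on Scrapy here (a sketch, not a settled design): Scrapy's default `DOWNLOAD_HANDLERS` already map the `s3` scheme to its built-in S3 download handler, so a request for an `s3://` URL can go through the normal downloader instead of a hand-rolled client. The bucket, key, and spider name below are hypothetical.

```python
import scrapy


class WaczFetchSpider(scrapy.Spider):
    name = "wacz_fetch"
    # Read by Scrapy's built-in S3 handler (requires botocore).
    custom_settings = {
        "AWS_ACCESS_KEY_ID": "…",
        "AWS_SECRET_ACCESS_KEY": "…",
    }

    def start_requests(self):
        # The downloader picks the handler from the URL scheme, so no
        # boto3/botocore client needs to be constructed in our own code.
        yield scrapy.Request(
            "s3://my-bucket/archives/example.wacz", callback=self.save_wacz
        )

    def save_wacz(self, response):
        # response.body holds the raw WACZ bytes fetched via Scrapy's handler.
        with open("example.wacz", "wb") as f:
            f.write(response.body)
```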
-
I want to create a `.wacz` from somewhat irregular collections of HTML/CSS/PDF files. To do so, I've decided to first shove these documents into a `.warc` using `warcit`, and then run `wacz create` on…
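For context, the two steps look roughly like this (the URL prefix and paths are placeholders; check `warcit --help` and `wacz create --help` for the exact options):

```sh
# Wrap the loose HTML/CSS/PDF files into a WARC; warcit needs a URL
# prefix under which the files will be addressable. The output name is
# derived from the input directory.
warcit https://example.com/ ./my-documents/

# Package the resulting WARC into a WACZ.
wacz create my-documents.warc.gz -o my-documents.wacz
```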
-
### Browsertrix Version
v1.11.7-7a61568
### What did you expect to happen? What happened instead?
I am having some DNS issues, probably from resource exhaustion. (Also filed #2094 to allow cpu_limi…
-
In this function we currently re-open the WACZ each time we request a WARC record. When using a cloud provider, this means the file is fetched again on every request. Even when not using a cloud provider, we should not need to re-…
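A minimal sketch of the intended fix, assuming the WACZ is a local or already-downloaded file and that records are located by member name and offset (all names here are hypothetical, not the package's actual API):

```python
import zipfile


class WaczFile:
    def __init__(self, path):
        self.path = path
        self._zip = None

    @property
    def zip(self):
        # Lazily open the WACZ (a ZIP) once; every later record lookup
        # reuses the same handle instead of re-opening -- and, for cloud
        # storage, re-fetching -- the whole file.
        if self._zip is None:
            self._zip = zipfile.ZipFile(self.path)
        return self._zip

    def read_record(self, warc_name, offset, length):
        # Seek within the named WARC member to the requested record.
        with self.zip.open(warc_name) as warc:
            warc.seek(offset)
            return warc.read(length)
```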
-
Multiple WACZs are created for a crawl: one per 10 GB, and additional ones when there are multiple crawler instances. This scenario needs to be tested to see what the webhook request looks like and how to handle it. C…
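For illustration only, assuming the webhook payload exposes the parts as a list (the field names `resources` and `path` are assumptions to verify against a captured request), a handler might gather all parts so a multi-part crawl is processed as one unit:

```python
def handle_crawl_finished(payload: dict) -> list[str]:
    # "resources"/"path" are assumed field names, to be checked against a
    # real webhook request before relying on them.
    return [resource["path"] for resource in payload.get("resources", [])]
```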
-
When using the downloader middleware and a request is not found in the archive, request the live resource instead. Add a setting (or similar) that we can use to control this behaviour.
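A hedged sketch of what that could look like as a Scrapy downloader middleware; the setting name `SW_WACZ_CRAWL_FALLBACK_LIVE` and the `_find_record` helper are made up for illustration:

```python
from scrapy.exceptions import IgnoreRequest


class WaczFallbackMiddleware:
    @classmethod
    def from_crawler(cls, crawler):
        mw = cls()
        # Hypothetical setting controlling the fallback behaviour.
        mw.fallback_live = crawler.settings.getbool(
            "SW_WACZ_CRAWL_FALLBACK_LIVE", False
        )
        return mw

    def _find_record(self, request):
        # Placeholder for the archive lookup; returns None when the URL
        # has no record in the WACZ index.
        return None

    def process_request(self, request, spider):
        record = self._find_record(request)
        if record is not None:
            return record  # a Response built from the archived record
        if self.fallback_live:
            # Returning None hands the request back to Scrapy's downloader,
            # which fetches the live resource.
            return None
        raise IgnoreRequest(f"Not found in archive: {request.url}")
```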
-
@tnafrancesca Please can you add a bit of info and I'll put this on the 6.7.0 board.
-
### Browsertrix Version
v1.11.3-12f994b
### What did you expect to happen? What happened instead?
When you download WACZ files using the API, you get filenames like "20230225142507561-manual-20…
-
Details about how to aggregate multiple WACZ files into a single WACZ need to be added to the specification. This hinges on resources in the `datapackage.json` using a `url` for a WACZ rather than a `…
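As a rough illustration of the direction (the `url` field is the proposed change, not current spec; all values are placeholders):

```json
{
  "profile": "data-package",
  "resources": [
    {
      "name": "crawl-part-1.wacz",
      "url": "https://storage.example.com/crawls/crawl-part-1.wacz",
      "hash": "sha256:…",
      "bytes": 10737418240
    },
    {
      "name": "crawl-part-2.wacz",
      "url": "https://storage.example.com/crawls/crawl-part-2.wacz",
      "hash": "sha256:…",
      "bytes": 4821094400
    }
  ]
}
```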
-
(Suggested by @ikreymer)
Add a command and associated API for reading and streaming the contents of WACZ files, either locally or remotely.
See: https://www.npmjs.com/package/unzipit
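A hedged sketch of what such an API could build on, using unzipit's documented `HTTPRangeReader` to read a remote WACZ without downloading the whole file; the WACZ URL and member name are hypothetical:

```typescript
import { unzip, HTTPRangeReader } from "unzipit";

async function listWacz(url: string): Promise<void> {
  // Range requests fetch only the ZIP directory plus the entries we read.
  const reader = new HTTPRangeReader(url);
  const { entries } = await unzip(reader);
  for (const [name, entry] of Object.entries(entries)) {
    console.log(name, entry.size);
  }
  // Stream a single member, e.g. the page index, without extracting the rest.
  const pages = await entries["pages/pages.jsonl"].text();
  console.log(pages.split("\n").length, "pages");
}
```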