-
This emerged from our 2018-Feb-19 Roundtable ["Licensing of Data Together libraries (CDXJ & WARC packages)"](https://github.com/datatogether/datatogether/issues/48)
-
In the 1.1 spec, section 5.19, 'WARC-Identified-Payload-Type' is allowed for anything with a well-defined payload.
That makes sense for response, resource, and conversion.
That doesn't make sens…
-
_(Based on https://github.com/oduwsdl/ipwb/issues/474)_
I am using [ipwb](https://github.com/oduwsdl/ipwb) to replay local WARCs on my own machine. The replay interface is accessible via the browse…
-
Sometimes we come across URLs that can't be fetched on the first try; there may be several reasons for that. Some of these succeed on a subsequent try, while others can never be fetched. A logical approach …
-
I've tried for many hours now to configure pywb to first try my collection of WARCs, then fall back to archive.org, then fall back to the live web. The documentation makes it seem like it should be possi…
-
Using the advice in Issue #453, I successfully excluded unwanted PDF documents from being fetched and written to the WARC. But this method seems to generate misleading reports and stats.
## mimetype-…
-
For some reason, if I use `find_package` to find the zlib dependency within `conan` packages, it fails despite linking against the library:
```
/usr/bin/g++-7 -Wall -pedantic -fno-strict-aliasing -mar…
-
In PR#14, @ibnesayeed suggested that I limit the dates generated to the range from the epoch to the current date/time, based on the default behavior of the Faker module.
This functionality would be useful, but it is…
-
_(Please forgive me if this is the wrong venue for questions such as this. I'm not aware of any other avenue.)_
I'm super impressed with pywb and have enjoyed wrapping my head around both the code …
-
Tested in both the basic and advanced interfaces: I tried crawling https://matkelly.com and the default https://matkelly.com/wail, and both resulting WARCs contain only the DNS record.
Other URIs seem to …