-
add methods to HttpExchange to convert from and to WARC files
-
Our Rosetta Working Group has identified a couple of file formats that use GZIP instead of regular ZIP as a container for an existing format.
* Some institutions use GZIP to compress WARC files. They …
-
Hi!
When I found out about this project, its name made me think it was a tool to read [WARC files](https://en.wikipedia.org/wiki/Web_ARChive), which stands for... Web ARChives!
Is there support …
-
Honestly, I feel like I should implement a command-line switch to generate WARC files while downloading threads, so I can upload them to the [Wayback Machine](http://web.archive.org/) or do whatever e…
-
340 WARC files of the news crawl data set, from 2020-09-12 until 2020-10-04, have been captured using [HTTP/2](https://en.wikipedia.org/wiki/HTTP/2) after a [Java security upgrade](https://mai…
-
First of all, thank you for this great project!
Is there any reason why ARC support was implemented first? Is WARC support planned for the near future?
-
I downloaded a website from Internet Archive using [wayback-machine-downloader](https://github.com/hartator/wayback-machine-downloader) then created a WARC using warcit with the following command: `wa…
-
When converting content in an archive it is useful for diagnostic purposes to record the versions of major software components used and important conversion options. Another common use case is to iden…
-
There are various tools that enable WARCs to be analyzed, indexed and searched (ex: https://archivesunleashed.org/aut/, https://archivesunleashed.org/warclight/). I am wondering if it is possible to …
-
Hello,
I would like to know if it is possible to get both WARC files compressed (not only the metadata one).
Thanks
nasry updated
6 years ago