warc Search Results - Githubissues

1000+ results
for warc

ArchiveTeam/grab-site #228

is it possible to output regular files instead of warc?

i only want files, not warc. can grab-site output regular files (like html and images) for me like wget can? (links must be converted to relative links) side question: has anyone here actually …

ftc2 updated 2 months ago
6
birros/web-archives #27

WARC file support?

Hi! When I found out about this project, its name made me think it was a tool to read [WARC files](https://en.wikipedia.org/wiki/Web_ARChive), which stands for... Web ARChives! Is there support …

anarcat updated 2 years ago
2
internetarchive/liveweb #43

warc support

First of all, thank you for this great project! Is there any reason why ARC support was implemented first? Is WARC support planned for the near future?

antoinerg updated 11 years ago
1
webrecorder/warcit #11

warc format

I downloaded a website from Internet Archive using [wayback-machine-downloader](https://github.com/hartator/wayback-machine-downloader) then created a WARC using warcit with the following command: `wa…

Natkeeran updated 6 years ago
2
netarchivesuite/netarchivesuite #34

Compressed warc files

Hello, I would like to know if it is possible to get both WARC files compressed (not only the metadata one). Thanks

nasry updated 6 years ago
7
huggingface/datatrove #214

Assign more cpu to single task to speed it up for local exec…

I am using the local executor. My machine has 48 Cpus with 348 Ram. Any idea how to speed this up? Currently one single task (task=1, running for 1 warc.gz file, with size ~1g) takes half an hour. Thi…

barbara-su updated 3 months ago
5
iipc/warc-specifications #52

WARC-Conversion-Software and WARC-Conversion-Command fields

When converting content in an archive it is useful for diagnostic purposes to record the versions of major software components used and important conversion options. Another common use case is to iden…

ato updated 5 years ago
5
openzim/warc2zim #398

Dynamic scripts containing const and let variables are not e…

This is the same issue as https://github.com/webrecorder/wombat/issues/82, which has been partially solved in https://github.com/webrecorder/wabac.js/pull/128 (and few other commits). Note that htt…

benoit74 updated 6 days ago
1
openzim/warc2zim #340

Revisit `WARC-Resource-Type` content or add a new header

See https://github.com/webrecorder/browsertrix-crawler/issues/630

benoit74 updated 1 month ago
1
Lisias/pywb #2

The `SkipHttp` thingies are not working

They are being correctly parsed from the config file, and they are being correctly instantiated. But they are not working: these pesky HTTP 4xx & 5xx errors are still being written to the `WARC` file…

Lisias updated 1 month ago
1
