hsiehsh168168 / warc-tools

Automatically exported from code.google.com/p/warc-tools

SRS 52 — Extensions to "HTTrack", "wget" and "curl" incorporating libwarc shall be provided as patches to recent and specific versions of each tool... #58

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
SRS 52 — Extensions to "HTTrack", "wget" and "curl" incorporating libwarc
shall be provided as patches to recent and specific versions of each tool,
to enable users of the tool to access functionality of libwarc

Original issue reported on code.google.com by gordon.p...@gmail.com on 27 Jul 2008 at 10:15

GoogleCodeExporter commented 8 years ago
Curl -- curl only downloads one file at a time, so there is little need for
integration with libwarc. Instead, Hanzo created a Python command-line tool
(url2warc.py) that downloads multiple files using curl (over any of the many
protocols available to libcurl) and stores them in a WARC file.
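
For readers who want a feel for the approach, here is a rough sketch of the general idea, not the actual url2warc.py source. It assumes the `curl` binary is on the PATH and writes each download as a WARC/1.0 "resource" record by hand rather than through libwarc. The function names (`warc_resource_record`, `fetch_with_curl`, `urls_to_warc`) and the default `application/octet-stream` content type are made up for illustration.

```python
import subprocess
import uuid
from datetime import datetime, timezone


def warc_resource_record(url, payload, content_type="application/octet-stream"):
    """Serialize one WARC/1.0 'resource' record for bytes fetched from url.

    Hypothetical helper: the real tool would presumably go through libwarc.
    """
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    headers = [
        "WARC/1.0",
        "WARC-Type: resource",
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",
        f"WARC-Date: {now}",
        f"WARC-Target-URI: {url}",
        f"Content-Type: {content_type}",
        f"Content-Length: {len(payload)}",
    ]
    # Header lines end with CRLF, a blank line separates headers from the
    # payload, and every record is terminated by two CRLFs.
    head = "\r\n".join(headers).encode("utf-8")
    return head + b"\r\n\r\n" + payload + b"\r\n\r\n"


def fetch_with_curl(url):
    """Download one URL by shelling out to curl and return the raw bytes."""
    result = subprocess.run(
        ["curl", "--silent", "--fail", "--location", url],
        stdout=subprocess.PIPE,
        check=True,  # raise CalledProcessError if curl exits non-zero
    )
    return result.stdout


def urls_to_warc(urls, warc_path, fetch=fetch_with_curl):
    """Fetch each URL in turn and append one record per URL to warc_path."""
    with open(warc_path, "ab") as out:
        for url in urls:
            out.write(warc_resource_record(url, fetch(url)))
```

Usage would be along the lines of `urls_to_warc(["http://example.com/"], "out.warc")`. The `fetch` parameter is only there so the download step can be swapped out, for instance when testing without network access.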

Wget -- integration is on hold, as the author is refactoring the code and is
not accepting contributions.

HTTrack -- not sure of the status; possibly similar to wget?

Original comment by gordon.p...@gmail.com on 24 Oct 2008 at 12:20