internetarchive/warctools
Command line tools and libraries for handling and manipulating WARC files (and HTTP contents)
MIT License
149 stars
27 forks
issues
#32 Python 3.10 / Ubuntu 22.04 (WhenSkiesAbove, opened 1 year ago, 0 comments)
#31 Extract entire WARC file? (catharsis71, opened 2 years ago, 0 comments)
#30 Travis CI: Upgrade Python versions (cclauss, closed 3 years ago, 0 comments)
#29 Error when running warcfilter (nhart, opened 4 years ago, 0 comments)
#28 Python 3.5 support for raw GzipFile offset (siznax, opened 4 years ago, 0 comments)
#27 Use tox instead of nose in setup.py (siznax, opened 4 years ago, 0 comments)
#26 Made author Internet Archive, added classifiers, etc. (siznax, closed 4 years ago, 0 comments)
#25 Allow build failures for python 3.2, 3.3, 3.4, pypy3 (siznax, opened 4 years ago, 0 comments)
#24 Markdown formatted README, added python3 WARC writing example and usage. (siznax, closed 4 years ago, 0 comments)
#23 TypeError (str/bytes) in warc.py error path (bnewbold, opened 6 years ago, 0 comments)
#22 Enable -o parameter in warc2warc tool (checktor, closed 5 months ago, 0 comments)
#21 do not stop iterating in case of an empty gzip record, yield record=N… (nlevitt, closed 6 years ago, 0 comments)
#20 skip over gzip members that decompress to the empty file (nlevitt, closed 6 years ago, 2 comments)
#19 Normalizing '4.10.1dev2' (DonRichards, closed 8 years ago, 0 comments)
#18 some refactoring, fixes, new tests, cleanup of cruft (nlevitt, closed 8 years ago, 0 comments)
#17 record dumper assumes content type and content length (jnioche, opened 8 years ago, 0 comments)
#16 custom class MultiMemberGzipReader and other tweaks (nlevitt, opened 8 years ago, 0 comments)
#15 G-Zip Content-Length (lljrsr, opened 8 years ago, 0 comments)
#14 Hack to handle g-zip when parsing WARC files (lljrsr, closed 8 years ago, 25 comments)
#13 imports for python version 2.7.x (trypy, opened 9 years ago, 0 comments)
#12 warcfilter.py needs to import logging (kwantopia, opened 9 years ago, 1 comment)
#11 Cannot use warc library to open WAT files (brenreyes, closed 10 years ago, 2 comments)
#10 warcvalid.py: Report exceptions (pmyteh, closed 10 years ago, 0 comments)
#9 new warcfilter option to filter on the WARC-Date header field (nlevitt, closed 10 years ago, 0 comments)
#8 python3 support (nlevitt, closed 10 years ago, 2 comments)
#7 `pip install` is broken on Linux and OSX (rajbot, opened 10 years ago, 1 comment)
#6 streaming record content when reading, and some other tweaks (nlevitt, closed 10 years ago, 7 comments)
#5 Some changes I needed to make for https://github.com/nlevitt/warcprox (nlevitt, closed 10 years ago, 1 comment)
#4 Streaming interface to warc files. (tef, opened 10 years ago, 3 comments)
#3 ArcParser raises exception instead of returning error info as WarcParser does (kngenie, opened 10 years ago, 1 comment)
#2 Drop hanzo namespace as not to collide with original hanzo-warc-tools. (tef, opened 10 years ago, 1 comment)
#1 warcpayload.py missing from entry_points (atomotic, closed 10 years ago, 0 comments)