helgeho/Web2Warc
An easy-to-use and highly customizable crawler that enables you to create your own little Web archives (WARC/CDX)
MIT License
24 stars
4 forks
issues
#6: sbt assembly fails. machawk1, opened 5 years ago, 4 comments
#5: followRedirects = true does not work on seeds. dportabella, closed 5 years ago, 3 comments
#4: OutOfMemoryError. dportabella, closed 5 years ago, 3 comments
#3: Allow Web2Warc to crawl only one document. thorkill, closed 7 years ago, 0 comments
#2: publish it to maven central or some other repository. dportabella, closed 7 years ago, 12 comments
#1: Make the WARC-Record-ID conform to the specification. thorkill, closed 8 years ago, 0 comments