CorentinB/warc
Read and write WARC files in Go
Creative Commons Zero v1.0 Universal
16 stars
4 forks
Basic DNS fallback
#55
yzqzss
opened
2 weeks ago
2
Correctly closing records when errors occur before sending to channel
#54
NGTmeaty
closed
1 month ago
0
Update default size threshold to 2048 bytes
#53
NGTmeaty
closed
1 month ago
0
Fix DNS query string
#52
NGTmeaty
closed
1 month ago
0
Stop writing revisit records for 3I42H
#51
NGTmeaty
closed
1 month ago
0
Add WARC-Cipher-Suite and WARC-Protocol
#50
NGTmeaty
opened
1 month ago
0
Add custom DNS resolver
#49
CorentinB
closed
1 month ago
0
Add AnyIP capabilities to randomly bind to any IPv6 in the assigned prefix
#48
equals215
closed
1 month ago
0
Add DisableIPv4 & DisableIPv6
#47
CorentinB
closed
1 month ago
0
fix: actually verify payload digest when needed.
#46
NGTmeaty
closed
1 month ago
0
Remove logrus usage
#45
CorentinB
closed
1 month ago
0
Add WARC-Cipher-Suite and WARC-Protocol
#44
NGTmeaty
closed
1 month ago
0
Panic on too many open files error, should retry instead
#43
willmhowes
opened
2 months ago
2
Use a more up to date UUID library
#42
CorentinB
closed
2 months ago
0
ZSTD Dictionary support
#41
NGTmeaty
closed
1 month ago
0
Add `WARC-Cipher-Suite` and `WARC-Protocol` WARC headers
#40
NGTmeaty
opened
3 months ago
0
Various test improvements
#39
NGTmeaty
closed
3 months ago
0
Capture DNS requests
#38
CorentinB
closed
1 month ago
0
Support reading more compression formats
#37
yzqzss
closed
3 months ago
0
Add dedupe stats
#36
CorentinB
closed
3 months ago
0
Add command line utilities
#35
CorentinB
closed
7 months ago
0
Add cookie support to CDX request
#34
NGTmeaty
closed
9 months ago
0
Add DNS archiving
#33
CorentinB
closed
1 month ago
0
DRAFT: Testing improvements
#32
NGTmeaty
closed
3 months ago
0
Improved WARC erroring
#31
NGTmeaty
closed
1 year ago
0
Fix revisit records
#30
CorentinB
closed
1 year ago
0
Don't write "ERROR" for payload digest in case of error
#29
NGTmeaty
closed
1 year ago
0
Add counter for the total amount of data crawled since startup
#28
CorentinB
closed
2 years ago
0
Fix proxy implementation
#27
CorentinB
closed
2 years ago
1
Update uTLS to the latest version with certificate compression
#26
NGTmeaty
closed
2 years ago
0
Changing our TLS fingerprint
#25
NGTmeaty
closed
2 years ago
0
WARC files with more than one in the pool aren't sequentially numbered overall
#24
NGTmeaty
closed
1 year ago
1
Add parallel gzip for payloads over 1MB & parallel WARC writing
#23
CorentinB
closed
2 years ago
0
Add benchmark and calculate Block-Digest on dialer.go
#22
NGTmeaty
closed
2 years ago
1
Move to klauspost/compress/gzip
#21
CorentinB
closed
2 years ago
0
Add a setting to only write to disk
#20
NGTmeaty
closed
2 years ago
0
Add: struct to configure the HTTP client
#19
CorentinB
closed
2 years ago
0
fix: remove nil assignments
#18
fionera
closed
2 years ago
0
feat: Use a pool for spooledTempFile buffers
#17
fionera
closed
2 years ago
0
Add TLS specific things & tempDir
#16
CorentinB
closed
2 years ago
0
Using spooledTempFile to dynamically move payloads to disk when they become too big
#15
NGTmeaty
closed
2 years ago
0
Add toggle for verifying HTTP certificates.
#14
NGTmeaty
closed
2 years ago
0
Add error channel
#13
NGTmeaty
closed
2 years ago
0
Remove unused for creating temporary folders
#12
NGTmeaty
closed
2 years ago
0
Ignore CDX errors
#11
NGTmeaty
closed
2 years ago
0
Some minor testing improvements
#10
NGTmeaty
closed
2 years ago
0
Add: goroutine leak detector
#9
CorentinB
closed
2 years ago
0
Memory management fixes
#8
CorentinB
closed
2 years ago
0
Fix memory leak
#7
CorentinB
closed
2 years ago
0
DRAFT: Write WARC responses to temporary files to (hopefully) avoid OOM issues.
#6
NGTmeaty
closed
2 years ago
0