let4be/crusty
Broad Web Crawler
GNU General Public License v3.0
83 stars, 3 forks
issues
Error in redis dockerfile
#39
ErickJ3
closed
10 months ago
3
Attaching a database
#38
rtrevinnoc
closed
2 years ago
2
Concurrency auto-tuning
#37
let4be
opened
2 years ago
0
Review how we access DNS resolved addresses
#36
let4be
closed
2 years ago
1
Consider migrating to actix
#35
let4be
opened
3 years ago
0
Improve robots.txt support
#33
let4be
closed
3 years ago
1
lazy.sh: allow branch option
#32
let4be
closed
3 years ago
0
Lolhtml check if we can remove elements and if this saves some cpu cycles
#31
let4be
closed
2 years ago
1
Domain discovery: replace ttl cache with lru
#30
let4be
closed
3 years ago
0
Investigate possible deadlock/hanging of clickhouse writer
#29
let4be
closed
3 years ago
1
Glitchy buffers panel in grafana dashboard
#28
let4be
closed
3 years ago
0
Review config defaults
#27
let4be
closed
3 years ago
0
Unify clickhouse writing operations
#26
let4be
closed
3 years ago
0
Simple Job metrics
#25
let4be
closed
3 years ago
0
Add some other important buffer len graphs to dashboard
#24
let4be
closed
3 years ago
0
Investigate how we track errors and what is considered an error in grafana dashboard
#21
let4be
closed
3 years ago
2
Improve domain heuristics at domain_filter_map
#20
let4be
closed
3 years ago
0
Implement a first approximation of PageRank for Domains
#19
let4be
closed
3 years ago
2
Rework metrics for internal channels
#18
let4be
closed
3 years ago
0
Make crawling rules configurable via config
#17
let4be
closed
3 years ago
0
Investigate why crawling fails to start from some websites
#16
let4be
closed
3 years ago
0
curl script for super-fast start
#15
let4be
closed
3 years ago
0
Make DNS resolver configurable
#14
let4be
closed
3 years ago
0
Evaluate docker overlay network performance influence on high volume setups
#13
let4be
closed
3 years ago
1
Consider adding bind9 to docker-compose
#12
let4be
closed
3 years ago
1
Ensure proper stopping sequence on SIGTERM
#11
let4be
closed
3 years ago
0
Think about the best way to support ipv6
#10
let4be
closed
2 years ago
0
Check channel buffer sizes
#8
let4be
closed
3 years ago
0
Concurrent writing to clickhouse
#7
let4be
closed
3 years ago
0
Implement faster HTML parsing
#6
let4be
closed
3 years ago
1
Queue sharding support
#4
let4be
closed
3 years ago
1
Provide example configurations for various setups
#3
let4be
closed
2 years ago
2
Migrate job management system to Redis
#2
let4be
closed
3 years ago
0
Add WARC support
#1
let4be
opened
3 years ago
0