internetarchive/Zeno
State-of-the-art web crawler 🔱
GNU Affero General Public License v3.0
83 stars
11 forks
issues
[STALE] Split Zeno in smaller packages with a better structure
#122
equals215
closed
1 week ago
3
Queue all items from seeds list before starting to crawl
#121
CorentinB
closed
3 months ago
0
Calming down queue stats
#120
yzqzss
closed
3 months ago
0
Change default queue behaviour
#119
equals215
closed
3 months ago
0
WAL tests fail
#118
CorentinB
closed
3 months ago
3
Enable `log` package to distribute a stored logger to all other packages
#117
equals215
closed
3 months ago
3
Lock-free AwaitWALCommitted (and smoother queue?)
#116
yzqzss
closed
3 months ago
0
Define Zeno's queuing behavior properly
#115
CorentinB
closed
3 months ago
2
Fix idna.ToASCII fail on punycode encoded URLs with port
#114
CorentinB
closed
3 months ago
0
Transform Zeno architecture to a crawling pipeline effectively making use of Go channels
#113
equals215
opened
3 months ago
0
Break down Zeno in smaller packages, especially `crawl` package which has grown too big
#112
equals215
opened
3 months ago
0
Reuse free space from popped items
#111
equals215
opened
3 months ago
3
fix: hanging on indexManager.Close()
#110
yzqzss
closed
3 months ago
2
Improve WAL concurrency performance by @yzqzss and make it optional
#109
equals215
closed
3 months ago
13
SIGSEGV logging in BatchEnqueue
#108
CorentinB
closed
3 months ago
3
Using group commit to improve WAL concurrency performance
#107
yzqzss
closed
3 months ago
1
Queue handover v2
#106
equals215
closed
3 months ago
0
Remove `runtime.Gosched()` in polling
#105
yzqzss
closed
3 months ago
1
Optimize `get list` loading performance
#104
yzqzss
closed
4 months ago
0
--exclude-host not found
#103
CorentinB
closed
4 months ago
0
Fix readItemsFromQueue() CPU 100%, fix various data races, fix hang when all items are deduplicated
#102
yzqzss
closed
4 months ago
5
Have `queue.Enqueue()` handover items to idle workers and optimize workers routines
#101
equals215
closed
4 months ago
5
Implement host rotation and Enqueue/Dequeue access regulation via atomic booleans
#99
equals215
closed
4 months ago
2
Add dequeue enqueue stats
#98
CorentinB
opened
4 months ago
2
create url_string_test.go
#97
willmhowes
opened
4 months ago
2
Restore HQ flags
#96
CorentinB
closed
4 months ago
0
Implement linkheader parsing
#95
HarshNarayanJha
closed
4 months ago
5
Queue and Index should reuse free space
#94
equals215
opened
4 months ago
0
Persist & load queue stats
#93
CorentinB
closed
4 months ago
0
Add logging capabilities for queue (index too) using custom `log` package
#92
equals215
closed
3 months ago
1
Fix commit hash in User Agent can't be calculated when not present
#91
CorentinB
closed
4 months ago
0
Instantiate a `CODE_OF_CONDUCT.md` as the repo drags some traction
#90
equals215
opened
4 months ago
0
Panic when starting Zeno with go run
#89
CorentinB
closed
4 months ago
0
Change log location
#88
nick2432
closed
3 months ago
7
Extract URLs from ebook formats (EPUB, MOBI..)
#87
CorentinB
closed
1 week ago
0
Extract URLs from images
#86
CorentinB
opened
4 months ago
2
Replace github.com/tomnomnom/linkheader with stdlib
#85
CorentinB
closed
4 months ago
10
Replace github.com/clbanning/mxj/v2 with stdlib
#84
CorentinB
closed
2 weeks ago
4
Revamp index mechanism with a WAL
#83
equals215
closed
4 months ago
4
pprof API expose silently fail when port is used
#82
CorentinB
opened
4 months ago
0
Replicate the active workers count for `--live-stats` on the new version of workers
#81
equals215
closed
3 months ago
1
Fix workers not stopping properly and temp fix for workers hanging unexpectedly
#80
equals215
closed
4 months ago
1
Fix crash when --api is not set
#79
NGTmeaty
closed
4 months ago
1
Rewriting the queue
#78
CorentinB
closed
3 months ago
11
Fix live-stats
#77
CorentinB
closed
4 months ago
0
Add deduping stats
#76
CorentinB
closed
4 months ago
0
Remove Gin dependency
#75
CorentinB
closed
4 months ago
1
Add basic UI to manage Zeno
#74
CorentinB
opened
4 months ago
0
Logs are written to the wrong location
#73
willmhowes
closed
3 months ago
4
live-stats flag is broken
#72
willmhowes
closed
4 months ago
1