brimdata / zed

A novel data lake based on super-structured data
https://zed.brimdata.io/
BSD 3-Clause "New" or "Revised" License

Buffer zngio.scanner.workerCh for more concurrency #5103

Closed: nwt closed this 3 months ago

nwt commented 3 months ago

Tracing with package runtime/trace shows that the scanner.start and worker.run goroutines frequently block on the workerCh channel. Add buffering to workerCh to reduce blocking and increase concurrency.

When scanning a 4 GB ZNG file containing Zeek logs, this change yields a 1.1x speedup on my 10-core machine.
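The general idea can be illustrated with a minimal, self-contained Go sketch. This is not the zngio scanner code: the `batch` type, the `scan` function, and the buffer sizes are hypothetical, and only the channel name `workerCh` is borrowed from the description above. With a buffer size of 0, every send blocks until a worker is ready to receive; with a positive buffer, the producer can run ahead of the workers by up to that many items.

```go
package main

import (
	"fmt"
	"sync"
)

// batch is a hypothetical stand-in for a unit of scanned work.
type batch struct{ vals []int }

// scan feeds batches to a pool of workers over workerCh and returns the
// sum of all values. bufSize sets the channel capacity: 0 means
// unbuffered (each send waits for a receiver), and a positive value
// lets the producer queue up to bufSize batches without blocking.
func scan(batches []batch, workers, bufSize int) int {
	workerCh := make(chan batch, bufSize)
	// results has room for one value per worker, so workers never
	// block when reporting their partial sums.
	results := make(chan int, workers)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sum := 0
			for b := range workerCh {
				for _, v := range b.vals {
					sum += v
				}
			}
			results <- sum
		}()
	}

	// Producer: with bufSize > 0 these sends block less often.
	for _, b := range batches {
		workerCh <- b
	}
	close(workerCh)

	wg.Wait()
	close(results)

	total := 0
	for s := range results {
		total += s
	}
	return total
}

func main() {
	batches := []batch{{[]int{1, 2, 3}}, {[]int{4, 5}}, {[]int{6}}}
	// Both calls compute the same total; only the blocking behavior differs.
	fmt.Println(scan(batches, 2, 0), scan(batches, 2, 4))
}
```

Buffering changes when goroutines block, not what they compute, so any speedup depends on the workload and should be confirmed by tracing or benchmarking, as was done for this change.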