go-graphite / go-carbon

Golang implementation of Graphite/Carbon server with classic architecture: Agent -> Cache -> Persister
MIT License

Speeding up file walk #589

Closed deniszh closed 2 weeks ago

deniszh commented 3 weeks ago

This is similar to my old PR https://github.com/go-graphite/go-carbon/pull/329/, but it uses github.com/charlievieth/fastwalk, which is still being updated, instead of cwalk, which is 4 years old.

Why is it needed? On really big and powerful servers with many metrics, the file walk is slow. I tried WalkDir, which is faster nowadays, but fastwalk is what really gives a performance gain.

For example, for a little over 55M metrics, file_scan_runtime was 28302 seconds; after this change it is 2069 seconds (roughly 13.7x faster).

The file walk uses a sane number of workers: a minimum of 4, otherwise equal to the number of CPUs, but no more than 32.
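The worker rule described above amounts to clamping the CPU count into the range 4 to 32. A minimal sketch, assuming nothing beyond that description; `workerCount` is a hypothetical name, not a function from fastwalk or go-carbon:

```go
package main

import (
	"fmt"
	"runtime"
)

// workerCount is a hypothetical helper that clamps numCPU into [4, 32],
// mirroring the rule described in the comment above.
func workerCount(numCPU int) int {
	const minWorkers, maxWorkers = 4, 32
	if numCPU < minWorkers {
		return minWorkers
	}
	if numCPU > maxWorkers {
		return maxWorkers
	}
	return numCPU
}

func main() {
	fmt.Println("workers on this machine:", workerCount(runtime.NumCPU()))
}
```

The lower bound keeps some parallelism on small machines, where a walk is mostly waiting on I/O, and the upper bound avoids flooding the filesystem on very large ones.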

deniszh commented 2 weeks ago

OK, I'll probably close this for now. Reason: the concurrent file walk creates too much contention on the trie index, which was not designed for parallel inserts. I tried to isolate it by guarding the whole index with a mutex, but the contention still looks too high. So I'll put this aside and maybe return to it soon, once the indices are optimized.