sourcegraph / zoekt

Fast trigram based code search
Apache License 2.0

shards: only trigger rescan on .zoekt files changing #801

Closed keegancsmith closed 3 months ago

keegancsmith commented 3 months ago

Any write to the index dir triggered a scan. On busy instances this means we are constantly rescanning, which leaves the watch code over-represented in CPU profiles. The events are normally writes to our temporary files. By only considering events for .zoekt files (which is what scan reads), we avoid the constant scan calls.
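A minimal sketch of the filtering idea, not the code in this PR: the `event` type, the `needsRescan` helper, and the file names are hypothetical stand-ins, and the check here uses the file extension to decide whether a batch of events should trigger a scan.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// event is a hypothetical stand-in for a file-system notification;
// only the path of the file that changed matters for this sketch.
type event struct {
	Name string
}

// needsRescan reports whether any event in the batch refers to a
// file whose extension is .zoekt. Writes to other files, such as
// temporary files, are ignored.
func needsRescan(events []event) bool {
	for _, ev := range events {
		if filepath.Ext(ev.Name) == ".zoekt" {
			return true
		}
	}
	return false
}

func main() {
	// Hypothetical file names, for illustration only.
	tmpOnly := []event{{Name: "/index/repo.zoekt.tmp"}, {Name: "/index/scratch"}}
	withShard := []event{{Name: "/index/repo.zoekt.tmp"}, {Name: "/index/repo.zoekt"}}

	fmt.Println(needsRescan(tmpOnly))   // false
	fmt.Println(needsRescan(withShard)) // true
}
```

Filtering on the name keeps the decision cheap: no file is opened or stat'ed just to decide that an event can be dropped.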

As a safeguard, we also introduce a re-scan every minute in case we miss an event. There is error handling around this, but I thought it more reliable to call scan every once in a while.

Note: this doesn't represent significant CPU use, but it does muddy the CPU profiler output, so this change makes it easier to understand trends in our continuous CPU profiling.

Test Plan: CI