Early data indicates that the strategy of limiting the number of ngrams we look up isn't working. I am still experimenting with different limits, but in the meantime it is easy to implement a strategy that picks a random subset, so that the first N ngrams of a query aren't the only ones consulted.
Test Plan: ran all tests with the environment variable set to 2. I expected tests that assert on stats to fail, but everything else to pass. This was the case.
SRC_EXPERIMENT_ITERATE_NGRAM_LOOKUP_LIMIT=2 go test ./...
Part of https://linear.app/sourcegraph/issue/CODY-3029/investigate-performance-of-guardrails-attribution-endpoint