sourcegraph / sourcegraph-public-snapshot

Code AI platform with Code Search & Cody
https://sourcegraph.com

search: slow time to first result for diff #18142

Open keegancsmith opened 3 years ago

keegancsmith commented 3 years ago

With the limit of 500, a type:diff search takes about 5 seconds to return its first result (long) and about 11 seconds to finish. With count:30, the time to first result is very fast. Is this a bug?

Query

Originally posted by @rvantonder in https://github.com/sourcegraph/sourcegraph/issues/18060#issuecomment-776180992

keegancsmith commented 3 years ago

Turns out the issue was that our gitserver was streaming the output of git log, but not telling the HTTP clients that it would stream.
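For readers unfamiliar with this class of bug, here is a minimal Go sketch of a handler that streams to HTTP clients by flushing the headers and every chunk as it is produced. It is not the gitserver code: `flushWriter`, `streamHandler`, and `demo` are hypothetical names, and the slow producer stands in for git log. The producer refuses to emit its second line until the client has read the first, so the demo only completes if the response really streams.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// flushWriter (hypothetical helper) flushes after every write so each
// chunk is sent to the client instead of sitting in a server-side buffer.
type flushWriter struct {
	w io.Writer
	f http.Flusher
}

func (fw flushWriter) Write(p []byte) (int, error) {
	n, err := fw.w.Write(p)
	fw.f.Flush()
	return n, err
}

// streamHandler copies whatever src produces to the response, sending the
// headers immediately and flushing after every chunk.
func streamHandler(src func() io.Reader) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		f, ok := w.(http.Flusher)
		if !ok {
			http.Error(w, "streaming unsupported", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "application/octet-stream")
		w.WriteHeader(http.StatusOK)
		f.Flush() // send headers before any output exists
		io.Copy(flushWriter{w, f}, src())
	}
}

// demo starts a test server whose producer waits for the client to read
// the first line before writing the second.
func demo() (first, second string, err error) {
	gotFirst := make(chan struct{})
	src := func() io.Reader {
		pr, pw := io.Pipe()
		go func() {
			fmt.Fprintln(pw, "first")
			<-gotFirst // block until the client has seen "first"
			fmt.Fprintln(pw, "second")
			pw.Close()
		}()
		return pr
	}
	srv := httptest.NewServer(streamHandler(src))
	defer srv.Close()

	resp, err := http.Get(srv.URL)
	if err != nil {
		return "", "", err
	}
	defer resp.Body.Close()

	br := bufio.NewReader(resp.Body)
	if first, err = br.ReadString('\n'); err != nil {
		return first, "", err
	}
	close(gotFirst)
	second, err = br.ReadString('\n')
	return first, second, err
}

func main() {
	first, second, err := demo()
	fmt.Printf("first=%q second=%q err=%v\n", first, second, err)
}
```

If the handler wrote to `w` without flushing, the small first chunk would typically stay in the server's write buffer and the client would block waiting for it.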

keegancsmith commented 3 years ago

This is still happening on Sourcegraph.com. Re-opening.

keegancsmith commented 3 years ago

This will require more investigation. I'm not sure why gitserver is not streaming here. When I ran the command manually, git was clearly streaming the output in my terminal:

git log --no-prefix --max-count=501 --unified=0 --extended-regexp --regexp-ignore-case -z --format='%H %S' --no-color --no-patch --no-merges -GSearch --regexp-ignore-case --pickaxe-regex --extended-regexp --

cc @stefanhengl

keegancsmith commented 3 years ago

The root cause of this in streaming is the higher default limit of 500 (vs 30). With the lower limit it seems git or our interaction with git is much faster.

This requires a bit more work. I think I remember reading some investigation by @camdencheek or @stefanhengl that we can probably simplify commit search so it doesn't pipe the output of one git command into another. Alternatively we can do more debugging to understand if there are stalls in the pipeline or git when there is a higher limit.

For now I have removed myself as the assignee and added it to search-core's backlog.