Open bahrmichael opened 6 months ago
We’ve been struggling to get Code Insights to work at Sentry. the screenshot below is what we see in the UI. From the screenshot it seems like we hit a timeout. Which timeout is it referring to and can that be configured?
I assume that the datapoints come from the backfill process.
Unfortunately this is a bit hard to pin point, since there are multiple timeouts. The one that we're seeing is most likely coming from a downstream service, or an error falsely being interpreted as a timeout. When I finished cloning chromium I should be able to reproduce some timeout errors and find the right places in the code that are causing this.
It’s not entirely clear which log to check. we checked searcher (no warnings/errors during the time period this was being generated) and gitserver (also no seemingly relevant warning/error messages). we did find some relevant errors/warnings in frontend
From what I'm seeing locally, the logs should be in the worker
with the scope worker.insights-job.background.Handle
. Here I can see error messages like "error during backfill execution", "insights backfill progress handler loaded", "backfill set to completed state", "backfill disabled due to exceeding error threshold", and others. You need to set the log level to INFO to see all of them.
A log line like the one below should tell us which downstream this error may be coming from.
[ worker] ERROR .../backfill_state_inprogress_handler.go:206 error during backfill execution ... "handle": {"seriesId": 12, "seriesUniqueId": "2gBFsFt6a1A5DmoRJTbJKLfsCex", "backfillId": 6, "totalDuration": "3m41.207475831s", "repoTotalCount": 17, "errorCount": 8, "percentComplete": 0.8235294117647058, "erroredRepos": 1, "error": "searchHandler: streaming search: errors: [rpc error: code = Unknown desc = hybrid search failed: iterating over changed files in git diff: rpc error: code = Unknown desc = failed to get changed file: encountered unknown file status \"T\" for file \"internal/telemetrygateway/v1/telemetrygateway.proto\"]\nrateLimiter.Wait: context canceled\nrateLimiter.Wait: context canceled\nrateLimiter.Wait: context canceled\nrateLimiter.Wait: context canceled\nrateLimiter.Wait: context canceled\nrateLimiter.Wait: context canceled\nrateLimiter.Wait: context canceled\nrateLimiter.Wait: context canceled\nrateLimiter.Wait: context canceled\nrateLimiter.Wait: context canceled"}}
Based on the line below from your logs I guess that the search limits are stopping us here. They are not configurable in the currently released version.
{"SeverityText":"ERROR","Timestamp":1712170603579550779,"InstrumentationScope":"Resolver.live_preview_resolver","Caller":"query/streaming_query_executor.go:154","Function":"github.com/sourcegraph/sourcegraph/internal/insights/query.(*StreamingRepoQueryExecutor).ExecuteRepoList","Body":"repo search encountered skipped events","Resource":{"service.name":"frontend","service.version":"5.2.7","service.instance.id":"da92ab9c4de7"},"Attributes":{"reasons":"[shard-match-limit: Not all results have been returned due to hitting a match limit. Sourcegraph has limits for the number of results returned from a line, document and repository.]","query":"fork:yes archived:yes repo:.* count:21"}}
I'm curious about the count:21
, and if the query
is the full query we're using in the code insight. I've seen some places in the code where it checks against the given limit, and aborts if there are more entries.
What still sounds odd to me is that shard-match-limit
would be translated to a timeout.
hybrid search failed: iterating over changed files in git diff: rpc error: code = Unknown desc = failed to get changed file: encountered unknown file status \"T\" for file \"internal/telemetrygateway/v1/telemetrygateway.proto\"
This seems like a bug. It looks like we only recognize file statuses A, M, and D, but according to man git-diff-tree
, there are various other valid statuses.
Possible status letters are:
• A: addition of a file
• C: copy of a file into a new one
• D: deletion of a file
• M: modification of the contents or mode of a file
• R: renaming of a file
• T: change in the type of the file (regular file, symbolic link or
submodule)
• U: file is unmerged (you must complete the merge before it can be
committed)
• X: "unknown" change type (most probably a bug, please report it)
FYI @camdencheek I copied the previous text to a local file as I iterate on the problem and get closer to the root cause. Just in case you're wondering where it just went.
Thanks! I went ahead and poked the Source team here about the diff iterator bug
To help with excluding repos that cause problems I added this issue: https://github.com/sourcegraph/sourcegraph/issues/62578
This ticket originated from a Slack message, reported by @mike-r-mclaughlin.
Logs: https://sourcegraph.slack.com/files/U01H4LDHVMF/F0703Q1EU2X/sourcegraph-frontend-0_--tail_100.txt