The Search Admin Experience PRFAQ discusses a feature we've provisionally named "search analytics" (the name isn't final), which admins can use to see:
what kinds of searches their users are running
who the heaviest users of search are
how long search requests take, and whether they complete successfully or fail
whether there is a pattern to the kinds of searches that fail
This idea came from our experience debugging search-related issues with our customers. Some users would complain that search "feels slow" or sometimes fails for the thing they're searching for - but it was hard to work out why.
We capture performance data in our Grafana dashboards (see the "search requests" section), but that data is aggregated across all searches. Unless there is a broad performance problem with all of search, an individual performance issue might not show up at all, or might be impossible to drill into further.
On sourcegraph.com, Honeycomb is the service we use to do that drilling. With it we can answer questions like: "Which user is running the most expensive search on sourcegraph.com? What other searches has this user run? Are they legitimate, or are they bot-like (and should be banned)?"
Honeycomb is a paid service, so we can't expect our customers to use it. The idea behind "search analytics" is to ship a lightweight version of Honeycomb that could help us debug these issues better.
Honeycomb is a complex tool with neat charting features, but we'd target something simple for the MVP. We'd want to show the 20 or so longest-running searches over the last 24 hours in a view similar to the "raw data" view in Honeycomb: a filterable table that's visible in the site admin area. Each row would have fields like:
The query itself
The "actor": userID of the person that made the request
The timestamp of the request
The amount of time that the search request took
Whether or not the search was successful
The number of matches that the search found, etc.
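To make the MVP concrete, here is a minimal sketch of what one row and the "20 longest-running searches over the last 24 hours" selection could look like. Everything here is an assumption for illustration: the `SearchRecord` type, its field names, the millisecond unit, and the `longest_running` helper are hypothetical and not part of any existing Sourcegraph API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class SearchRecord:
    """One row of the proposed table. Field names are illustrative only."""
    query: str            # the query itself
    actor_user_id: int    # the "actor": userID of the requester
    timestamp: datetime   # when the request was made
    duration_ms: int      # how long the request took (unit is an assumption)
    succeeded: bool       # whether the search completed successfully
    match_count: int      # number of matches the search found


def longest_running(records, now, window=timedelta(hours=24), limit=20):
    """Return up to `limit` records from the last `window`, slowest first."""
    cutoff = now - window
    recent = [r for r in records if cutoff <= r.timestamp <= now]
    return sorted(recent, key=lambda r: r.duration_ms, reverse=True)[:limit]
```

Keeping the record flat, one field per table column, means the same rows could back both the table and any later filtering.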
Some natural extensions to the UI could include:
having more than 20 elements
allowing people to filter on just the searches for a particular user
allowing people to choose a larger or smaller time window
letting people change the sort order
incorporating tracing somehow with each search request
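Most of the extensions above amount to turning the MVP's fixed choices into parameters. A rough sketch, assuming a hypothetical `ViewOptions` type and `select_rows` helper (neither exists today, and the record is trimmed to the fields needed here); tracing is left out because it is not yet clear what shape it would take.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class SearchRecord:
    """Trimmed-down, hypothetical row type for this sketch."""
    query: str
    actor_user_id: int
    timestamp: datetime
    duration_ms: int


@dataclass(frozen=True)
class ViewOptions:
    """Hypothetical knobs, one per extension listed above."""
    limit: int = 20                          # more (or fewer) than 20 rows
    user_id: Optional[int] = None            # only searches by this user
    window: timedelta = timedelta(hours=24)  # larger or smaller time window
    sort_by: str = "duration_ms"             # SearchRecord field to sort on
    descending: bool = True                  # sort direction


def select_rows(records, now, opts=ViewOptions()):
    """Filter by time window and user, then sort and truncate."""
    cutoff = now - opts.window
    rows = [
        r for r in records
        if cutoff <= r.timestamp <= now
        and (opts.user_id is None or r.actor_user_id == opts.user_id)
    ]
    rows.sort(key=lambda r: getattr(r, opts.sort_by), reverse=opts.descending)
    return rows[:opts.limit]
```

With the defaults, `select_rows` reproduces the MVP view, so the extensions would not change the initial behavior.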
Part of https://github.com/sourcegraph/pr-faqs/issues/17