kubernetes / website

Kubernetes website and documentation repo:
https://kubernetes.io
Creative Commons Attribution 4.0 International

Improve PageFind search results #47137

Open cjyabraham opened 3 months ago

cjyabraham commented 3 months ago

We now have PageFind search serving users who cannot access Google Programmable Search, such as those behind the firewall in China. There have been reports that the search results are not well ranked, so we'd like to improve them.

To assess the search results properly, we should measure their quality against a baseline, such as the results from our Google Programmable Search engine. One option is to start with the 20 most common searches and grade how suitable the results are. Grading across a broad range of search terms ensures we're not optimizing for just one or two use cases.

This work should be done by someone who is familiar with the Kubernetes docs and knows which results best serve a particular query. Once we see where things stand, we can tune PageFind to see whether its score can be raised to an acceptable level.
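
As a rough illustration of the grading idea, here is a minimal sketch that scores ranked results with mean reciprocal rank (MRR), one possible metric among several and not something this issue has settled on. The function names, the queries, the relevance judgments, and both result lists are hypothetical placeholders, not measured data; real judgments would come from someone who knows the docs.

```python
from statistics import mean


def reciprocal_rank(results, relevant):
    """Return 1/rank of the first relevant URL in results, or 0.0 if none appears."""
    for rank, url in enumerate(results, start=1):
        if url in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(results_by_query, relevant_by_query):
    """Average the reciprocal rank over every graded query."""
    return mean(
        reciprocal_rank(results_by_query.get(query, []), relevant)
        for query, relevant in relevant_by_query.items()
    )


# Hypothetical judgments: which pages a reviewer considers the best answer.
relevant_by_query = {
    "pod": {"/docs/concepts/workloads/pods/"},
    "ingress": {"/docs/concepts/services-networking/ingress/"},
}

# Hypothetical ranked results, made up for illustration only.
pagefind_results = {
    "pod": ["/docs/tasks/configure-pod-container/", "/docs/concepts/workloads/pods/"],
    "ingress": ["/docs/concepts/services-networking/ingress/"],
}
baseline_results = {
    "pod": ["/docs/concepts/workloads/pods/"],
    "ingress": ["/docs/concepts/services-networking/ingress/"],
}

pagefind_mrr = mean_reciprocal_rank(pagefind_results, relevant_by_query)
baseline_mrr = mean_reciprocal_rank(baseline_results, relevant_by_query)
print(f"PageFind MRR: {pagefind_mrr:.2f}, baseline MRR: {baseline_mrr:.2f}")
```

Running the same query list through both engines after each tuning change would give one number to compare. A metric with graded relevance could replace MRR if more than the first good hit matters.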

nate-double-u commented 3 months ago

/triage accepted

nate-double-u commented 3 months ago

/area web-development

dipesh-rawat commented 3 months ago

/area web-development

/priority important-soon

(Please feel free to adjust the priority as needed if the SIG consensus leans toward a different priority.)

TPXP commented 3 months ago

Ideally, the search should also understand the same aliases as kubectl (svc -> service, ...). Google handles some of the aliases but not all of them.

By the way, starting with entity types (pod, ingress, service, ...) and making sure the first result is the page presenting that type may be a great start.
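
To make the alias idea concrete, here is a minimal sketch of query expansion using a few of kubectl's resource short names. The `expand_query` helper is hypothetical, the table is deliberately partial, and how the expanded query would be fed into the site search is an open question rather than something PageFind is known to do.

```python
# A partial table of kubectl short names mapped to full resource names.
KUBECTL_SHORT_NAMES = {
    "po": "pods",
    "svc": "services",
    "deploy": "deployments",
    "ing": "ingresses",
    "ns": "namespaces",
    "cm": "configmaps",
    "pvc": "persistentvolumeclaims",
    "sts": "statefulsets",
    "ds": "daemonsets",
}


def expand_query(query):
    """Lowercase the query and append the full name after each known short name."""
    terms = []
    for term in query.lower().split():
        terms.append(term)
        full_name = KUBECTL_SHORT_NAMES.get(term)
        if full_name is not None:
            terms.append(full_name)
    return " ".join(terms)


print(expand_query("svc loadBalancer"))
```

Appending keeps the original term in case a page mentions the alias itself. If the search engine requires every term to match, replacing the alias instead of appending to it may work better.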

sftim commented 1 month ago

> Ideally, the search should also understand the same aliases as kubectl (svc -> service, ...). Google handles some of the aliases but not all of them.
>
> By the way, starting with entity types (pod, ingress, service, ...) and making sure the first result is the page presenting that type may be a great start.

I think that's a great idea but could be its own feature request @TPXP

sftim commented 1 month ago

We're not staffing this, so:

/remove-priority important-soon

/priority important-longterm

People accessing our docs from behind state censorship may have other search options available, other than our built-in search.