Closed clydebarrow closed 1 month ago
👋 hey @clydebarrow
The ranking here does seem worse than I would expect by default. It is indexing the headings, though; they're just getting a bit lost in the soup of results.
This is where tweaking the ranking parameters would help. I can see in the site source you have configuration for this, but it's currently not being applied since it isn't inside a ranking object. Currently you have:
window.addEventListener('DOMContentLoaded', (event) => {
  new PagefindUI({
    element: "#search",
    showSubResults: true,
    pageLength: 0.0,
    termSaturation: 0.8,
    termFrequency: 0.4,
    termSimilarity: 1.0
  });
});
But that should be:
window.addEventListener('DOMContentLoaded', (event) => {
  new PagefindUI({
    element: "#search",
    showSubResults: true,
    ranking: {
      pageLength: 0.0,
      termSaturation: 0.8,
      termFrequency: 0.4,
      termSimilarity: 1.0
    }
  });
});
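One way to catch this kind of mistake early is a small check run on the options before they are handed to PagefindUI. The sketch below is not part of Pagefind: findMisplacedRankingKeys and RANKING_KEYS are hypothetical names, and the key list is only the four ranking parameters that appear in this thread.

```javascript
// Hypothetical helper, not part of Pagefind.
// The list below contains only the ranking parameters mentioned in this thread.
const RANKING_KEYS = ["pageLength", "termSaturation", "termFrequency", "termSimilarity"];

// Returns the ranking parameters found at the top level of the options
// object, i.e. outside the nested `ranking` object where they belong.
function findMisplacedRankingKeys(options) {
  return RANKING_KEYS.filter((key) =>
    Object.prototype.hasOwnProperty.call(options, key)
  );
}

// Example: the original, misconfigured options from the site source.
const misplaced = findMisplacedRankingKeys({
  element: "#search",
  showSubResults: true,
  pageLength: 0.0,
  termSaturation: 0.8,
  termFrequency: 0.4,
  termSimilarity: 1.0
});
if (misplaced.length > 0) {
  console.warn("Move these keys inside `ranking`:", misplaced.join(", "));
}
```

With the corrected options, where the four values sit inside ranking, the helper returns an empty array and nothing is logged.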
Additionally, from a quick test, these parameters seem to do a little better:
pageLength: 0.0,
termSaturation: 1.6, // raised this value to favor the high-density pages
termFrequency: 0.4,
termSimilarity: 6.0 // raised this value to trim out some shorter word stems from muddying results
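For reference, here is a sketch of the full snippet with both changes combined: the values nested inside ranking, and the adjusted termSaturation and termSimilarity. The pagefindOptions variable name and the typeof window guard are additions for illustration only; the snippet assumes the Pagefind UI script has already been loaded on the page.

```javascript
// Options combining the nesting fix with the suggested ranking values.
const pagefindOptions = {
  element: "#search",
  showSubResults: true,
  ranking: {
    pageLength: 0.0,
    termSaturation: 1.6, // raised to favor the high-density pages
    termFrequency: 0.4,
    termSimilarity: 6.0  // raised to keep shorter word stems from muddying results
  }
};

// The guard keeps the snippet inert outside a browser.
// PagefindUI is assumed to be provided by the Pagefind UI script.
if (typeof window !== "undefined") {
  window.addEventListener("DOMContentLoaded", () => {
    new PagefindUI(pagefindOptions);
  });
}
```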
"but it's currently not being applied since it isn't inside a ranking object"
Reminds me again why I dislike JavaScript :-(
So with that fixed and your suggestions applied, it now seems to be working well. Thanks!
I'm trying to implement Pagefind on a site to improve the search, and it does not seem to be indexing the content of H1 elements.
The site is browsable here:
https://esphome-docs.web.app/index.html
The screenshot below shows searching for "Automations and Templates" and it is clearly ignoring the H1 header on the current page.
According to the docs, H1 headers should rank well above body text.
I suspect, though have not confirmed, that other Hx elements are also not being indexed.
I'm using Pagefind 1.1.0.