algolia / docsearch-scraper

DocSearch - Scraper
https://docsearch.algolia.com/

Add content snippet for lvl-n heading match #548

Open mojavelinux opened 3 years ago

mojavelinux commented 3 years ago

When a query matches a heading, the snippet result is empty. For example, search for "queryHook" in DocSearch's own documentation (https://docsearch.algolia.com/search?q=queryHook), then compare that to the search for "queries algolia" (https://docsearch.algolia.com/search?q=queries+algolia).

The lack of any sort of content preview in the suggestion from the first example makes it seem less relevant than a suggestion that includes a snippet result. I think the user (myself included) expects to see some sort of context.

Perhaps the scraper can take the first several words below a heading and store those in the content. Another solution would be to allow the config to match a description for the page, which could then be used in the attributesToSnippet as a fallback. (I realize I could also populate the description using a meta tag.) However, the description for the page might come across as too generic for a heading further down. A snippet taken from the content below the heading would make much more sense.
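
To make the first suggestion concrete, here is a minimal sketch of pulling the first few words below each heading. It is not the scraper's actual code: it uses only Python's standard `html.parser`, and the names `HeadingSnippetParser`, `heading_snippets`, and `max_words` are hypothetical, chosen for illustration.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingSnippetParser(HTMLParser):
    """Collects each heading with the first words of text that follow it.

    Hypothetical helper for illustration; not part of docsearch-scraper.
    """

    def __init__(self, max_words=12):
        super().__init__()
        self.max_words = max_words
        self.records = []          # [{"heading": str, "content": str}, ...]
        self._in_heading = False
        self._heading_parts = []   # text fragments of the current heading
        self._words = []           # words seen since the current heading

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            # A new heading closes the section of the previous one.
            self._flush()
            self._in_heading = True
            self._heading_parts = []
            self._words = []

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS:
            self._in_heading = False

    def handle_data(self, data):
        if self._in_heading:
            self._heading_parts.append(data)
        elif self._heading_parts and len(self._words) < self.max_words:
            # Text before the first heading is ignored.
            self._words.extend(data.split())

    def _flush(self):
        if self._heading_parts:
            heading = " ".join("".join(self._heading_parts).split())
            content = " ".join(self._words[: self.max_words])
            self.records.append({"heading": heading, "content": content})

    def close(self):
        super().close()
        self._flush()              # emit the record for the last heading
        self._heading_parts = []   # guard against a second close()


def heading_snippets(html, max_words=12):
    """Return one record per heading, with a short snippet of following text."""
    parser = HeadingSnippetParser(max_words)
    parser.feed(html)
    parser.close()
    return parser.records
```

A record produced this way could carry the snippet in its content field, so a heading match would still have some context to display. Whether that fits the way DocSearch records are structured is a separate question.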

shortcuts commented 3 years ago

Thanks for the feedback @mojavelinux.

Is your request only for the search page or the DocSearch modal v2/v3?

We try to categorize the results in the modal so the user can instantly tell whether a result matches a heading or its content. This has been improved in the current DocSearch v3 version, and I think adding any further content could lead to confusion when trying to find "your match".

This is not something we plan to add, since it would involve refactoring the way we scrape and display DocSearch records, but if this is something doable for you, feel free to open a PR.

In case you didn't know, the min_indexed_level option could help in some cases (especially for DocSearch v2).
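
For readers unfamiliar with min_indexed_level, the sketch below models its effect as I understand the documented behavior: with a value of N, only records that have lvl0 through lvlN set are indexed (the default is 0). The function `is_indexed` and the sample records are hypothetical and only illustrate the filter; they are not the scraper's implementation.

```python
def is_indexed(record, min_indexed_level=0):
    """Return True if lvl0..lvl{min_indexed_level} are all set on the record.

    Hypothetical model of the option's effect, not docsearch-scraper code.
    """
    return all(record.get(f"lvl{i}") for i in range(min_indexed_level + 1))


# Made-up records: a page-level heading, a sub-heading, and a paragraph.
records = [
    {"lvl0": "API", "lvl1": None, "content": None},
    {"lvl0": "API", "lvl1": "queryHook", "content": None},
    {"lvl0": "API", "lvl1": "queryHook", "content": "Transforms the query."},
]

# With min_indexed_level = 1, the record that only has lvl0 is dropped.
kept = [r for r in records if is_indexed(r, min_indexed_level=1)]
```

In the scraper config this would correspond to setting `"min_indexed_level": 1`. Note that it removes shallow heading-only records rather than adding a snippet to them.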