cake-build / website

:earth_americas: The Cake website: https://cakebuild.net
MIT License

Add API to search index #1100

Open pascalberger opened 4 years ago

pascalberger commented 4 years ago

The API part of the site is currently not indexed.

I currently don't see a way to index them properly given how the API pages are structured (multiple H1 headings, and titles that contain the type as well as the name). When migrating to Statiq we should make sure that the API docs can be indexed.

daveaglick commented 2 years ago

I'm hesitant to omit the <wbr> and other elements that help display headings, etc. with good wrapping upstream. And given the size of the Cake API, a client-side search may not be performant enough (though I'd be curious to find out - the client-side search in Statiq is totally rewritten and uses gzip and other strategies to make it small and fast, even for big indexes).

Ignoring the client-side option for the moment, I do have some other thoughts:

As the porting starts (like, now! yay!), I'll keep an eye on this one in relation to the Algolia search.

pascalberger commented 2 years ago

@daveaglick I think your suggestions make it more complicated than it has to be.

The <wbr> is not the issue here, as it is already handled by the crawler. We are currently using Algolia DocSearch, which works by Algolia hosting the index and running a crawler.

The only thing we need to do is to configure how our page is structured through CSS selectors (see https://github.com/algolia/docsearch-configs/blob/master/configs/cakebuild.json). As soon as we can define selectors for the different levels of a document it will work.
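As a rough illustration of what such a selector configuration looks like, the sketch below builds a DocSearch-style config as a Python dict and prints it as JSON. The `index_name`, `start_urls`, and `selectors` keys follow the legacy DocSearch config layout. Only the `lvl1` selector (`.content-header h1`) is taken from this discussion; the index name, the start URL, and the `lvl0`, `lvl2`, and `text` selectors are assumptions for illustration and are not the actual contents of cakebuild.json.

```python
import json


def build_config() -> dict:
    """Return a DocSearch-style config; values marked 'assumed' are illustrative."""
    return {
        "index_name": "cakebuild",                     # assumed, from the config file name
        "start_urls": ["https://cakebuild.net/api/"],  # assumed API root
        "selectors": {
            "lvl0": ".content-header .namespace",      # hypothetical selector
            "lvl1": ".content-header h1",              # selector named in this thread
            "lvl2": ".content h2",                     # hypothetical selector
            "text": ".content p",                      # hypothetical selector
        },
    }


if __name__ == "__main__":
    # Print the config in the JSON form the crawler would read.
    print(json.dumps(build_config(), indent=2))
```

Each `lvlN` entry maps one level of the document hierarchy to a CSS selector, so the open question for API pages is mainly which element should feed which level.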

The question is what we want to index for an API page. What currently already should be possible is something like this:

| Level | Selector | Example result |
|-------|----------|----------------|
| lvl1 | .content-header h1 | MyClass class |

daveaglick commented 2 years ago

Ah, I think I've got you. It's more that the headings aren't semantic (multiple H1, etc.)? Yeah, that should already be fixed, I think. And if not, then it's worth doing upstream, since the API pages should be good semantic HTML anyway.

Update: I took a look at the pages in https://www.statiq.dev/api and we might have a little work to do here. Easiest might be to apply marker CSS classes that the selector could use to identify particular bits for indexing.
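To make the marker-class idea concrete, here is a minimal sketch, assuming hypothetical class names `search-lvl0` and `search-lvl1` (neither Statiq nor Cake defines these). It uses Python's standard `html.parser` as a local stand-in for the crawler, to show that a marker class can isolate the type name from the rest of the heading and that `<wbr>` elements inside the marked element do not get in the way.

```python
from html.parser import HTMLParser

# Hypothetical marker classes mapped to index levels.
MARKERS = {"search-lvl0": "lvl0", "search-lvl1": "lvl1"}

# Void elements have no closing tag, so they must not affect nesting depth.
VOID = {"wbr", "br", "img", "hr", "input", "meta", "link"}


class MarkerExtractor(HTMLParser):
    """Collect the text inside elements that carry a marker class."""

    def __init__(self):
        super().__init__()
        self.records = {}   # level -> list of extracted strings
        self._level = None  # level currently being captured, if any
        self._depth = 0     # nesting depth inside the captured element
        self._buf = []      # text fragments of the captured element

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        if self._level is not None:
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        for cls in classes:
            if cls in MARKERS:
                self._level, self._depth, self._buf = MARKERS[cls], 1, []
                break

    def handle_endtag(self, tag):
        if self._level is None or tag in VOID:
            return
        self._depth -= 1
        if self._depth == 0:
            text = "".join(self._buf).strip()
            self.records.setdefault(self._level, []).append(text)
            self._level = None

    def handle_data(self, data):
        if self._level is not None:
            self._buf.append(data)


def extract(html):
    """Return a dict of level -> texts found under marker classes."""
    parser = MarkerExtractor()
    parser.feed(html)
    parser.close()
    return parser.records


if __name__ == "__main__":
    page = (
        '<div class="content-header">'
        '<span class="search-lvl0">Cake.<wbr>Core</span>'
        '<h1><span class="search-lvl1">MyClass</span> class</h1>'
        "</div>"
    )
    print(extract(page))
```

With markers like these, the crawler selectors could target the marker classes directly instead of depending on the heading structure of the API pages.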