CloudCannon / pagefind

Static low-bandwidth search at scale
https://pagefind.app
MIT License

πŸ‘†πŸ»πŸ‘†πŸ»πŸ‘†πŸ» Multiple Indexes #576

Closed edhenderson closed 3 months ago

edhenderson commented 3 months ago

Say I have a site with an English, French and German section (built using a static site generator). They exist at /en/, /fr/ and /de/ respectively.

Can I run Pagefind three times and create an index in each of those? That way, when the search form is used on the site, it will only search and return results within that locale.

Hope this makes sense and thank you in advance.

bglw commented 3 months ago

There would definitely be a way to run the commands for that, but it shouldn't be necessary!

Pagefind automatically segments itself by language, using the lang attribute on your html element to do so, which effectively creates three internal indexes in your case.

When searching with the Default UI, it will only search within the index for the active page. If the Default UI is loaded on a page with lang="en" / lang="en-us" etc, then it will only search in the en index. The language can also be explicitly passed to the Pagefind instance in the browser to determine what is searched.

So the behavior you're describing should be Pagefind's out-of-the-box behavior. Let me know if you're not seeing that! 🙂
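To make the language segmentation above concrete, here is a minimal sketch of the idea: pages are grouped by the primary subtag of their lang attribute, so lang="en" and lang="en-us" land in the same group. This is an illustration only, not Pagefind's source code; the function names, the "unknown" fallback label and the sample URLs are assumptions made for the example.

```typescript
// Illustrative only: approximates grouping pages by the primary language
// subtag of <html lang="...">. This is NOT Pagefind's implementation.
function primaryLanguage(
  langAttr: string | null | undefined,
  fallback = "unknown", // hypothetical label for pages without a lang attribute
): string {
  const trimmed = (langAttr ?? "").trim().toLowerCase();
  if (trimmed === "") return fallback;
  // "en-us" and "en_gb" both reduce to "en".
  const [primary] = trimmed.split(/[-_]/);
  return primary ?? fallback;
}

// Group hypothetical page URLs by the language bucket they would fall into.
function groupByLanguage(
  pages: { url: string; lang: string | null }[],
): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const page of pages) {
    const key = primaryLanguage(page.lang);
    const urls = groups.get(key) ?? [];
    urls.push(page.url);
    groups.set(key, urls);
  }
  return groups;
}
```

With the three sections from the question, pages under /en/, /fr/ and /de/ would end up in three separate groups, which is the "three internal indexes" effect described above.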

bglw commented 3 months ago

You can see a good example of this in situ in Astro's Starlight theme, which uses Pagefind's Default UI.

edhenderson commented 3 months ago

This is absolutely fantastic news mate, thank you so much. 🍻

edhenderson commented 3 months ago

@bglw Sorry to reopen this. Would this work if I had something like this:

And I need an index for each node. So searching in topic2 / fr would only get me the results from that content combination?

bglw commented 3 months ago

Not by default, no.

You could run the indexer within each topic folder to get separate indexes, or you could use Pagefind's filters to filter down the topic at search time from one big index. Both paths seem good!
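As a rough sketch of the filter route, assuming each page is tagged at build time with a filter named `topic` (a hypothetical name, for example via `data-pagefind-filter="topic:topic2"`), the search call can pass that filter alongside the query. The `PagefindLike` interface below is hand-written for illustration and is not an official typing; only the `search(term, { filters })` call shape comes from Pagefind's JS API.

```typescript
// Minimal hand-written shape of the parts of Pagefind's browser API used here.
// These types are an assumption for illustration, not official typings.
interface SearchOptions {
  filters?: Record<string, string | string[]>;
}
interface SearchResult {
  id: string;
  data: () => Promise<{ url: string }>;
}
interface PagefindLike {
  search(term: string, options?: SearchOptions): Promise<{ results: SearchResult[] }>;
}

// Build the options that restrict a search to one topic.
// "topic" is a hypothetical filter name chosen for this example.
function topicFilter(topic: string): SearchOptions {
  return { filters: { topic } };
}

// Search one shared index, restricted to a single topic, and
// return the URLs of the matching pages.
async function searchWithinTopic(
  pagefind: PagefindLike,
  term: string,
  topic: string,
): Promise<string[]> {
  const { results } = await pagefind.search(term, topicFilter(topic));
  const pages = await Promise.all(results.map((result) => result.data()));
  return pages.map((page) => page.url);
}
```

In the browser, the `pagefind` argument would typically come from `await import("/pagefind/pagefind.js")`. Language segmentation still applies on top of the filter, so a search from a French page in topic2 should only return that combination.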

edhenderson commented 3 months ago

Because it's a static build, running multiple indexes means the effort happens at build time, rather than the front end filtering one big index, if I understand correctly?

Thanks

bglw commented 3 months ago

Pros and cons each way. Multiple indexes will produce multiple JS/WASM files, which means you wouldn't benefit from them being in the cache when you change topic.

Pagefind's filtering is pretty efficient, so that would be my leaning unless your site was particularly massive. I would choose to filter for the MDN demo, for example.

edhenderson commented 3 months ago

These are great points, thanks. Typically a user won't change topics, the content will be quite direct, so Customer A might be sent to /topic1/de/ and Customer B, /topic2/en/, they won't have the ability to change topic.

The referrer (so to speak) controls where they go, as it's a tool for another website to use and send someone to ours. Think of it as an iframe on another site, I suppose.

bglw commented 3 months ago

Gotcha, yes that sounds like a good fit for separate indexes!

Running Pagefind once in each directory will be fine, and then you'd just load the assets from /topic1/pagefind/... and /topic2/pagefind/..., and everything else will work as expected.

You might need to set a baseUrl in the UI config to prepend topic2/ to the result URLs, but that's all.
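A minimal sketch of that setup, assuming every page lives under /<topic>/<lang>/... and each topic has its own bundle at /<topic>/pagefind/. The helper name, the `#search` selector and the interface are hypothetical; `element`, `bundlePath` and `baseUrl` are Default UI options, but treat the exact values as something to verify against your own build.

```typescript
// The Default UI options that matter for per-topic indexes.
// Hand-written for illustration; not official typings.
interface TopicUiOptions {
  element: string;
  bundlePath: string;
  baseUrl: string;
}

// Derive the UI options from the current page path, assuming the first
// path segment is the topic (e.g. "/topic2/fr/guide/" -> "topic2").
function uiOptionsForPath(pathname: string, element = "#search"): TopicUiOptions {
  const [topic] = pathname.split("/").filter((segment) => segment.length > 0);
  if (topic === undefined) {
    throw new Error(`No topic segment found in path: ${pathname}`);
  }
  return {
    element,
    // Load the index that was built inside this topic's directory.
    bundlePath: `/${topic}/pagefind/`,
    // Put the topic back in front of result URLs, which the indexer
    // saw relative to the topic directory.
    baseUrl: `/${topic}/`,
  };
}
```

On a page that loads that topic's `pagefind-ui.js`, this could be used as `new PagefindUI(uiOptionsForPath(window.location.pathname))`.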

edhenderson commented 3 months ago

So run it once for topic1, once for topic2 after the build and let the language filtering you mentioned above work the magic?

Thanks again.

bglw commented 3 months ago

Correct! Best of luck!
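For the build step, one way to sketch "run it once per topic" is a small helper that produces one Pagefind invocation per topic directory. The `public` output directory and the topic names are hypothetical, and `--site` is the flag name in Pagefind 1.x as far as I know (older releases used `--source`), so check it against the version you have installed.

```typescript
// Build one Pagefind CLI invocation per topic directory.
// The output directory and topic names are placeholders for this example.
function pagefindCommands(outputDir: string, topics: string[]): string[][] {
  return topics.map((topic) => [
    "npx",
    "pagefind",
    "--site",
    `${outputDir}/${topic}`,
  ]);
}

// Example: two topics built into a hypothetical "public" directory.
const commands = pagefindCommands("public", ["topic1", "topic2"]);
```

Each command array could then be run after the static site build, for instance from an npm script or Node's `child_process`. By default each run writes its bundle into a `pagefind` folder inside the directory it indexed, which lines up with the /topic1/pagefind/ and /topic2/pagefind/ paths mentioned earlier.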

edhenderson commented 3 months ago

Genuinely appreciate the directions here. I will endeavour to pass this goodwill on elsewhere. Cheers.