CloudCannon / pagefind

Static low-bandwidth search at scale
https://pagefind.app
MIT License

Proper navigation to urls that are external #468

Closed: Enigmatrix closed this issue 10 months ago

Enigmatrix commented 10 months ago

I'm building a search whose results link to external sites rather than to my own HTML content. I can index those pages with the Pagefind NodeJS API like this:

// External page to index; its HTML is fetched at build time
const extUrl = "https://example.com/exams";
await index.addHTMLFile({
  url: extUrl,
  content: await fetch(extUrl).then((res) => res.text()),
});

However, when I use the default-ui search to find known keywords from this URL and navigate to the result, the target URL gets clobbered into https://mysite/https:/example.com/exams. The problem doesn't seem to be on the Pagefind Rust binary side but on the default-ui side (verified using my debugger). It uses fullUri to get the URL:

https://github.com/CloudCannon/pagefind/blob/9740deb41d1ba997b21bc65449c79dcced974427/pagefind_web_js/lib/coupled_search.ts#L248-L279

and this clobbers it. An easy fix would be to detect whether the URL already starts with http or https and, if so, leave it alone.
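To make the suggestion concrete, here is a minimal sketch of such a guard. The names resolveResultUrl and joinWithBase and the basePath parameter are hypothetical and are not taken from the Pagefind source; the join logic is only an assumption, chosen because collapsing repeated slashes happens to reproduce the https:/ shape reported above.

```javascript
// Hypothetical join: prefix a base path and collapse repeated slashes.
// This is an assumption about the kind of logic involved, not Pagefind's code.
function joinWithBase(rawUrl, basePath = "/") {
  return `/${basePath}/${rawUrl}`.replace(/\/+/g, "/");
}

// Suggested guard: leave absolute http(s) URLs untouched,
// and only join site-relative URLs with the base path.
function resolveResultUrl(rawUrl, basePath = "/") {
  if (/^https?:\/\//i.test(rawUrl)) {
    return rawUrl;
  }
  return joinWithBase(rawUrl, basePath);
}

// Without the guard the external URL is mangled; with it, it survives.
const mangled = joinWithBase("https://example.com/exams");
const preserved = resolveResultUrl("https://example.com/exams");
const local = resolveResultUrl("docs/page/", "/site/");
```

With these definitions, the unguarded join turns the external URL into a site-relative path, which the browser then resolves against the current origin, while the guarded version returns it unchanged.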

bglw commented 10 months ago

Ah, good call. Will fix.

bglw commented 10 months ago

Pending the next release, I think you would be able to use the Default UI's processResult hook to unclobber it, if that helps!
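A rough sketch of what that workaround might look like follows. The helper unclobberUrl is hypothetical, and the sketch assumes the clobbered value reaches the hook as a path containing a single-slash https:/ segment, as in the report above. It also assumes each result carries a url field and, optionally, a sub_results array whose entries have their own url; adjust it if your results are shaped differently.

```javascript
// Hypothetical helper: recover an external URL that was folded into a
// site-relative path, e.g. "/https:/example.com/exams".
function unclobberUrl(url) {
  // Look for an http(s) scheme at the start of a path segment.
  const match = url.match(/(?:^|\/)(https?:)\/+(.+)$/);
  return match ? `${match[1]}//${match[2]}` : url;
}

// Intended to be passed as the Default UI's processResult option.
function processResult(result) {
  result.url = unclobberUrl(result.url);
  // Assumption: sub-results, when present, carry their own url.
  if (Array.isArray(result.sub_results)) {
    for (const sub of result.sub_results) {
      sub.url = unclobberUrl(sub.url);
    }
  }
  return result;
}

// Wiring, in the browser where PagefindUI is loaded:
// new PagefindUI({ element: "#search", processResult });
```

A local URL that legitimately contains http:/ or https:/ at the start of a path segment would also be rewritten by this pattern, so it is only suitable as a stopgap until the fix is released.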

Enigmatrix commented 10 months ago

OK, thanks for the workaround 👍, I will await the fix as well.