CloudCannon / pagefind

Static low-bandwidth search at scale
https://pagefind.app
MIT License
3.22k stars, 97 forks

Allow parsing for otherwise ignored elements #542

Open JulianCataldo opened 5 months ago

JulianCataldo commented 5 months ago

Hello!

Pagefind works great, but I've hit a bug when using Declarative Shadow DOM (DSD). The content is overlooked by the CLI builder, and no index is built.

I think it's because the underlying DOM parser is skipping <template> elements altogether.

However, shadowrootmode="open" should make this content discoverable, as it is in Chrome, Firefox, and Safari.


Not working:

            <page-content data-pagefind-body>
                <template shadowrootmode="open"> Hello </template>
            </page-content>

Working as expected:

            <page-content data-pagefind-body>
                Hello
            </page-content>
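
A possible stopgap is to unwrap declarative shadow roots before the HTML reaches the indexer, so the text sits directly inside the host element like in the working example. The sketch below is only an illustration: `unwrapDeclarativeShadowRoots` is a hypothetical helper, not part of Pagefind, and the regex assumes simple, non-nested templates.

```typescript
// Hypothetical helper, not part of Pagefind: replaces each declarative shadow
// root <template> with its inner markup so the content is visible to an indexer
// that skips <template> elements.
function unwrapDeclarativeShadowRoots(html: string): string {
  // Matches <template ... shadowrootmode="open|closed" ...> ... </template>.
  // A regex is only adequate for simple, non-nested templates; a real HTML
  // parser would be more robust.
  const pattern =
    /<template\b[^>]*\bshadowrootmode\s*=\s*["']?(?:open|closed)["']?[^>]*>([\s\S]*?)<\/template>/gi;
  return html.replace(pattern, (_match: string, inner: string) => inner);
}

const input = `<page-content data-pagefind-body>
  <template shadowrootmode="open"> Hello </template>
</page-content>`;

// The <template> wrapper is gone; "Hello" is now a direct child of <page-content>.
const output = unwrapDeclarativeShadowRoots(input);
console.log(output);

// Plain <template> elements (no shadowrootmode) are left untouched.
const untouched = unwrapDeclarativeShadowRoots("<template><p>x</p></template>");
console.log(untouched);
```

The trade-off is that the indexed HTML no longer matches what is deployed, so this only makes sense as a preprocessing step on a copy of the markup.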

bglw commented 5 months ago

Good note. Thankfully, this lives in Pagefind's codebase rather than the underlying parser, as we have <template> in the default list of ignored selectors. A lot of template

What's currently missing is a slightly richer Pagefind configuration that allows explicitly including elements that would otherwise be ignored. That is something that can be added, I would say both as a configuration file option and as a data-pagefind-include attribute that can be set per element.
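
To make the proposal concrete, here is a minimal sketch of how an include override could interact with a default ignore list. It is not Pagefind's implementation: `ElementInfo`, `IgnoreConfig`, `includedTags`, and `shouldIndex` are hypothetical names, and only the `data-pagefind-include` attribute comes from the comment above.

```typescript
// All names here are hypothetical; this is not Pagefind's real code or config schema.
interface ElementInfo {
  tag: string;
  attributes: Record<string, string>;
}

interface IgnoreConfig {
  ignoredTags: string[];  // default ignore list, assumed to contain "template"
  includedTags: string[]; // hypothetical config-file option that re-includes tags
}

// Decides whether an element's content should be indexed.
function shouldIndex(el: ElementInfo, config: IgnoreConfig): boolean {
  const tag = el.tag.toLowerCase();
  // Per-element opt-in via the proposed data-pagefind-include attribute.
  if ("data-pagefind-include" in el.attributes) return true;
  // Config-level opt-in for every element with this tag name.
  if (config.includedTags.includes(tag)) return true;
  // Otherwise fall back to the default ignore list.
  return !config.ignoredTags.includes(tag);
}

const defaults: IgnoreConfig = { ignoredTags: ["template"], includedTags: [] };

// Ignored by default.
const plainTemplate = shouldIndex(
  { tag: "template", attributes: { shadowrootmode: "open" } },
  defaults,
);

// Re-included for a single element through the attribute.
const markedTemplate = shouldIndex(
  { tag: "template", attributes: { shadowrootmode: "open", "data-pagefind-include": "" } },
  defaults,
);

// Re-included for all templates through configuration.
const configuredTemplate = shouldIndex(
  { tag: "template", attributes: {} },
  { ...defaults, includedTags: ["template"] },
);

console.log(plainTemplate, markedTemplate, configuredTemplate);
```

In this sketch, both opt-ins take precedence over the ignore list, which is one reasonable reading of "explicitly including elements that would otherwise be ignored".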

JulianCataldo commented 5 months ago

Excellent! Thanks

I've switched to the programmatic Node API, which is better for my use case (a dev server), even though it requires more setup. writeFiles is straightforward, but I'm still trying to figure out getFiles: I have trouble serving the files via the HTTP server (MIME type issues…). Maybe I could file an issue for that, to get more examples in the docs?
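
For context, a rough sketch of serving in-memory files with explicit content types. The assumptions are: each entry from getFiles has a relative `path` and a byte `content`, files are mounted under a `/pagefind/` prefix, and anything that is not JS, CSS, or JSON can be sent as `application/octet-stream`. The helper names and the fragment file name are made up for illustration.

```typescript
// Assumed shape of an entry returned by getFiles(): relative path plus raw bytes.
interface OutputFile {
  path: string;
  content: Uint8Array;
}

interface LookupResult {
  status: number;
  contentType: string;
  body: Uint8Array | null;
}

// Assumed mapping; unknown extensions fall back to a generic binary type.
const MIME_TYPES: Record<string, string> = {
  ".js": "text/javascript",
  ".css": "text/css",
  ".json": "application/json",
};

function contentTypeFor(path: string): string {
  const dot = path.lastIndexOf(".");
  const ext = dot >= 0 ? path.slice(dot).toLowerCase() : "";
  return MIME_TYPES[ext] ?? "application/octet-stream";
}

// Indexes the files by the URL path they should be served from.
function buildFileMap(files: OutputFile[], prefix: string): Map<string, Uint8Array> {
  const map = new Map<string, Uint8Array>();
  for (const file of files) {
    map.set(prefix + file.path.replace(/^\/+/, ""), file.content);
  }
  return map;
}

// Resolves a request URL (query string ignored) to a status, type, and body.
function resolveRequest(map: Map<string, Uint8Array>, requestUrl: string): LookupResult {
  const queryStart = requestUrl.indexOf("?");
  const pathname = queryStart >= 0 ? requestUrl.slice(0, queryStart) : requestUrl;
  const body = map.get(pathname);
  if (body === undefined) {
    return { status: 404, contentType: "text/plain", body: null };
  }
  return { status: 200, contentType: contentTypeFor(pathname), body };
}

// Stand-in data; in practice this would be the list returned by getFiles().
const files: OutputFile[] = [
  { path: "pagefind.js", content: new Uint8Array([1, 2, 3]) },
  { path: "fragment/example.pf_fragment", content: new Uint8Array([4, 5]) },
];

const fileMap = buildFileMap(files, "/pagefind/");
const jsResult = resolveRequest(fileMap, "/pagefind/pagefind.js?ts=1");
const fragmentResult = resolveRequest(fileMap, "/pagefind/fragment/example.pf_fragment");
const missingResult = resolveRequest(fileMap, "/pagefind/nope.js");
console.log(jsResult.contentType, fragmentResult.contentType, missingResult.status);
```

Inside a Node HTTP handler, the result would be written with the status, a `Content-Type` header set to `contentType`, and the bytes as the response body.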

bglw commented 5 months ago

Yes, open an issue on that and we can dive deeper there 🙂🙏