Open marcuswhybrow opened 2 months ago
import { parse } from "node-html-parser"; // jsdom ran out of memory for me and was slower.
/**
* Parses data-pagefind-filter attributes to extract all Pagefind filters.
*
* Pagefind's node wrapper lib doesn't say which filters it discovered, forcing
* filter lookup in the client lib, slowing time to first render of filters.
* This function gets around that by parsing the HTML again ourselves, looking
* for the same Pagefind HTML element attributes that Pagefind itself reads, to
* reconstruct the same data that [pagefind.filters()] would return.
*
* This is an upcoming feature of Pagefind, so this approach will soon be
* obsolete. See the reference issues. This implementation is a best-guess effort
* following the Pagefind docs; it may not perfectly match edge cases in filter
* names or values.
*
* # Reference
* - https://pagefind.app/docs/filtering/
* - https://github.com/CloudCannon/pagefind/issues/715
* - https://github.com/CloudCannon/pagefind/issues/371
*
* # Example
* ```js
* import assert from "assert";
* assert.deepEqual(extractPagefindFilters(`
* <span data-pagefind-filter="singleName:inlineContent"></span>
* <span data-pagefind-filter="singleName">valueContent</span>
* <span data-pagefind-filter="name1, name2:inlineContent">valueContent</span>
* <span data-pagefind-filter="name1, name2[data-name], name3:inlineContent" data-name="attrValue">valueContent</span>
* `), {
* singleName: { inlineContent: 1, valueContent: 1 },
* name1: { valueContent: 2 },
* name2: { inlineContent: 1, attrValue: 1 },
* name3: { inlineContent: 1 }
* });
* ```
*
* @param {string} html
* @returns {object} For brevity "object" is substituted for a proper PagefindFilters type.
*/
export function extractPagefindFilters(html) {
  const pagefindFilters = {};
  parse(html).querySelectorAll("[data-pagefind-filter]").forEach(element => {
    let signature = element.getAttribute("data-pagefind-filter");
    if (!signature) return;
    let filters = [];
    let chars = signature.split("");
    let name = "";
    let mod = ""; // the "[attr-name]" or ":inline content" after the filter name, I'm calling a modifier
    chars.forEach(char => {
      switch (char) {
        case ',':
          if (mod[0] === ":") mod += char;
          else {
            filters.push([name, mod]);
            name = ""; mod = "";
          }
          break;
        case '[':
        case ':':
          mod += char;
          break;
        case ']':
        default:
          if (mod) mod += char;
          else name += char;
      }
    });
    if (name || mod) filters.push([name, mod]);
    filters = filters.map(([name, mod]) => {
      name = name.trim();
      mod = mod.trim();
      if (mod[0] === ":") {
        return [name, mod.substring(1).trim()];
      } else if (mod[0] === "[") {
        return [name, element.getAttribute(mod.substring(1, mod.length - 1))?.trim() || ""];
      } else {
        return [name, element.textContent?.trim() || ""];
      }
    });
    filters.forEach(([name, value]) => {
      if (!pagefindFilters.hasOwnProperty(name)) pagefindFilters[name] = {};
      if (!pagefindFilters[name].hasOwnProperty(value))
        pagefindFilters[name][value] = 1;
      else pagefindFilters[name][value]++;
    });
  });
  return pagefindFilters;
}
I'm building a web UI using Pagefind and it's going great. I'm using the node wrapper API's `addHTMLFile` function to populate the index.

I'm currently displaying the entire list of Pagefind filters to the user on page load by awaiting `pagefind.filters()`. This leads to the filters visually popping in once the promise resolves.

What I'd prefer to do is ask the node index to report the aggregated filters at compile time, which I could then bake into my HTML to prevent content popping.
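
Until that exists, here is a rough sketch of the aggregation step I have in mind. It assumes each page has already been reduced to a `{ filterName: { value: count } }` object (the shape `extractPagefindFilters` returns). `mergeFilterCounts` and `toInlineJson` are hypothetical helper names of mine, not part of Pagefind, and I haven't verified that the resulting counts match what `pagefind.filters()` reports in every case.

```javascript
// Hypothetical helper (not a Pagefind API): sums per-page filter counts
// into a single aggregate object of the same shape.
function mergeFilterCounts(perPageFilters) {
  const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);
  const merged = {};
  for (const pageFilters of perPageFilters) {
    for (const [name, values] of Object.entries(pageFilters)) {
      if (!has(merged, name)) merged[name] = {};
      for (const [value, count] of Object.entries(values)) {
        merged[name][value] = (has(merged[name], value) ? merged[name][value] : 0) + count;
      }
    }
  }
  return merged;
}

// Hypothetical helper: serialises the aggregate so it can be embedded in a
// <script type="application/json"> element. Escaping "<" keeps a filter
// value from closing the script tag early.
function toInlineJson(filters) {
  return JSON.stringify(filters).replace(/</g, "\\u003c");
}

// Example input: made-up filter data standing in for two indexed pages.
const pages = [
  { author: { Ada: 1 }, tag: { news: 1, release: 1 } },
  { author: { Ada: 1 }, tag: { news: 1 } },
];
const aggregated = mergeFilterCounts(pages);
const inlineJson = toInlineJson(aggregated);
```

The build step could then write `inlineJson` into the page template, so the filter UI renders from static data and `pagefind.filters()` is only needed to refresh counts after a search.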