CloudCannon / pagefind

Static low-bandwidth search at scale
https://pagefind.app
MIT License

Performance using the Node API #461

Closed: oscarotero closed this issue 9 months ago

oscarotero commented 10 months ago

Hi. I noticed a huge performance difference between the CLI and the Node.js API.

Lume uses the JS API to index the pages before saving them to disk, as you can see here.

A site with fewer than 600 HTML files takes about 4 minutes to index on my computer. The first pages are indexed quickly, but each additional page takes a bit longer than the previous one.

With the CLI, the same site is indexed in just 2-3 seconds.

Is there any way to improve this? Perhaps a function to send all pages to be indexed at the same time? For example:

await index.addHTMLFiles([
  {
    url: "/page1/",
    content: "content1",
  },
  {
    url: "/page2/",
    content: "content2",
  },
  // etc
]);
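
Until something like this exists in the library, the proposed call shape can be approximated in user code. The following is a minimal sketch, assuming `index` exposes the Node API's per-page `addHTMLFile({ url, content })` method. The names `addHTMLFiles` and `createFakeIndex` are hypothetical and not part of Pagefind, and the fake index is only a stand-in so the snippet runs without Pagefind installed. A wrapper like this only tidies the call site; it still makes one call per page, so it would not by itself remove the per-page slowdown described above.

```javascript
// Hypothetical user-land helper: Pagefind itself does not ship addHTMLFiles.
// `index` is assumed to expose addHTMLFile({ url, content }).
async function addHTMLFiles(index, pages) {
  const results = [];
  for (const page of pages) {
    // Awaited one at a time so pages are added in a predictable order.
    const result = await index.addHTMLFile({
      url: page.url,
      content: page.content,
    });
    results.push(result);
  }
  return results;
}

// Stand-in index (an assumption for illustration only). It records the URLs
// it receives and returns a simplified result object.
function createFakeIndex() {
  const added = [];
  return {
    added,
    async addHTMLFile(file) {
      added.push(file.url);
      return { errors: [], file: { url: file.url } };
    },
  };
}
```

With a real index, the call would then look like `await addHTMLFiles(index, pages)`, where `pages` is an array of `{ url, content }` objects as in the example above.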

bglw commented 10 months ago

Will look at this this week. (Some discussion in the Lume Discord.)

bglw commented 10 months ago

Found the first bug causing the biggest issue. If you update to v1.0.4-rc0 performance should be way way way better. Will look at the other tasks later, but they should be (less) critical now.

bglw commented 9 months ago

Released as stable in v1.0.4 🎉