unjs / nitro

Next Generation Server Toolkit. Create web servers with everything you need and deploy them wherever you prefer.
https://nitro.unjs.io
MIT License
5.84k stars, 490 forks

Allow prerender crawler to find more types of links #1067

Open d0rich opened 1 year ago

d0rich commented 1 year ago

Describe the feature

Problem

Currently, the prerender crawler only follows HTML and JSON links: https://github.com/unjs/nitro/blob/fd15c212b197bb035d675ce05c9722e1099bcf33/src/prerender.ts#L18

const allowedExtensions = new Set(["", ".json"])

But sometimes it is also necessary to prerender binary files with other extensions (e.g. pdf, png).

My case

My case requires prerendering PDF files, and it is quite hard to create the routes manually. Moreover in Nuxt 3 I also can't add prerender routes generated from Nuxt Content.

Possible solution

One solution is to add new extensions to the allowed set, or even to drop the filtering altogether: https://github.com/unjs/nitro/blob/fd15c212b197bb035d675ce05c9722e1099bcf33/src/prerender.ts#L258

if (crawlLinks) {
  _links.push(
    ...[...html.matchAll(LINK_REGEX)]
      .map((m) => m[1])
      .filter((link) => allowedExtensions.has(getExtension(link)))
  );
}
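
A minimal sketch of the first option (extending the allowed set), assuming a simplified getExtension stand-in rather than Nitro's actual helper. The sample HTML and the extractLinks wrapper are hypothetical and only illustrate how the filter would behave for site-relative links:

```typescript
// Hypothetical stand-in for Nitro's internal getExtension helper:
// returns the trailing ".ext" of the path part of a link, or "" if none.
// Only meant for site-relative links such as "/files/cv.pdf".
function getExtension(link: string): string {
  const pathname = link.split(/[?#]/)[0];
  const match = pathname.match(/\.[\da-z]+$/i);
  return match ? match[0].toLowerCase() : "";
}

// Same regex as quoted in this issue.
const LINK_REGEX = /href=["']?([^"'>]+)/g;

// Assumed extended allow-list; ".pdf" and ".png" are the additions.
const allowedExtensions = new Set(["", ".json", ".pdf", ".png"]);

// Hypothetical wrapper around the crawl step shown above.
function extractLinks(html: string): string[] {
  return Array.from(html.matchAll(LINK_REGEX))
    .map((m) => m[1])
    .filter((link) => allowedExtensions.has(getExtension(link)));
}

const sampleHtml =
  '<a href="/about">About</a><a href="/cv.pdf">CV</a><a href="/app.zip">Zip</a>';
const links = extractLinks(sampleHtml);
console.log(links); // "/about" and "/cv.pdf" pass the filter; "/app.zip" does not
```

Making the set configurable instead of hard-coded would be another way to get the same effect without changing the default behavior.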

Extend functionality

Currently, all links are crawled with a regex that detects href="somelink. It would also be useful to detect image sources (src), since images can be generated too. This would require changing the regex: https://github.com/unjs/nitro/blob/fd15c212b197bb035d675ce05c9722e1099bcf33/src/prerender.ts#L242

const LINK_REGEX = /href=["']?([^"'>]+)/g;
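
One possible shape for such a regex, as a sketch only: LINK_OR_SRC_REGEX and extractUrls are hypothetical names, and the pattern simply adds src as an alternative to href.

```typescript
// Assumed variant of LINK_REGEX that also captures src attributes.
const LINK_OR_SRC_REGEX = /(?:href|src)=["']?([^"'>]+)/g;

// Hypothetical helper: collect every href/src value found in the HTML.
function extractUrls(html: string): string[] {
  return Array.from(html.matchAll(LINK_OR_SRC_REGEX), (m) => m[1]);
}

const sample = '<a href="/docs">Docs</a><img src="/og/cover.png">';
const urls = extractUrls(sample);
console.log(urls); // contains "/docs" and "/og/cover.png"
```

Note that this pattern would also pick up src on script and iframe tags, so the extension filter would still have to decide which of the captured URLs are prerendered.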

Additional information

semiaddict commented 1 year ago

@d0rich,

Moreover in Nuxt 3 I also can't add prerender routes generated from Nuxt Content.

I believe you actually can, by using serverQueryContent (undocumented) in a Nitro plugin. I do this by fetching content in a Nitro render:response hook and then adding the extra routes to the x-nitro-prerender header (which also seems to be undocumented).

Let me know if you need more details.
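
A rough sketch of the header step described in this comment. It assumes the crawler reads x-nitro-prerender as a comma-separated list of routes; RenderResponseLike and addPrerenderRoutes are hypothetical names, not part of Nitro's API:

```typescript
// Minimal structural type for the response object passed to the hook;
// an assumption for this sketch, not Nitro's exported type.
interface RenderResponseLike {
  headers?: Record<string, string>;
}

// Assumption: x-nitro-prerender holds a comma-separated list of routes.
// Appends the given routes to any value already present in the header.
function addPrerenderRoutes(response: RenderResponseLike, routes: string[]): void {
  if (routes.length === 0) return;
  const headers = (response.headers ??= {});
  const existing = headers["x-nitro-prerender"];
  const all = existing ? [existing, ...routes] : routes;
  headers["x-nitro-prerender"] = all.join(", ");
}

// Example with hypothetical routes, e.g. ones derived from content queries.
const response: RenderResponseLike = { headers: { "content-type": "text/html" } };
addPrerenderRoutes(response, ["/files/cv.pdf", "/files/report.pdf"]);
console.log(response.headers);
```

In a Nitro plugin, a function like this would presumably be called from the render:response hook, with the routes coming from serverQueryContent; that wiring is left out here because both are described above as undocumented.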