ProjectEvergreen / greenwood

Greenwood is your full-stack workbench for the web, focused on supporting modern web standards and development to help you create your next project.
https://www.greenwoodjs.io
MIT License

prerendering breaks optimization configuration with multi-line formatted `<script>` / `<style>` / `<link>` tags #1241

Closed thescientist13 closed 1 week ago

thescientist13 commented 1 month ago

Summary

While working on the website, I ran into an interesting problem: Prettier reformatted a lengthy <script> tag across multiple lines.

<script
  type="module"
  src="../components/latest-post/latest-post.js"
  data-gwd-opt="static"
></script>

As a result, the script, despite being marked as static, was not removed from the final page output, which showed up as a 404 in the browser console.

Details

Digging into it, this comes down to how the optimization logic works in Greenwood (I will be the first to admit it is a bit of a naive implementation 😅 ). When searching for tags that need optimization applied, it basically just does a find / replace against the exact text of the tag, formatting included.

Below is just one example of how <script> tags are managed. https://github.com/ProjectEvergreen/greenwood/blob/master/packages/cli/src/plugins/resource/plugin-standard-html.js#L279

if (type === 'script') {
  if (optimizationAttr === 'static' || optimization === 'static') {
    body = body.replace(`<script ${rawAttributes}>${contents.replace(/\.\//g, '/').replace(/\$/g, '$$$')}</script>`, '');
  } else if (optimizationAttr === 'none') {
    // ...
  }
}

So the issue is that if the formatting changes at all between the originally authored HTML and what eventually reaches Greenwood at this stage, the matching will fail. In this case, with prerender enabled, the HTML comes back from WCC "formatted" onto a single line with no line breaks, so the multi-line search string no longer matches.

<script type="module" src="../components/latest-post/latest-post.js" data-gwd-opt="static"></script>
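To make the mismatch concrete, here is a minimal, dependency-free sketch. It is not Greenwood's actual fix; `escapeRegExp` and `scriptTagPattern` are hypothetical helper names. It shows the exact-string replace failing against the single-line output, and one possible workaround that treats any run of whitespace between attributes as equivalent.

```javascript
// Hypothetical helpers for illustration; not part of Greenwood's API.

// Escape characters that have special meaning inside a RegExp.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build a pattern for a <script> tag whose attributes match `rawAttributes`
// up to whitespace differences (spaces, tabs, line breaks).
function scriptTagPattern(rawAttributes, contents = '') {
  const attrs = rawAttributes
    .trim()
    .split(/\s+/)
    .map(escapeRegExp)
    .join('\\s+');

  return new RegExp(`<script\\s+${attrs}\\s*>${escapeRegExp(contents)}</script>`);
}

// Attributes as authored (multi-line, after Prettier).
const authoredAttributes = `
  type="module"
  src="../components/latest-post/latest-post.js"
  data-gwd-opt="static"
`;

// The same tag as it appears after prerendering (single line).
const prerendered = '<body><script type="module" src="../components/latest-post/latest-post.js" data-gwd-opt="static"></script></body>';

// Exact string matching finds nothing, so the tag is left in place.
const exact = prerendered.replace(`<script ${authoredAttributes}></script>`, '');

// The whitespace-tolerant pattern matches and removes the tag.
const tolerant = prerendered.replace(scriptTagPattern(authoredAttributes), '');

console.log(exact === prerendered); // true
console.log(tolerant); // <body></body>
```

This still depends on attribute order and quoting being preserved, so it only addresses whitespace differences, not reformatting in general.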

While this issue will be scoped to just fixing the immediate optimization handling bug, this is probably a good sign that we should audit the rest of our code and try to adopt a more programmatic approach using our HTML parsing library. I did try that, but couldn't get something like the snippet below to work (likely just a skill issue 🙃 ).

let body = await response.text();

const root = htmlparser.parse(body, {
  script: true,
  style: true
});

const tag = root.querySelectorAll('script').find(script => script.getAttribute('src') === src);
const optimized = '';

root.replace(tag, optimized);

console.log(root.outerHTML); // should now have the <script> tag removed!  

Will open a new issue for this specifically though.
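
For reference, the intent of the snippet above (removing a tag identified by its src attribute rather than by its exact formatting) can be sketched without any dependency. This is only an illustration under stated assumptions: `parseAttributes` and `removeScriptBySrc` are hypothetical names, and a regular expression stands in for a real HTML parser, so it will not cope with edge cases such as a `>` inside an attribute value.

```javascript
// Hypothetical helpers for illustration; a regex stands in for a real HTML parser.

// Parse name="value" pairs (and bare boolean attributes) from a raw attribute string.
function parseAttributes(rawAttributes) {
  const attributes = {};
  const pattern = /([^\s=]+)(?:\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"']+)))?/g;

  for (const [, name, doubleQuoted, singleQuoted, unquoted] of rawAttributes.matchAll(pattern)) {
    attributes[name] = doubleQuoted ?? singleQuoted ?? unquoted ?? '';
  }

  return attributes;
}

// Remove every <script> element whose src attribute equals `src`,
// regardless of how the tag's attributes are laid out.
function removeScriptBySrc(html, src) {
  return html.replace(/<script\b([^>]*)>[\s\S]*?<\/script>/gi, (tag, rawAttributes) => {
    return parseAttributes(rawAttributes).src === src ? '' : tag;
  });
}

const html = [
  '<script',
  '  type="module"',
  '  src="../components/latest-post/latest-post.js"',
  '  data-gwd-opt="static"',
  '></script>',
  '<script type="module" src="/main.js"></script>'
].join('\n');

// Only the /main.js script should remain in the logged output.
console.log(removeScriptBySrc(html, '../components/latest-post/latest-post.js'));
```

Because the lookup keys off the parsed src value, the same call works whether the tag was authored on one line or many.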