Preprocessing of frontmatter

furudean commented 2 years ago

It would be cool to run some processing on frontmatter before it's delivered through layout props and potentially elsewhere.

For example, I define my page descriptions in the frontmatter and I want to smartypants these descriptions before they're used in <meta> tags and whatnot.

---
description: It's my description--I hope you enjoy the contents.
---

Should ultimately be transformed to

It’s my description—I hope you enjoy the contents.

If you pull a smartypants implementation into your code, that's extra code that needs to be shipped to the browser when it could've been processed away at compile time.

There are a few other use cases for this. @pngwn noted that some people want to put actual markdown inside the frontmatter too, which would require similar considerations.

My proposed solution

Add a new mdsvex option that allows you to run custom transforms on the frontmatter before it's delivered elsewhere.

mdsvex({
    preprocessFrontMatter: ({key, value}) => { ... }
})

Open questions

You might want different behavior for different kinds of files, so knowing which file is being preprocessed might be useful. How should this be passed?
You also might want to reuse the parser you have defined, with all of its plugins etc., and run it over the value. This does not currently fit into the proposed solution.
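One possible shape for the proposal, including the first open question, is sketched below. Everything here is hypothetical: `applyFrontmatterTransform` is not an mdsvex function, passing `filename` alongside `key` and `value` is just one way the file could be exposed, and `smarten` is a tiny stand-in for a real smartypants implementation that only handles apostrophes and double hyphens.

```javascript
// Hypothetical helper, not part of mdsvex: runs a user-supplied transform
// over every top-level frontmatter key and passes the filename along.
function applyFrontmatterTransform(frontmatter, transform, filename) {
  const result = {};
  for (const [key, value] of Object.entries(frontmatter)) {
    const next = transform({ key, value, filename });
    // A transform that returns undefined leaves the value untouched.
    result[key] = next === undefined ? value : next;
  }
  return result;
}

// Stand-in for smartypants: only converts "--" and apostrophes inside words.
function smarten(text) {
  return text.replace(/--/g, '\u2014').replace(/(\w)'(\w)/g, '$1\u2019$2');
}

const transformed = applyFrontmatterTransform(
  { title: 'Hello', description: "It's my description--I hope you enjoy the contents." },
  ({ key, value }) => (key === 'description' ? smarten(value) : value),
  'src/posts/hello.svx' // hypothetical path
);
```

Because the transform runs at build time, none of the smartypants code would need to reach the browser.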

pngwn commented 2 years ago

Part of me feels like 'layouts' or a similar kind of abstraction offers a nice mechanism for something like this.

Say you have a blog with recipes and posts.

In posts you have an iso date:

date: 2022-05-27

which should become: 27th May 2022

but in recipes you have:

ingredients:
 - 10g cheese
 - 200g flour
 - ...

which you want to parse + transform into a more complex data structure.

The layouts themselves could provide yaml preprocessors. But this flies in the face of leaving layouts for sveltekit or related build tools and integrating directly with them.

I feel like there is something here. Maybe we could add a more generic type abstraction that mdsvex understands by default, to apply different kinds of config to different 'types' of documents. type could be a standard key in the frontmatter in the same way that layout is now. We could then use this for different purposes.

We could even go so far as allowing different 'types' to have totally different configs with different plugins etc:

mdsvex({...config})

or:

mdsvex([
  { 
    type: 'blog',
    ...config 
  },
  { 
    type: 'recipe',
    ...config
  }
])

hmmm
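To make the 'type' idea concrete, here is a rough sketch of per-type frontmatter transforms for the blog and recipe examples above. None of this is an existing mdsvex feature: `typeTransforms`, `transformByType`, `formatIsoDate`, and `parseIngredient` are hypothetical names, and the sketch assumes the date arrives as a plain string (some YAML parsers return a Date object instead).

```javascript
const MONTHS = ['January', 'February', 'March', 'April', 'May', 'June', 'July',
  'August', 'September', 'October', 'November', 'December'];

// 1 -> "1st", 2 -> "2nd", 11 -> "11th", 27 -> "27th"
function ordinal(n) {
  const rem100 = n % 100;
  if (rem100 >= 11 && rem100 <= 13) return `${n}th`;
  return `${n}${['th', 'st', 'nd', 'rd'][n % 10] ?? 'th'}`;
}

// Assumes a "YYYY-MM-DD" string; split manually to avoid timezone shifts.
function formatIsoDate(iso) {
  const [year, month, day] = iso.split('-').map(Number);
  return `${ordinal(day)} ${MONTHS[month - 1]} ${year}`;
}

// "10g cheese" -> { amount: 10, unit: 'g', name: 'cheese' }
function parseIngredient(text) {
  const match = /^(\d+(?:\.\d+)?)\s*([a-zA-Z]+)\s+(.+)$/.exec(text);
  if (!match) return { name: text };
  return { amount: Number(match[1]), unit: match[2], name: match[3] };
}

// Hypothetical registry keyed by the frontmatter's `type` field.
const typeTransforms = {
  blog: (fm) => ({ ...fm, date: formatIsoDate(fm.date) }),
  recipe: (fm) => ({ ...fm, ingredients: fm.ingredients.map(parseIngredient) }),
};

function transformByType(fm) {
  const transform = typeTransforms[fm.type];
  return transform ? transform(fm) : fm;
}

const post = transformByType({ type: 'blog', date: '2022-05-27' });
const recipe = transformByType({ type: 'recipe', ingredients: ['10g cheese', '200g flour'] });
```

Documents without a recognised `type` pass through unchanged, so existing content would keep working.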

Chaostheorie commented 2 years ago

For preprocessing, have you tried implementing a custom parser function and then adding preprocessing as needed?

For example, to support TOML as a frontmatter language (and pass the raw frontmatter to files as a prop):

import { defineMDSveXConfig as defineConfig } from 'mdsvex';
import { parse as load } from '@iarna/toml';
...

const config = defineConfig({
    ...
    // frontmatter parsing options: https://mdsvex.pngwn.io/docs#frontmatter
    frontmatter: {
        marker: '+',
        type: 'toml',
        parse(frontmatter, messages) {
            // frontmatter is in string format at this point. You're free to preprocess it as you see fit and/or add special configs with, e.g., tags.
            try {
                let fm = load(frontmatter);
                return { fm: frontmatter, ...fm };
            } catch (e) {
                messages.push(e.message);
            }
        }
    },
    ...
});

export default config;
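The same `parse(frontmatter, messages)` hook can host a post-processing step on the parsed data. The sketch below is one way to factor that out; `withPostprocess` is a hypothetical helper (not part of mdsvex), and `JSON.parse` stands in for a real YAML or TOML parser only so the example runs without dependencies.

```javascript
// Hypothetical helper, not part of mdsvex: wraps any string -> object parser
// so that a post-processing step runs on the parsed frontmatter.
function withPostprocess(parseFn, postprocess) {
  return function parse(frontmatter, messages) {
    try {
      return postprocess(parseFn(frontmatter));
    } catch (e) {
      // Mirror the example above: report the failure instead of throwing.
      messages.push(e.message);
    }
  };
}

// JSON.parse is a stand-in parser; swap in a YAML or TOML parser in practice.
const parse = withPostprocess(JSON.parse, (fm) => ({
  ...fm,
  tags: (fm.tags ?? []).map((tag) => tag.toLowerCase()),
}));

const messages = [];
const ok = parse('{"title": "Hello", "tags": ["Svelte", "MDsveX"]}', messages);
const failed = parse('{not valid', messages);
```

The resulting `parse` function would then be passed as `frontmatter: { parse }` in the mdsvex options, in the same position as in the TOML example.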

pngwn commented 2 years ago

You can also write a custom remark plugin; the frontmatter data exists on the vFile and can be modified by plugins. It would still be nice to have a clean API for this in v1 (there will be no remark then).

nosovk commented 2 years ago

I would like to import an image relative to the article, using imagetools for preprocessing.

It could be achieved by adding a script to each blog post manually, but that's hard for content editors:

<script context="module">
  import img from "./img.png?format=webp;jpg;png;avif&srcset";
  import thumbnail from "./img.png";

  metadata.image = img;
  metadata.thumbnail = thumbnail;
</script>

It would be much better to move that code to layout.svelte. Then we would need only one frontmatter line: thumbnail: ./img.png

Is there a way to dynamically import an image whose path is stored in the frontmatter?

nosovk commented 2 years ago

Thanks to that article, I found the metadata object, which helps to solve the name overlap problem.

Here are examples of two remark plugins:

https://github.com/MailCheck-co/mailcheck.site/commit/3ed076a148ba9624a7ddc7abda7c27821035871d - this is a simple one: we create a slug field if it wasn't present in the frontmatter. It could be adapted to do any simple modification of the frontmatter.
https://github.com/MailCheck-co/mailcheck.site/pull/1074/commits/f4e002edb1676236394fccf2b3f677acdb71bf29 - this is a complex one. If we want to process a frontmatter field via imagetools, we have to import the image path. It's done with a hack, but it actually works, thanks to the article above.
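For readers who don't want to open the commits, a plugin in the spirit of the simple one might look roughly like this. It is not the code from the linked commit: `addSlug` and `slugify` are hypothetical names, deriving the slug from a `title` field is an assumption, and the sketch relies on the parsed frontmatter being available as `file.data.fm` on the vFile.

```javascript
// Lowercase, replace runs of non-alphanumerics with "-", trim stray dashes.
function slugify(text) {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
}

// Hypothetical remark plugin: adds `slug` to the frontmatter when it is missing.
function addSlug() {
  return (tree, file) => {
    const fm = file.data.fm ?? {};
    if (fm.slug === undefined && typeof fm.title === 'string') {
      // Assumption: the slug is derived from the title.
      file.data.fm = { ...fm, slug: slugify(fm.title) };
    }
  };
}

// Exercise the transformer with a minimal fake tree and file.
const file = { data: { fm: { title: 'Hello, World!' } } };
addSlug()({ type: 'root', children: [] }, file);
```

In an mdsvex config the plugin would be listed under `remarkPlugins`; an explicit `slug` in the frontmatter is left untouched.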

nosovk commented 2 years ago

@pngwn being compatible with remark is a nice feature; a big ecosystem of simple plugins is always nice. I hope that in v1 some API for hacks like the ones above will persist.

furudean commented 2 years ago

I managed to solve my issue using the frontmatter option but I'm leaving this open for the discussion.

HopesDad commented 2 years ago

Thank you for good information.

artemkovalyov commented 1 year ago

You can also write a custom remark plugin, the frontmatter data exists on the vFile and can be modified by plugins. Would still be nice to have a clean API for this in v1 (there will be no remark then).

Oh man, I should have read this earlier. It would have saved me a couple of hours. I was updating yaml in the tree but didn't see any changes on the imported modules. Adding this data to the vFile solved it.

import { visit } from 'unist-util-visit';
import getReadingTime from 'reading-time';

export default function readingTime(options = { wordsPerMinute: 200 }) {
  return (tree, file) => {
    let text = '';
    visit(tree, ['text', 'code'], (node) => {
      text += node.value;
    });

    const readingStats = getReadingTime(text, options);
    // Merge the stats into the parsed frontmatter on the vFile; this is what
    // makes them show up on the imported module.
    file.data.fm = { ...file.data.fm, ...readingStats };
    // Appending to the raw frontmatter node on its own did not change the
    // imported module's data.
    visit(tree, ['yaml', 'toml'], (node) => {
      node.value +=
        `\nreadingTime: ${readingStats.minutes}\n` + `wordsCount: ${readingStats.words}`;
    });
  };
}

I can add docs somewhere if you'd like. This is a kinda implicit convention/API that I couldn't grasp.

pngwn / MDsveX

Preprocessing of frontmatter #455
