zce / velite

Turns Markdown / MDX, YAML, JSON, or others into app's data layer with Zod schema.
http://velite.js.org
MIT License

Support files with multiple documents (jsonc, yaml) #178

Closed by kachkaev 2 days ago

kachkaev commented 2 days ago

👋 @zce! I am experimenting with your lib after using contentlayer and contentlayer2 – it looks awesome! 🤩

It’s great that one can define custom loaders, and I wonder if velite’s API can allow for multiple documents in a single file. Here are the examples that I have in my mind:

JSONS

{ "foo": 1, "bar": "hello" } 
{ "foo": 2, "bar": "hi" } 
{ "foo": 3, "bar": "hola" } 

YAML with multiple documents

---
foo: 1
bar: hello

---
foo: 2
bar: hi

---
foo: 3
bar: hola

Storing multiple documents in one file can be beneficial when we have many small data records. For example, imagine a list of towns with names and geo coordinates. It might be a good idea to store all of them in one big YAML file or maybe split these YAML files by country. However, having hundreds of town files with just a couple of rows in each would make it harder to maintain this data collection.

I checked the implementation of custom loaders but could not figure out how to pull multiple documents out of a single file just yet. In theory, it might be possible to represent all real documents as one velite document that contains an array and then post-process the output of velite. But that does not feel right, at least long-term.

What do you think about this use case?

zce commented 2 days ago

Of course, this is also one of the built-in features.

You just need to define the standard json array or yaml array in a single file.

Velite even supports multiple files containing arrays.
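To picture what "multiple files containing arrays" could mean for a collection, here is a minimal sketch. It is not velite's actual implementation: the `LoadedFile` type and the `collectRecords` function are hypothetical names, and the sketch simply assumes that each file loads to either a single record or an array of records, which are then concatenated into one flat list.

```typescript
// Hypothetical shape of a file after a loader has parsed it (not a velite type).
type LoadedFile = { path: string; data: unknown };

// Assumption: a file's data is either one record or an array of records.
// Arrays are spread into the result; single records are appended as-is.
function collectRecords(files: LoadedFile[]): unknown[] {
  return files.flatMap((file) =>
    Array.isArray(file.data) ? file.data : [file.data],
  );
}

const collected = collectRecords([
  { path: "tags/a.yml", data: [{ name: "one" }, { name: "two" }] },
  { path: "tags/b.yml", data: { name: "three" } },
]);
console.log(collected.length); // 3
```

Under that assumption, each element of the flat list would be validated against the collection schema on its own, regardless of which file it came from.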

kachkaev commented 2 days ago

Sounds great! Could you please point to an example of this, if possible? I must have missed something important in docs or the source code.

zce commented 2 days ago

https://github.com/zce/velite/blob/main/examples%2Fbasic%2Fcontent%2Ftags%2Findex.yml

kachkaev commented 2 days ago

I see... Brilliant! I was able to read a bunch of YAMLs mentioned above, by re-defining the yaml loader:

import { parseAllDocuments } from "yaml";
import { defineCollection, defineConfig, defineLoader, s } from "velite";

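// Custom loader for .yaml/.yml files that accepts several documents per file.
const yamlLoader = defineLoader({
  test: /\.(yaml|yml)$/,
  load: (vfile) => {
    return {
      // parseAllDocuments yields one Document per `---`-separated section;
      // toJS() turns each into a plain value, so `data` is an array of records.
      data: parseAllDocuments(vfile.toString()).map(
        (document) => document.toJS() as unknown,
      ),
    };
  },
});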

const fooBars = defineCollection({
  name: "FooBar",
  pattern: "path/to/sub-folder/*.yaml",
  schema: s.object({
    foo: s.number(),
    bar: s.string(),
  }),
});

export default defineConfig({
  root: "content",
  output: {
    data: ".velite",
    clean: true,
  },
  collections: { fooBars },
  loaders: [yamlLoader],
});

The key was to use yaml.parseAllDocuments instead of yaml.parse. If yamls are split with --- and a default loader is used, I get this error:

YAMLParseError: Source contains multiple documents; please use YAML.parseAllDocuments()
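The same idea might extend to the newline-delimited JSON example from the first comment. The sketch below is only an illustration: `parseJsonLines` is a hypothetical helper, not part of velite, and it assumes one complete JSON value per non-empty line.

```typescript
// Hypothetical helper (not a velite API): parse newline-delimited JSON
// into an array of records. Assumes one complete JSON value per line.
function parseJsonLines(source: string): unknown[] {
  return source
    .split(/\r?\n/)                        // handle both LF and CRLF line endings
    .map((line) => line.trim())
    .filter((line) => line.length > 0)     // skip blank lines
    .map((line) => JSON.parse(line) as unknown);
}

const sample = [
  '{ "foo": 1, "bar": "hello" }',
  '{ "foo": 2, "bar": "hi" }',
  '{ "foo": 3, "bar": "hola" }',
].join("\n");

console.log(parseJsonLines(sample).length); // 3
```

A loader built like the YAML one above could return `parseJsonLines(vfile.toString())` as its `data`; the file extension to match in `test` would be a choice for the project (for example `.jsonl`, which is an assumption here, not something the thread specifies).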

Thanks for your responses @zce and for your work on velite. It’s a great well-made library – I did not expect my migration from contentlayer to be so smooth!