withastro / astro


Content layer validates already validated objects causing transforms to fail a validation #11737

Closed TheOtterlord closed 1 month ago

TheOtterlord commented 2 months ago

Astro Info

Astro                    v4.14.2
Node                     v22.5.1
System                   Linux (x64)
Package Manager          pnpm
Output                   static
Adapter                  none
Integrations             @astrojs/mdx
                         @astrojs/sitemap

Describe the Bug

Any schema with a transform whose output would not itself pass the schema fails validation, because (I think) the same object is being parsed more than once.

Example schema

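import { glob } from "astro/loaders";
import { defineCollection, z } from "astro:content";

const blog = defineCollection({
  loader: glob({ base: "./src/data/blog/", pattern: "**/*.{md,mdx}" }),
  schema: z.object({
    // ... (other fields elided)
    // repro: the transformed value is an object, so it no longer matches z.string()
    something: z.string().optional().transform(str => ({ type: 'test', content: str }))
  }),
});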

What's the expected result?

The object is parsed and validated once.

Link to Minimal Reproducible Example

https://github.com/TheOtterlord/content-layer-transform-repro

ascorbic commented 2 months ago

This is an artifact of how we're handling images. We re-parse cached data so that we can resolve images again. We probably need to work out a better way to handle this.
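
To illustrate why re-parsing already-parsed data breaks transforms, here is a minimal, dependency-free sketch. It does not use Zod or Astro's actual code; `parseSomething` is a hypothetical stand-in for the `something` field from the report (validate a string, then transform it into an object). It only shows that such a parser is not idempotent.

```typescript
// Hypothetical stand-in for z.string().optional().transform(...):
// validate the input first, then transform it into a different shape.
type Something = { type: "test"; content: string | undefined };

function parseSomething(input: unknown): Something {
  if (typeof input === "string") {
    return { type: "test", content: input };
  }
  if (input === undefined) {
    return { type: "test", content: undefined };
  }
  throw new TypeError(`Expected string, received ${typeof input}`);
}

// First pass: the raw frontmatter value goes in, the transformed object comes out.
const firstPass = parseSomething("hello");

// Second pass: feeding the already-transformed object back in fails,
// because the transform's output is not a valid input for the same parser.
let secondPassError: string | null = null;
try {
  parseSomething(firstPass);
} catch (err) {
  secondPassError = (err as Error).message;
}
```

Under this assumption, any cache that stores the transformed object and later runs it through the same schema will hit the second-pass failure.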

ArmandPhilippot commented 2 months ago

I'm not entirely sure if this is the same issue; sorry if it isn't (I can open another issue). I have a similar problem with dates. For example with the following config:

import { glob } from 'astro/loaders';
import { defineCollection, z } from 'astro:content';

const pages = defineCollection({
  loader: glob({ pattern: '**/*.md', base: './content/pages' }),
  schema: z.object({
    publishedOn: z
      .string()
      .refine(
        (value) =>
          /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}(?<seconds>:\d{2}(?<milliseconds>\.\d+)?)?$/.test(
            value
          ),
        {
          message: 'Date must be of the form yyyy-mm-ddThh:mm[:ss[.mmm]]',
        }
      )
      .transform((str) => new Date(str)),
    excerpt: z.string(),
    title: z.string(),
  }),
});

export const collections = {
  pages,
};

And a home.md file in content/pages containing:

---
excerpt: "The homepage."
publishedOn: "2024-08-24T23:17"
title: Home
---

Welcome!

When starting the dev server, it will throw (and stop) with the following error:

[InvalidContentEntryFrontmatterError] pages → home frontmatter does not match collection schema.
publishedOn: Expected type `"string"`, received "date"
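
The "received date" part of this error is consistent with the transformed `Date` being validated a second time. The sketch below reproduces that shape of failure with a hand-rolled parser (not Zod, and not Astro's implementation). `parsePublishedOn` and `parsePublishedOnIdempotent` are hypothetical names, and the idempotent variant is only one possible user-side mitigation, not something proposed in this thread.

```typescript
// Same pattern as the refine() in the config, without the named groups.
const ISO_LOCAL = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}(:\d{2}(\.\d+)?)?$/;

// Hand-rolled equivalent of string -> refine -> transform(new Date).
function parsePublishedOn(input: unknown): Date {
  if (typeof input !== "string") {
    const received = input instanceof Date ? "date" : typeof input;
    throw new TypeError(`Expected string, received ${received}`);
  }
  if (!ISO_LOCAL.test(input)) {
    throw new TypeError("Date must be of the form yyyy-mm-ddThh:mm[:ss[.mmm]]");
  }
  return new Date(input);
}

// Hypothetical idempotent variant: a Date produced by an earlier pass
// is accepted unchanged instead of being rejected.
function parsePublishedOnIdempotent(input: unknown): Date {
  return input instanceof Date ? input : parsePublishedOn(input);
}

const once = parsePublishedOn("2024-08-24T23:17");

// Re-parsing the transformed value with the strict parser fails.
let reparseError: string | null = null;
try {
  parsePublishedOn(once);
} catch (err) {
  reparseError = (err as Error).message;
}

// The idempotent variant tolerates a second pass.
const twice = parsePublishedOnIdempotent(once);
```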

But publishedOn is indeed a string. If I remove the surrounding quotes, the dev server can start. However, at some later point (I'm not sure exactly when: while navigating, or after stopping and restarting the server), it fails again with the same error, because this time the value obviously is a date and not a string.

I made a StackBlitz reproduction, but you may need to stop the server (Ctrl+C) and then restart it. The error seems inconsistent: it may work the first time but fail afterwards.

Note: with Content Collections, this format worked as expected.