ascorbic / astro-loaders

Astro loaders
https://astro-loaders.netlify.app

[csv-loader] Support for custom `dynamicTyping` option #50

Open marceloverdijk opened 1 week ago

marceloverdijk commented 1 week ago

When loading the CSV, the papaparse dynamicTyping option seems to always be set to true and cannot be overridden:

https://github.com/ascorbic/astro-loaders/blob/main/packages/csv/src/csv-loader.ts#L55C1-L60C8

    const csvStream = Papa.parse(Papa.NODE_STREAM_INPUT, {
      dynamicTyping: true,
      ...parserOptions,
      header: true,
      transformHeader: transformHeader === false ? undefined : transformHeader,
    });

parserOptions?: Omit<
  Papa.ParseConfig,
  "header" | "dynamicTyping" | "transformHeader" | "step" | "complete"
>;

(The same applies to header, transformHeader, step, and complete.)
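The position of the spread in the call above matters: keys written before `...parserOptions` act as defaults, and keys written after it are forced. A minimal, library-free sketch of that merge follows. The `mergeOptions` helper and the `Options` type are hypothetical stand-ins, not the loader's actual code. If this reading is right, `dynamicTyping` may be blocked only by the `Omit` type, while `header` is genuinely fixed at runtime:

```typescript
// Hypothetical stand-in for the option merge in csv-loader.ts.
type Options = Record<string, unknown>;

function mergeOptions(parserOptions: Options): Options {
  return {
    dynamicTyping: true, // default: placed before the spread, so it can be overridden
    ...parserOptions,
    header: true, // forced: placed after the spread, so it always wins
  };
}

const keepCategoryAsString = (field: string | number) => field !== "category";
const merged = mergeOptions({
  dynamicTyping: keepCategoryAsString,
  header: false,
});

console.log(merged.dynamicTyping === keepCategoryAsString); // true
console.log(merged.header); // true
```

Under that assumption, removing "dynamicTyping" from the `Omit` list might be enough to support a custom value.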

The issue is I have a CSV like:

"name","category","priority"
"foo","A1",1
"bar","A2",2 
"xyz","1",1 

Here category is a string (all of its values are quoted as well), but when parsing "1" the loader still converts it to a number (I believe because of dynamicTyping = true), which results in an error in the terminal:

AstroError [InvalidContentEntryDataError]: **persons → xyz** data does not match collection schema.
**category**: Expected type `"string"`, received "number"
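Until the option is configurable, one possible workaround is to convert the affected value back to a string before validation. The sketch below is only an illustration: `asString` is a hypothetical helper, not part of the CSV loader, and it assumes the value has already been turned into a number or boolean by dynamic typing.

```typescript
// Hypothetical helper (not part of the CSV loader): turn a value that
// dynamic typing converted to a number or boolean back into a string.
function asString(value: unknown): unknown {
  return typeof value === "number" || typeof value === "boolean"
    ? String(value)
    : value;
}

// The "xyz" row as it arrives after dynamic typing.
const row = { name: "xyz", category: 1, priority: 1 };
const fixed = { ...row, category: asString(row.category) };

console.log(typeof fixed.category); // string
```

A helper like this could probably be passed to Zod's `z.preprocess(asString, z.string())` in the collection schema. It cannot restore information that was already lost, though: a value such as "01" would come back as "1".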

I think in this case I should be able to configure the CSV loader like this:

const persons = defineCollection({
  loader: csvLoader({
    fileName: 'data/persons.csv',
    transformHeader: false,
    idField: 'name',
    parserOptions: {
      dynamicTyping: (field) => field !== 'category',
    },
  }),
  schema: z.object({
    name: z.string(),
    category: z.string(),
    priority: z.number().int(),
  }),
});

ascorbic commented 1 week ago

Yes, I think something analogous to the glob loader's generateId function would make sense here.

marceloverdijk commented 1 week ago

@ascorbic I had my data in both JSON and CSV format, and I think using the out-of-the-box file loader to load the JSON might be a better solution than using CSV.

However, the file loader also has the limitation that it cannot generate ids.

After you mentioned the glob loader's generateId option, I changed the file loader myself so that it can generate ids, and that works perfectly.

I've created a feature request and a PR here:

https://github.com/withastro/roadmap/discussions/1045

https://github.com/withastro/astro/pull/12308

I'm looking forward to your feedback.