fabian-hiller / decode-formdata

Decodes complex FormData into a JavaScript object
MIT License

Suggestion: extract values from FormData based on a schema instead of manually declaring decode params #10

Open frenzzy opened 6 months ago

frenzzy commented 6 months ago

As you probably know, FormData can contain both strings and files. Users may abuse the system by sending files instead of strings or by submitting numerous unknown fields.

Extracting every field sent via form data risks consuming excessive computational resources. Reading only the expected fields with formData.get('field') is more efficient in such cases than approaches like Object.fromEntries(formData.entries()), which extract all the data from user input even if we are not going to use it.
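
To illustrate the difference, here is a minimal sketch that reads only an allow-list of expected fields with formData.get and skips anything that is not a string. The helper name `pickStringFields` and the field names are hypothetical, chosen only for this example.

```typescript
// Hypothetical helper: read only the fields we expect and ignore everything else.
function pickStringFields(
  formData: FormData,
  names: readonly string[],
): Record<string, string> {
  const result: Record<string, string> = {};
  for (const fieldName of names) {
    const entry = formData.get(fieldName);
    // Skip missing fields and files sent in place of a string.
    if (typeof entry === 'string') result[fieldName] = entry;
  }
  return result;
}

// Usage: the unexpected field is never read or copied.
const submittedForm = new FormData();
submittedForm.append('title', 'Red apple');
submittedForm.append('price', '0.89');
submittedForm.append('unexpected', 'ignored');

const pickedFields = pickStringFields(submittedForm, ['title', 'price']);
console.log(pickedFields);
```

The values are still raw strings at this point; converting them to numbers, dates and so on is the job of the decode step discussed in this issue.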

Additionally, to save further computation, you would usually pass the abortEarly flag to the schema parse functions so that validation stops at the first issue.
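
As a rough illustration of what aborting early saves, the following sketch uses a hand-written validator rather than Valibot itself. The `validate` function, the `Check` type and the field names are hypothetical; only the `abortEarly` option name is borrowed from the text above.

```typescript
// Hypothetical check: returns an error message, or null if the input passes.
type Check = (input: Record<string, unknown>) => string | null;

function validate(
  input: Record<string, unknown>,
  checkList: Check[],
  options: { abortEarly?: boolean } = {},
): string[] {
  const issues: string[] = [];
  for (const check of checkList) {
    const issue = check(input);
    if (issue !== null) {
      issues.push(issue);
      // Stop at the first problem instead of running the remaining checks.
      if (options.abortEarly) break;
    }
  }
  return issues;
}

const productChecks: Check[] = [
  (input) => (typeof input['title'] === 'string' ? null : 'title must be a string'),
  (input) => (typeof input['price'] === 'number' ? null : 'price must be a number'),
];

// Without the flag every check runs; with it, only the first failure is reported.
const allIssues = validate({}, productChecks);
const firstIssueOnly = validate({}, productChecks, { abortEarly: true });
console.log(allIssues.length, firstIssueOnly.length);
```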

My suggestion is to write a function that will extract JSON data from FormData based on a provided schema, for example:

<form enctype="multipart/form-data" method="post">
  <!-- Product -->
  <input name="title" type="text" value="Red apple" />
  <input name="price" type="number" value="0.89" />

  <!-- Metadata -->
  <input name="created" type="date" value="2024-04-12T07:00:00.000Z" />
  <input name="active" type="checkbox" value="true" />

  <!-- Tags -->
  <input name="tags.0" type="text" value="fruit" />
  <input name="tags.1" type="text" value="healthy" />
  <input name="tags.2" type="text" value="sweet" />

  <!-- Images -->
  <input name="images.0.title" type="text" value="Close up of an apple" />
  <input name="images.0.created" type="date" value="2024-04-12T07:01:00.000Z" />
  <input name="images.0.file" type="file" value="a.jpg" />
  <input name="images.0.tags.0" type="text" value="foo" />
  <input name="images.0.tags.1" type="text" value="bar" />
  <input name="images.1.title" type="text" value="Our fruit fields at Lake Constance" />
  <input name="images.1.created" type="date" value="2024-04-12T07:02:00.000Z" />
  <input name="images.1.file" type="file" value="b.jpg" />
  <input name="images.1.tags.0" type="text" value="baz" />
</form>

import { decode } from 'decode-formdata';
import * as v from 'valibot';

// Create product schema
const ProductSchema = v.object({
  title: v.string(),
  price: v.number(),
  created: v.date(),
  active: v.boolean(),
  tags: v.array(v.string()),
  images: v.array(
    v.object({
      title: v.string(),
      created: v.date(),
      file: v.blob(),
      tags: v.array(v.string())
    })
  ),
});

async function server(formData: FormData) {
  const rawData = decode(ProductSchema, formData) // <= new signature
  const formValues = v.parse(ProductSchema, rawData, { abortEarly: true })
  console.log(formValues)
  /*{
    title: 'Red apple',
    price: 0.89,
    created: Date,
    active: true,
    tags: ['fruit', 'healthy', 'sweet'],
    images: [
      {
        title: 'Close up of an apple',
        created: Date,
        file: Blob,
        tags: ['foo', 'bar'],
      },
      {
        title: 'Our fruit fields at Lake Constance',
        created: Date,
        file: Blob,
        tags: ['baz'],
      },
    ],
  }*/
}

Implementation suggestion via pseudocode:

import type { BaseSchema } from 'valibot';

function decodeEntry<TSchema extends BaseSchema>(schema: TSchema, value: FormDataEntryValue) {
  if (value === '') return null
  switch (schema.type) {
    case 'string': {
      return value
    }
    case 'number': {
      const number = Number(value)
      return Number.isNaN(number) ? value : number
    }
    case 'boolean': {
      if (value === 'true') return true
      if (value === 'false') return false
      return value
    }
    // etc.
  }
}

function decode<TSchema extends BaseSchema>(schema: TSchema, formData: FormData, fieldName?: string) {
  switch (schema.type) {
    case 'object': {
      const value: Record<string, unknown> = {}
      for (const [key, subSchema] of Object.entries(schema.entries as Record<string, BaseSchema>)) {
        const nextKey = fieldName ? `${fieldName}.${key}` : key
        let nextValue: unknown
        if (subSchema.type === 'array' || subSchema.type === 'object') {
          nextValue = decode(subSchema, formData, nextKey)
        } else {
          const entry = formData.get(nextKey)
          if (entry !== null) nextValue = decodeEntry(subSchema, entry)
        }
        if (nextValue !== undefined) value[key] = nextValue
      }
      return value
    }
    case 'array': {
      if (!fieldName) return undefined
      const subSchema = schema.item as BaseSchema
      const value: unknown[] = []
      // Note: getAll() only returns entries named exactly `fieldName`;
      // indexed names such as `tags.0` would need a separate lookup.
      for (const [key, entry] of formData.getAll(fieldName).entries()) {
        if (subSchema.type === 'array' || subSchema.type === 'object') {
          const nextKey = `${fieldName}.${key}`
          value.push(decode(subSchema, formData, nextKey))
        } else {
          value.push(decodeEntry(subSchema, entry))
        }
      }
      return value
    }
    // etc.
  }
}
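
For the indexed field names used in the form above (tags.0, images.0.title, ...), one possible approach is to probe one index at a time, since formData.getAll(name) only returns entries whose name matches exactly. The following is a self-contained sketch under stated assumptions: the `Schema` type is a hypothetical simplified stand-in and not Valibot's real schema objects, and `decodeBySchema`, `decodePrimitive` and `MAX_ITEMS` are names invented for this example.

```typescript
// Hypothetical minimal schema shape; Valibot's real schema objects differ.
type Schema =
  | { type: 'string' | 'number' | 'boolean' }
  | { type: 'array'; item: Schema }
  | { type: 'object'; entries: Record<string, Schema> };

// Assumed cap so a client cannot force an unbounded number of lookups.
const MAX_ITEMS = 100;

// Convert one string entry according to a primitive schema type.
function decodePrimitive(type: 'string' | 'number' | 'boolean', value: string): unknown {
  if (value === '') return null;
  if (type === 'number') {
    const parsed = Number(value);
    return Number.isNaN(parsed) ? value : parsed;
  }
  if (type === 'boolean') {
    return value === 'true' ? true : value === 'false' ? false : value;
  }
  return value;
}

// Reads only the field names the schema asks for; returns undefined if nothing was found.
function decodeBySchema(schema: Schema, formData: FormData, path = ''): unknown {
  switch (schema.type) {
    case 'object': {
      const result: Record<string, unknown> = {};
      let found = false;
      for (const [key, subSchema] of Object.entries(schema.entries)) {
        const value = decodeBySchema(subSchema, formData, path ? `${path}.${key}` : key);
        if (value !== undefined) {
          result[key] = value;
          found = true;
        }
      }
      return found ? result : undefined;
    }
    case 'array': {
      const items: unknown[] = [];
      // Probe `path.0`, `path.1`, ... and stop at the first missing index.
      for (let index = 0; index < MAX_ITEMS; index++) {
        const value = decodeBySchema(schema.item, formData, path ? `${path}.${index}` : String(index));
        if (value === undefined) break;
        items.push(value);
      }
      // Empty arrays are reported as "not found" so nested probing can terminate.
      return items.length > 0 ? items : undefined;
    }
    default: {
      const entry = formData.get(path);
      // Files are ignored in this sketch; only string entries are decoded.
      return typeof entry === 'string' ? decodePrimitive(schema.type, entry) : undefined;
    }
  }
}

// Usage with a reduced version of the product form.
const productShape: Schema = {
  type: 'object',
  entries: {
    title: { type: 'string' },
    price: { type: 'number' },
    tags: { type: 'array', item: { type: 'string' } },
    images: {
      type: 'array',
      item: {
        type: 'object',
        entries: {
          title: { type: 'string' },
          tags: { type: 'array', item: { type: 'string' } },
        },
      },
    },
  },
};

const productForm = new FormData();
productForm.append('title', 'Red apple');
productForm.append('price', '0.89');
productForm.append('tags.0', 'fruit');
productForm.append('tags.1', 'healthy');
productForm.append('images.0.title', 'Close up of an apple');
productForm.append('images.0.tags.0', 'foo');
productForm.append('ignored', 'never read');

const decodedProduct = decodeBySchema(productShape, productForm);
console.log(JSON.stringify(decodedProduct));
```

A trade-off of this design is that a gap in the indices (for example tags.0 and tags.2 without tags.1) silently drops everything after the gap, and a missing array is indistinguishable from an empty one. A real implementation would need to decide how to handle both cases.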

...or perhaps Valibot should export parseFormData/safeParseFormData functions analogous to parse/safeParse, with the same function signatures and all the options they already support 🤔

Thoughts?

P.S.: Thank you for your enormous contributions to open source and especially valibot!

fabian-hiller commented 6 months ago

Thanks for your feedback and for sharing this idea! Yes, such an improvement is planned, but at the moment Valibot is taking too much of my time. Once Valibot v1 is out (and I have had a little break to recover 😅), I will probably work on it.