arktypeio / arktype

TypeScript's 1:1 validator, optimized from editor to runtime
https://arktype.io/
MIT License
3.93k stars 59 forks source link

Allow string format as metadata associated with JSON schema #1110

Open semohr opened 2 months ago

semohr commented 2 months ago

Issue

When using the string.date.parse type in combination with toJsonSchema I expected the type to be parsed to string, maybe even with the format keyword (see json-schema).

Example

import {type} from "arktype";

const date = type("string.parse.date");
date.in.toJsonSchema(); // <-- Errors

This errors with Uncaught ParseError: Predicate $ark.isParsableDate is not convertible to JSON Schema.

Expected output:

{
"type":string",
 "format": "date"
}

Solution (proposed by @ssalbdivad)

Add format as a metadata key. This would have no effect on validation but would allow custom types like string.date that rely on non-serializable predicates in the type system to be converted to JSON schema.

The format key should be added to the output JSON schema alongside any other constraints. It should be specifically added to the non-serializable conditions it should replace, in this case a predicate for validating whether a string can be parsed as a Date. If multiple format constraints exist on the same IntersectionNode, they must be identical.

semohr commented 2 months ago

I had a small look into the source and you could probably just quickly patch it by adding

parsableDate.toJsonSchema = () => ({
"anyOf": [
  {
  "type": "string",
  "format": "date-time",
  },
  {
  "type": "string",
  "format": "date",
  }
]
})

into ark/schema/type/keywords/string/date.

I'm not sure if this is where you want to do that, further you probably want some tests for that.

ssalbdivad commented 2 months ago

I'll have to think about this a bit since format does not fit into the type system the same way pattern, divisor, etc. do.

If you're okay with a specific date format like ISO8601, you can use something like string.date.iso.parse:

const user = type({
    name: "string",
    birthday: "string.date.iso.parse"
})

const schema = user.in.toJsonSchema()

const result = {
    type: "object",
    properties: {
        birthday: {
            type: "string",
            pattern:
                "^([+-]?\\d{4}(?!\\d{2}\\b))((-?)((0[1-9]|1[0-2])(\\3([12]\\d|0[1-9]|3[01]))?|W([0-4]\\d|5[0-3])(-?[1-7])?|(00[1-9]|0[1-9]\\d|[12]\\d{2}|3([0-5]\\d|6[1-6])))([T]((([01]\\d|2[0-3])((:?)[0-5]\\d)?|24:?00)([.,]\\d+(?!:))?)?(\\17[0-5]\\d([.,]\\d+)?)?([zZ]|([+-])([01]\\d|2[0-3]):?([0-5]\\d)?)?)?)?$"
        },
        name: { type: "string" }
    },
    required: ["birthday", "name"]
}

I think having a way to add "format" as metadata to a string type though would be useful, so I'm going tweak this issue a bit to reflect the broader goal.

semohr commented 2 months ago

In theory format date or date-time means it complies with the RFC 3339 specification. I'm not sure if the current string.date.parse functionalities fulfill the specification so using the regex might event be the correct choice.

ssalbdivad commented 2 months ago

Yeah I can likely create some new keywords to specifically align with these standards, but generally these subtypes are stricter the further you chain them. string.date should be the most permissive, so it allows anything that can be parsed via new Date() that doesn't result in an invalid date.

Once we have the capability to associate JSON schema format as metadata, we'll have to ensure all the other format keywords are integrated with the type system as well.