Open stephenh opened 8 months ago
This is a really interesting topic and one I've thought a lot about but am still not quite sure of my stance on.
When I was initially implementing morphs, I considered many solutions, including a unique input/output that could be associated with a type. Especially working in the type system, I eventually determined that fundamentally, there was nothing special about input/output or serialize/deserialize compared to any other arbitrarily named transformations you might create for a type. Essentially, if you think about your types as a graph, what you're really defining is new edges (morphs) from your vertex (type) to some other type (in this case string).
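To make that concrete, here's a rough sketch of two such edges leaving the same vertex, using the "|>" tuple morph syntax that comes up later in this thread (the exact operator and API details may differ between ArkType versions, so treat this as illustrative):

```ts
import { type } from "arktype"

// One vertex (an ISO-ish date string) with two different edges out of it.
// Neither edge is privileged as "the" deserializer; they're just named transforms.
const stringToDate = type(["string", "|>", (s: string) => new Date(s)])
const stringToMillis = type(["string", "|>", (s: string) => Date.parse(s)])
```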
I do think this is a very powerful concept (I created an empty issue a long time ago to follow up: https://github.com/arktypeio/arktype/issues/574), but it seems to me bidirectional codecs are more an application of this larger morph-based graph than inherently useful in and of themselves. In fact, I've seen a lot of questions about why codecs in @effect/schema must always be bidirectional when the transformation often isn't valid in one direction, so you end up creating an implementation that just throws, which feels very clunky to me.
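As a generic illustration of that pattern (plain TypeScript, not any particular library's API), a "bidirectional" codec for a lossy transformation tends to end up looking like this:

```ts
// Hypothetical codec shape, just for illustration.
interface Codec<Encoded, Decoded> {
  decode: (input: Encoded) => Decoded
  encode: (value: Decoded) => Encoded
}

// Normalizing an email is lossy: the original casing/whitespace can't be
// recovered, so the encode direction is just a stub that throws.
const normalizedEmail: Codec<string, string> = {
  decode: (raw) => raw.trim().toLowerCase(),
  encode: () => {
    throw new Error("no meaningful transformation in this direction")
  },
}
```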
That said, if you haven't seen https://github.com/Effect-TS/schema already it's a great library and might be just what you're looking for!
If you're interested in fleshing out some of what I described re: morphs I'd definitely be interested in working together to tackle that, especially after the 2.0 release.
Hello! Apologies for the really long delay in replying; your response made a lot of sense, but I was like "ah wow, 'graph of types'?! I'm going to really need to think about my reply...", and then well, finally got a chance to think/play with it today. :-)
there was nothing special about input/output or serialize/deserialize compared to any other arbitrarily named transformations you might create for a type.
I can see this being true...
if you think about your types as a graph, really what you're defining is new edges (morphs) from your vertex (type) to some other type (in this case string)
I think I see what you mean; I hadn't actually tried morphs yet, so just now sketched out a "User type to JSON, JSON back to User type" prototype here:
https://github.com/stephenh/typescript-sandbox/blob/arktype/src/arktype.test.ts#L6
And, yeah, it looks like it works if I treat each "version of type" (User-as-json and User-as-pojo) as its own thing.
Which makes sense: with Arktype's current API there is only ever "input type --> output type", so to go the other way around, it makes sense to just flip it to "output type --> input type".
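Very roughly, the shape of that prototype (an illustrative sketch, not the exact contents of the linked file, and I'm hand-waving the exact syntax for class inputs):

```ts
import { type } from "arktype"

// Illustrative stand-in for our internal calendar-date class.
class LocalDate {
  constructor(public iso: string) {}
  toJSON() {
    return this.iso
  }
}

// "User as JSON" --> "User as POJO": the decode direction.
const userFromJson = type({
  firstName: "string",
  birthday: ["string", "|>", (s: string) => new LocalDate(s)],
})

// "User as POJO" --> "User as JSON": the same edge, flipped by hand.
const userToJson = type({
  firstName: "string",
  birthday: [["instanceof", LocalDate], "|>", (d: LocalDate) => d.toJSON()],
})
```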
I think what's less than ideal, at least for this very specific use case (storing a POJO as JSON, i.e. storing POJOs into jsonb columns), is the duplication in defining User twice.
Is there a way to avoid that? Ideally I'd like to just define the User on its own, and have that definition know how to do In -> Out as well as Out -> In...
Which, per your link:
Oh nice! I knew of Effect, but have historically shied away from it b/c of naively assuming it would be too monad-y; I wrote Scala for ~3-4 years and really enjoyed it, but personally more so from the "better Java" angle than the "getting Haskell on the JVM" angle. :-)
But they definitely have the type + encode + decode setup that I'm looking for, so I'll take a look!
If you're interested in fleshing out some of what I described
I'm definitely happy to chat over use cases and potential APIs!
At first I was assuming that supporting the codec pattern (being able to call encode) would be a pretty large/breaking change to the Arktype API, because right now when I do userAsPojo(...json...), it didn't seem clear "if that's (implicitly) decode, where would the encode method even go?"
But maybe something like:
const userDecoded = user.decode(json);
const userEncoded = user.encode(userDecoded);
Would work, and user(...) stays as the common-case API/syntax-sugar for decoding.
Granted, morphs would need to learn to be two-way, i.e. optionally accept an encodeFn (maybe by default the encodeFn is just value?.toJSON()), maybe something like:
const user = type({
  firstName: "string",
  birthday: [LocalDate, "|>", decodeFn, encodeFn],
});
I know I flipped the order of "|>" there (I put LocalDate first instead of string)...
At first that was an unintentional mistake, but actually I like it b/c imo it better communicates the "User.birthday is going to be a LocalDate" intent, and pushes the "here's how it's encoded/decoded" to later in the tuple (maybe "two-way morphs" / codecs would have a different operator than |>... <|> maybe?).
Vs. the current API of birthday: ["string", ...] almost makes it sound like birthday: string is what User.birthday will be typed as, until you see/realize that the decodeFn tuple argument swaps it over to LocalDate.
Anyway, that's my...mumble...months later thoughts! Thanks!
My intuition is that I'd rather have some way to define a group of "variants" of a type along with a syntax that allows you to conveniently transform between them, but I suppose that API could be designed in such a way that encode and decode could be defined and used similarly to what you describe.
I definitely want to revisit this after the next release once the core type system is stable.
🤷 Motivation
We have types like JS Date or an internal LocalDate (only calendar dates, no times) that we want to encode/decode to JSON (or a string, like in a query string).

Libraries like Zod/Arktype/etc. generally support deserialization, i.e. "here's an unknown blob object literal, parse/safeParse it into my structure", and also transform the dueDate: "2024-01-01" key into dueDate: new LocalDate(...) during the process (Zod transformers).

But they don't have serialization: if I have an object { dueDate: someJsDate } and I want to serialize it, I want it to come out as dueDate: "2024-01-01" and not dueDate: Date.toString() or Date.toJSON(), which is some really long ISO string.

Ideally I want a two-way codec that, for a given key, can do both custom serialization & deserialization.
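For a concrete example of the serialization half, using a plain JS Date to show the default behavior we're trying to avoid:

```ts
// What happens today if the value is a plain JS Date and we lean on toJSON():
const task = { dueDate: new Date("2024-01-01") }
JSON.stringify(task)
// -> {"dueDate":"2024-01-01T00:00:00.000Z"}   (the long ISO string)

// What we actually want the serialized form to be:
// -> {"dueDate":"2024-01-01"}
```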
Why should we prioritize solving it?
Because you're a new market entrant and hopefully will see this as an edge/feature to help drive your adoption. :-) 🤞
💡 Solution
How do you think we should solve the problem?
Ideally with something similar to Zod's transformers, but more of a "codec" or "serde" that does both serialization & deserialization. This would mean new high-level methods like serialize or format (opposite of parse).
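Purely to illustrate the shape being proposed (hypothetical names, not an existing API in Arktype or Zod), a field-level codec that owns both directions might look like:

```ts
// Illustrative stand-in for our internal calendar-date class.
class LocalDate {
  constructor(public iso: string) {} // e.g. "2024-01-01"
}

// Hypothetical codec shape: parse (deserialize) plus serialize, the missing half.
interface Codec<Encoded, Decoded> {
  parse: (input: Encoded) => Decoded
  serialize: (value: Decoded) => Encoded
}

const dueDateCodec: Codec<string, LocalDate> = {
  parse: (s) => new LocalDate(s),
  serialize: (d) => d.iso, // "2024-01-01", not a long ISO timestamp
}
```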
Why do you think this is the best solution?
This sort of approach keeps the schema in control of both serialization & deserialization, instead of being at the whims of whatever .toString/.toJSON behavior the values happen to use.

Did you consider any alternatives?
I've been looking for a runtime type library (Zod, see this comment, others) and the only one that does codecs/serde afaict is io-ts, which comes with too much baggage.

Honestly I understand if you consider this out of scope, but I'm just really surprised that so far basically all of the runtime type libraries have somewhat myopically focused on "only parse" and forgotten the other half of serde.