commonbaseapp / zodex

The missing zod (de)?serialization library
MIT License

Handle recursive schemas #13

Open brettz9 opened 1 year ago

brettz9 commented 1 year ago

Although Zod doesn't support circular data, it does support recursive schemas, so it would be nice to support these. However, to implement this and track the current path so that we don't recurse endlessly, I think we'd need either to pass a state object to each zerializer method (and potentially expose it as part of the public function API) or, if you'd rather avoid that route, to change to a class that tracks state internally. Alternatively, I guess we could add each object to a global WeakMap to verify it hasn't already been traversed. Thought I'd confirm which approach might be desired before attempting a PR.
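For illustration, the state-object option could look roughly like the sketch below. It does not use Zod's or zodex's real types: `SchemaNode`, `Serialized`, `State`, and `serialize` are all hypothetical stand-ins, and emitting a `$ref` to the path where a node was first seen is just one possible way to break the cycle.

```typescript
// Hypothetical, simplified schema tree; NOT Zod's or zodex's actual types.
type SchemaNode =
  | { kind: "string" }
  | { kind: "object"; props: Record<string, SchemaNode> }
  | { kind: "lazy"; get: () => SchemaNode };

// Hypothetical serialized form with JSON-reference-style back-pointers.
type Serialized =
  | { type: "string" }
  | { type: "object"; properties: Record<string, Serialized> }
  | { $ref: string };

// State threaded through every call: object node -> path where it was first seen.
interface State {
  seen: Map<SchemaNode, string>;
}

function serialize(
  node: SchemaNode,
  state: State = { seen: new Map() },
  path = "#"
): Serialized {
  // An object node we have already visited becomes a reference, not a re-descent.
  const firstSeenAt = state.seen.get(node);
  if (firstSeenAt !== undefined) return { $ref: firstSeenAt };

  switch (node.kind) {
    case "string":
      return { type: "string" };
    case "lazy":
      // Unwrap the thunk; the wrapped node is checked against `seen` on entry.
      return serialize(node.get(), state, path);
    case "object": {
      state.seen.set(node, path);
      const properties: Record<string, Serialized> = {};
      for (const [key, child] of Object.entries(node.props)) {
        properties[key] = serialize(child, state, `${path}/properties/${key}`);
      }
      return { type: "object", properties };
    }
  }
}

// A self-referential "category" schema, similar in spirit to Zod's recursive-types example.
const category: SchemaNode = {
  kind: "object",
  props: {
    name: { kind: "string" },
    sub: { kind: "lazy", get: () => category },
  },
};

console.log(JSON.stringify(serialize(category)));
// {"type":"object","properties":{"name":{"type":"string"},"sub":{"$ref":"#"}}}
```

Because the state is a defaulted parameter, the public call stays `serialize(schema)`. Note that in this sketch a node shared by two non-recursive siblings would also come out as a `$ref` on its second visit.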

Gregoor commented 1 year ago

I'm leaning towards leaving this as a caveat and not trying to solve it, due to its complexity and us not having the use-case. That said, if you feel strongly about having it / have the use case (or anyone else), I'd definitely be open to look at a PR. I just can't promise that it will definitely get merged.

brettz9 commented 1 year ago

Recursive schemas come up frequently in JSON Schema, for example:

const json = {
  'en-US': {
    settings: {
      general: {
        editing: {
          message: 'Editing'
        }
      },
      styles: {
        color: {
          text: {
            message: 'Text color'
          }
        }
      }
    }
  }
}

Here's another kind of example that could be ported to Zod: https://stackoverflow.com/a/57926806/271577 (not too unlike the subcategories example at https://github.com/colinhacks/zod#recursive-types ).

Maybe I'll hold off on offering a PR, though, until you've had a chance to complete the dezerial branch (and see what you think then). I know I've already added some work for merging as it is. Any idea when you might be able to get back to completing that branch?

Gregoor commented 1 year ago

There is an interesting split here, where your inline example does not seem to require recursive/lazy types, it could be schemad like this:

const messageSchema = z.object({ message: z.string() });
z.object({
  settings: z.object({ general: z.object({ editing: messageSchema }) }),
  styles: z.object({ color: z.object({ text: messageSchema }) }),
});

and ofc we'd lose the reference in zodex and this would just be one reference-free but larger schema tree.

Then in the SO json schema example we do have these references, which zod has to solve for with the lazy wrapper to hack around ReferenceErrors.

I guess if we wanted the above to serialize into something JSON-Schema-like, then we'd have to z.lazy()-wrap messageSchema, and zodex would then have to bubble that schema's definition up to the furthest object that still has reference-sharing siblings, or something?

So yeah, the picture is not forming yet for me, but maybe it can be simpler than I am imagining right now.


wrt dezerial: I need to find some deep focus to tackle it, the blockers are, fittingly for this subject, recursive types and their limits. I'm thinking about dropping type-refinement for now, as that has caused a lot of headaches for me so far, which would mean zodex would just return the generic SzType for zerialize and ZodTypeAny for dezerialize.

brettz9 commented 1 year ago

> There is an interesting split here, where your inline example does not seem to require recursive/lazy types, it could be schemad like this:
>
> const messageSchema = z.object({ message: z.string() });
> z.object({
>   settings: z.object({ general: z.object({ editing: messageSchema }) }),
>   styles: z.object({ color: z.object({ text: messageSchema }) }),
> });

Sure, but the issue is that sometimes one wishes to have a less explicit schema, as it remains more flexible for one's purposes. The intent in this example, which I could have been clearer about, was that the various localization items here could most simply be defined with a recursive schema (without being locked into a particular structure): arbitrarily nested objects ending in a string message. There wouldn't need to be a separate schema for settings, styles, etc. One can technically always spec out an arbitrarily deep nesting and avoid recursion, but doing so becomes cumbersome, and the app may not really benefit from the tighter specification (e.g., if it simply splits on . to find the keys here and doesn't need to access the interim keys like "settings" or "styles" to ensure only the exact spelling was used).
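To make the intent concrete, here is a minimal sketch of that recursive shape. In Zod it would typically be written with `z.lazy`; to stay dependency-free, this version uses a hand-written type guard instead. The names `LocaleTree`, `isLocaleTree`, and `lookupMessage` are hypothetical, and treating `{ message: string }` as the only leaf form is an assumption based on the sample data above.

```typescript
// A leaf carries a message; any other object maps keys to further trees.
type MessageLeaf = { message: string };
type LocaleTree = MessageLeaf | { [key: string]: LocaleTree };

// Recursive check: no fixed structure, just "objects all the way down to a message".
function isLocaleTree(value: unknown): value is LocaleTree {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return false;
  }
  const record = value as Record<string, unknown>;
  if (typeof record.message === "string") {
    // Assumption: a leaf has no keys other than `message`.
    return Object.keys(record).length === 1;
  }
  return Object.values(record).every(isLocaleTree);
}

// Finds a message by splitting on ".", without knowing the interim keys in advance.
function lookupMessage(tree: LocaleTree, dottedKey: string): string | undefined {
  let current: unknown = tree;
  for (const part of dottedKey.split(".")) {
    if (typeof current !== "object" || current === null) return undefined;
    current = (current as Record<string, unknown>)[part];
  }
  if (typeof current === "object" && current !== null) {
    const message = (current as Record<string, unknown>).message;
    return typeof message === "string" ? message : undefined;
  }
  return undefined;
}

const enUS: unknown = {
  settings: {
    general: { editing: { message: "Editing" } },
    styles: { color: { text: { message: "Text color" } } },
  },
};

if (isLocaleTree(enUS)) {
  console.log(lookupMessage(enUS, "settings.general.editing")); // Editing
}
```

Adding a new nesting level or a new group of messages needs no schema change here, which is the flexibility being argued for; the cost is that misspelled interim keys are not caught.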

> Then in the SO json schema example we do have these references, which zod has to solve for with the lazy wrapper to hack around ReferenceErrors.
>
> I guess if we wanted the above to serialize into a json-schema-like, then we'd have to z.lazy()-wrap messageSchema and then zodex has to bubble that schema's definition up to the furthest objects where there's still reference-sharing siblings or sth?
>
> So yeah, the picture is not forming yet for me, but maybe it can be simpler than I am imagining right now.

I haven't frankly given too much thought to the deserialization approach besides recognizing there'd be a need for some such wrapping.

> wrt dezerial: I need to find some deep focus to tackle it, the blockers are, fittingly for this subject, recursive types and their limits. I'm thinking about dropping type-refinement for now, as that has caused a lot of headaches for me so far, which would mean zodex would just return the generic SzType for zerialize and ZodTypeAny for dezerialize.

SGTM...

jadedevin13 commented 1 week ago

Created a pull request for this https://github.com/commonbaseapp/zodex/pull/23

brettz9 commented 3 days ago

Btw, FWIW, there is a slight disadvantage with JSON references in that they cannot have another property on the same object like this:

  "rest": {
    "$ref": "#/$defs/reference-or-type",
    "isOptional": true
  }

...one must instead do something like this:

  "rest": {
    "type": "union",
    "options": [
      {
        "$ref": "#/$defs/reference-or-type"
      }
    ],
    "isOptional": true
  }

(JSON Schema doesn't have this problem because its version of isOptional is put not on the individual properties but on the containing object, i.e., through required. However, that means it cannot state the optional status on the individual property itself, which can be convenient.)
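As a rough sketch of what this means for tooling, the snippet below resolves a same-document reference such as `#/$defs/reference-or-type` and builds the single-option union wrapper shown above. The helper names `resolvePointer` and `optionalRef` are hypothetical and not part of zodex; only fragment-style JSON Pointers within one document are handled.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Resolves a same-document reference like "#/$defs/name" against `root`.
function resolvePointer(root: Json, ref: string): Json | undefined {
  if (!ref.startsWith("#")) return undefined; // remote references are out of scope here
  const tokens = ref
    .slice(1)
    .split("/")
    .slice(1) // drop the empty token before the first "/"
    .map((token) => token.replace(/~1/g, "/").replace(/~0/g, "~")); // JSON Pointer unescaping
  let current: Json | undefined = root;
  for (const token of tokens) {
    if (current === null || typeof current !== "object") return undefined;
    current = Array.isArray(current) ? current[Number(token)] : current[token];
  }
  return current;
}

// Wraps a reference in a one-option union so `isOptional` has a place to live
// without sitting next to `$ref` on the same object.
function optionalRef(ref: string): Json {
  return { type: "union", options: [{ $ref: ref }], isOptional: true };
}

const doc: Json = {
  $defs: { "reference-or-type": { type: "string" } },
  rest: optionalRef("#/$defs/reference-or-type"),
};

console.log(JSON.stringify(resolvePointer(doc, "#/$defs/reference-or-type")));
// {"type":"string"}
```

The wrapper costs an extra level of nesting on every optional reference, which is the trade-off described above.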

But the advantages of using such a standard include not only being able to resolve references from remote sources with existing tools, but also editor support: VS Code lets you cmd-click a JSON reference to jump to the referred location.