zio / zio-schema

Compositional, type-safe schema definitions, which enable auto-derivation of codecs and migrations.
https://zio.dev/zio-schema
Apache License 2.0
140 stars 158 forks

Reduce the number of Schema to fundamental types #352

Open tusharmath opened 1 year ago

tusharmath commented 1 year ago

Problem

Currently, if one needs to match on Schema, these are the cases they have to deal with:

        a match {
          case Schema.Primitive(standardType, annotations) => ???
          case enum: Schema.Enum[_]                        => ???
          case record: Schema.Record[_]                    => ???
          case collection: Schema.Collection[_, _]         => ???
          case Schema.Transform(_, _, _, _, _)             => ???
          case Schema.Optional(_, _)                       => ???
          case Schema.Fail(_, _)                           => ???
          case Schema.Tuple(_, _, _)                       => ???
          case Schema.EitherSchema(_, _, _)                => ???
          case Schema.Lazy(_)                              => ???
          case Schema.Meta(_, _)                           => ???
          case Schema.Dynamic(_)                           => ???
          case Schema.SemiDynamic(_, _)                    => ???
        }

I think we can reduce them to a few fundamental types that are completely orthogonal. For example, Tuple, Option, and Either can be implemented using Record and Enum. Similarly, Transform, Lazy, Meta, Dynamic, and SemiDynamic seem to leak implementation details rather than describe just the schema of the type.
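To make the idea concrete, here is a minimal sketch of how Tuple, Option, and Either could be expressed with only a record node and a sum node. Everything in it (`MiniSchema`, `Record`, `Sum`, `tuple`, `option`, `either`, `describe`) is a hypothetical toy model invented for illustration; it is not the zio-schema API and ignores type parameters, annotations, and type IDs.

```scala
object FundamentalSchemaSketch {
  // Hypothetical toy model, NOT the zio-schema API.
  sealed trait MiniSchema
  case object IntType    extends MiniSchema
  case object StringType extends MiniSchema
  case object UnitType   extends MiniSchema
  final case class Record(fields: List[(String, MiniSchema)]) extends MiniSchema
  final case class Sum(cases: List[(String, MiniSchema)])     extends MiniSchema

  // A tuple is a record with positional field names.
  def tuple(left: MiniSchema, right: MiniSchema): MiniSchema =
    Record(List("_1" -> left, "_2" -> right))

  // An option is a sum with an empty case and a value case.
  def option(value: MiniSchema): MiniSchema =
    Sum(List("None" -> UnitType, "Some" -> value))

  // An either is a sum with a left case and a right case.
  def either(left: MiniSchema, right: MiniSchema): MiniSchema =
    Sum(List("Left" -> left, "Right" -> right))

  // Matching on the reduced model only needs primitives, Record, and Sum.
  def describe(schema: MiniSchema): String = schema match {
    case IntType    => "int"
    case StringType => "string"
    case UnitType   => "unit"
    case Record(fields) =>
      fields.map { case (name, s) => s"$name: ${describe(s)}" }.mkString("{", ", ", "}")
    case Sum(cases) =>
      cases.map { case (name, s) => s"$name(${describe(s)})" }.mkString(" | ")
  }
}
```

In this toy model, `describe(option(IntType))` yields `None(unit) | Some(int)`. Note that the encoding alone no longer tells a codec that a given sum was originally an Option, which is the kind of information a real design would still have to carry somewhere.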

I propose that we reduce the public schema types to more fundamental ones only.

Would love to know what everyone thinks about it.

jdegoes commented 1 year ago

I'm on board with this, conceptually. However, we do have some issues to resolve:

  1. Many formats have explicit support for Option (e.g. nullable columns in SQL, optional fields in protobuf, etc.). Codecs need the ability to know the meaning of null. Possibly a typeId in the Transform node could allow us to hold onto that.
  2. Meta and SemiDynamic should be gone. As for the Dynamic schema, some backends can directly store JSON-like content. While we can describe JSON-like content with a recursive schema, it may be more difficult for backends to process.
  3. Laziness is necessary for recursive schemas.
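
The laziness point can be illustrated with a small sketch. It assumes a hypothetical toy model (`MiniSchema`, `Record`, `Sum`, `Defer`, `intList`, `tailOf` are all invented names, not the zio-schema API) and shows why a recursive schema value needs some deferred node: without it, the definition would have to refer to itself while it is still being constructed.

```scala
object LazySchemaSketch {
  // Hypothetical toy model, NOT the zio-schema API.
  sealed trait MiniSchema
  case object IntType  extends MiniSchema
  case object UnitType extends MiniSchema
  final case class Record(fields: List[(String, MiniSchema)]) extends MiniSchema
  final case class Sum(cases: List[(String, MiniSchema)])     extends MiniSchema

  // Deferred node: the wrapped schema is only computed on first access.
  final class Defer(thunk: () => MiniSchema) extends MiniSchema {
    lazy val schema: MiniSchema = thunk()
  }
  object Defer {
    // By-name parameter, so the argument is not evaluated at construction time.
    def apply(schema: => MiniSchema): Defer = new Defer(() => schema)
  }

  // IntList = Nil | Cons(head: Int, tail: IntList)
  // The recursive reference goes through Defer, so building intList terminates.
  lazy val intList: MiniSchema =
    Sum(List(
      "Nil"  -> UnitType,
      "Cons" -> Record(List("head" -> IntType, "tail" -> Defer(intList)))
    ))

  // Follows the deferred "tail" field of the "Cons" case, if present.
  def tailOf(schema: MiniSchema): Option[MiniSchema] = schema match {
    case Sum(cases) =>
      cases
        .collectFirst { case ("Cons", Record(fields)) => fields }
        .flatMap(_.collectFirst { case ("tail", d: Defer) => d.schema })
    case _ => None
  }
}
```

Forcing the deferred tail leads back to the same `intList` value, so the structure is a finite graph with a cycle rather than an infinite tree. Whether such a node is exposed publicly or hidden behind the fundamental types is a separate design question.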

vigoo commented 1 year ago

Another important thing is that if we delete some of these, they have to be removed from all the different zio-schema types. For example, if we don't want to have an explicit Either, then we shouldn't have it in Schema, in DynamicValue, etc. I had the opposite problem while working on zio-flow: Schema had Either but DynamicValue did not, for example, so converting from/to dynamic values was losing information.
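
A rough sketch of that kind of information loss, using a hypothetical pair of toy types (`Value`, `Dyn`, `toDyn`, `fromDyn` are invented for illustration and are not the zio-schema or zio-flow definitions): the typed side has a dedicated Either node, the dynamic side only has a generic named case, so a round trip cannot restore the original shape.

```scala
object RoundTripSketch {
  // Hypothetical typed side: has a dedicated node for Either.
  sealed trait Value
  final case class IntValue(n: Int)                         extends Value
  final case class EitherValue(value: Either[Value, Value]) extends Value
  final case class CaseValue(name: String, value: Value)    extends Value

  // Hypothetical dynamic side: only knows generic named cases.
  sealed trait Dyn
  final case class DynInt(n: Int)                    extends Dyn
  final case class DynCase(name: String, value: Dyn) extends Dyn

  // Either has no counterpart in Dyn, so it is flattened into a named case.
  def toDyn(value: Value): Dyn = value match {
    case IntValue(n)           => DynInt(n)
    case EitherValue(Left(a))  => DynCase("Left", toDyn(a))
    case EitherValue(Right(b)) => DynCase("Right", toDyn(b))
    case CaseValue(name, a)    => DynCase(name, toDyn(a))
  }

  // Going back, a DynCase named "Left" could have been an Either or a user
  // enum with a case called "Left"; the converter has no way to tell.
  def fromDyn(dyn: Dyn): Value = dyn match {
    case DynInt(n)        => IntValue(n)
    case DynCase(name, a) => CaseValue(name, fromDyn(a))
  }
}
```

Here `fromDyn(toDyn(EitherValue(Left(IntValue(1)))))` comes back as `CaseValue("Left", IntValue(1))`, not as the original `EitherValue`. Keeping the set of node kinds identical on both sides avoids this particular mismatch.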

Also keep in mind that transformations cannot be serialized, so if there is any chance that two cases (let's say Either vs. an arbitrary sum type) need to be differentiated, that cannot be done by transformations alone. I'm not sure if there is anything like this in the library.

Also, it's possible that now that we have type IDs, some of these problems are easier to solve.