spartanz / schemaz

A purely-functional library for defining type-safe schemas for algebraic data types, providing free generators, SQL queries, JSON codecs, binary codecs, and migration from this schema definition
https://spartanz.github.io/schemaz
Apache License 2.0

Alternative schema encoding #26

Closed vil1 closed 5 years ago

vil1 commented 5 years ago

This is a proposal for a new way to encode schemas.

The core encoding relies on three "essential" nodes, One, :+: and :*: (unit, sum and product), which should theoretically be enough to represent any type. For example, Option[A] is now represented as A \/ Unit.
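To make the shape of such an encoding concrete, here is a minimal sketch. It is not the code from this PR: the `Prim` leaf, the `option` and `describe` helpers and the enclosing object are hypothetical, and the standard library's `Either` stands in for scalaz's `\/`.

```scala
object EncodingSketch {
  // Every node is a Schema, so the tree is homogeneous.
  sealed trait Schema[A]

  // The three "essential" nodes: unit, sum and product.
  case object One extends Schema[Unit]
  final case class :+:[A, B](left: Schema[A], right: Schema[B]) extends Schema[Either[A, B]]
  final case class :*:[A, B](left: Schema[A], right: Schema[B]) extends Schema[(A, B)]

  // Hypothetical leaf for primitive types, identified by a name only.
  final case class Prim[A](name: String) extends Schema[A]

  // Option[A] seen as "A or Unit" (Either stands in for \/).
  def option[A](a: Schema[A]): Schema[Either[A, Unit]] = :+:(a, One)

  // A simple structural fold that renders a schema as an algebraic expression.
  def describe[A](schema: Schema[A]): String = schema match {
    case One     => "1"
    case Prim(n) => n
    case l :+: r => s"(${describe(l)} + ${describe(r)})"
    case l :*: r => s"(${describe(l)} * ${describe(r)})"
  }
}
```

With these definitions, `EncodingSketch.describe(EncodingSketch.option(EncodingSketch.Prim[Int]("Int")))` renders the optional integer as `(Int + 1)`.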

The encoding also provides "extra" nodes to ease the representation of records, unions and sequences.

The main selling point for this encoding is that it makes the whole tree homogeneous: all nodes are Schemas, as opposed to the previous encoding, which was based on FreeAp in records.

This makes the use of recursion schemes (see cataNT) rather easy.

GrafBlutwurst commented 5 years ago

There is an issue where Representation and a given Schema don't share the same types of term ids. In the JSON case, I think this currently leads to a subtle toString which might come back to bite us later. We already have some proposals on the table to fix this, see: https://gitter.im/scalaz/scalaz-schema/archives/2018/12/20

But I feel that this should probably be fixed in this PR to avoid unexpected behavior down the line.

vil1 commented 5 years ago

Representation (which is probably poorly named) is a way to separate what depends only on the target functor from what depends only on the structure of the schema.

All target functors must have some minimal capabilities to be able to reflect a schema's structure, but some might also need to perform additional operations to represent records or unions.

The former capabilities are given by the Alt / Decidable instances while the latter are bundled in the Representation instance.

By default, handling a record (resp. a union) involves handling the product (resp. the sum) it contains, via the target functor's Alt/Decidable instance, and mapping the resulting pair through the Iso. Some targets, like Gen, don't need any further processing, but others, like Encoder, need some additional operation (i.e. to wrap the result in curly braces).

That's why some methods in Representation have a default implementation that does nothing (@julienrf is that what you refer to as incorrect default implementation?).
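A rough sketch of that split, under simplifying assumptions: the single `Target` trait below is hypothetical and merges the Alt/Decidable-like capabilities (`unit`, `product`, `sum`, `imap`) with Representation-like hooks (`field`, `record`) whose defaults do nothing. The node names, the `compile` method, the toy `Encoder` and the `List`-based sample generator are all made up for illustration and are not the library's API.

```scala
object RepresentationSketch {
  trait Target[F[_]] {
    // Minimal capabilities every target functor must provide.
    def unit: F[Unit]
    def string: F[String]
    def int: F[Int]
    def product[A, B](fa: F[A], fb: F[B]): F[(A, B)]
    def sum[A, B](fa: F[A], fb: F[B]): F[Either[A, B]]
    def imap[A, B](fa: F[A])(to: A => B, from: B => A): F[B]
    // Representation-like hooks: the defaults leave the value untouched.
    def field[A](name: String, fa: F[A]): F[A] = fa
    def record[A](fa: F[A]): F[A] = fa
  }

  // The structure is handled once, here, for every target functor.
  sealed trait Schema[A] { def compile[F[_]](implicit F: Target[F]): F[A] }
  case object One extends Schema[Unit] {
    def compile[F[_]](implicit F: Target[F]): F[Unit] = F.unit
  }
  case object Str extends Schema[String] {
    def compile[F[_]](implicit F: Target[F]): F[String] = F.string
  }
  case object Num extends Schema[Int] {
    def compile[F[_]](implicit F: Target[F]): F[Int] = F.int
  }
  final case class Prod[A, B](l: Schema[A], r: Schema[B]) extends Schema[(A, B)] {
    def compile[F[_]](implicit F: Target[F]): F[(A, B)] = F.product(l.compile[F], r.compile[F])
  }
  final case class Sum[A, B](l: Schema[A], r: Schema[B]) extends Schema[Either[A, B]] {
    def compile[F[_]](implicit F: Target[F]): F[Either[A, B]] = F.sum(l.compile[F], r.compile[F])
  }
  final case class Field[A](name: String, s: Schema[A]) extends Schema[A] {
    def compile[F[_]](implicit F: Target[F]): F[A] = F.field(name, s.compile[F])
  }
  // A record: handle the product, map it through the iso, then apply the hook.
  final case class Record[P, A](fields: Schema[P], to: P => A, from: A => P) extends Schema[A] {
    def compile[F[_]](implicit F: Target[F]): F[A] = F.record(F.imap(fields.compile[F])(to, from))
  }

  // Toy encoder target: it overrides the hooks to add labels and curly braces.
  type Encoder[A] = A => String
  implicit val encoderTarget: Target[Encoder] = new Target[Encoder] {
    val unit: Encoder[Unit] = _ => "null"
    val string: Encoder[String] = s => "\"" + s + "\""
    val int: Encoder[Int] = _.toString
    def product[A, B](fa: Encoder[A], fb: Encoder[B]): Encoder[(A, B)] = {
      case (a, b) => fa(a) + "," + fb(b)
    }
    def sum[A, B](fa: Encoder[A], fb: Encoder[B]): Encoder[Either[A, B]] = _.fold(fa, fb)
    def imap[A, B](fa: Encoder[A])(to: A => B, from: B => A): Encoder[B] = b => fa(from(b))
    override def field[A](name: String, fa: Encoder[A]): Encoder[A] =
      a => "\"" + name + "\":" + fa(a)
    override def record[A](fa: Encoder[A]): Encoder[A] = a => "{" + fa(a) + "}"
  }

  // Toy generator-like target (a list of samples): the default hooks are enough.
  implicit val listTarget: Target[List] = new Target[List] {
    val unit: List[Unit] = List(())
    val string: List[String] = List("a")
    val int: List[Int] = List(0)
    def product[A, B](fa: List[A], fb: List[B]): List[(A, B)] = for (a <- fa; b <- fb) yield (a, b)
    def sum[A, B](fa: List[A], fb: List[B]): List[Either[A, B]] = fa.map(Left(_)) ++ fb.map(Right(_))
    def imap[A, B](fa: List[A])(to: A => B, from: B => A): List[B] = fa.map(to)
  }

  final case class Person(name: String, age: Int)
  val person: Schema[Person] = Record[(String, Int), Person](
    Prod(Field("name", Str), Field("age", Num)),
    { case (n, a) => Person(n, a) },
    p => (p.name, p.age)
  )

  val encode: Encoder[Person] = person.compile[Encoder]
  val json: String = encode(Person("Ada", 36))
  val samples: List[Person] = person.compile[List]
}
```

In this sketch the encoder only has to say how to label a field and wrap a record, while the sample generator overrides nothing at all; the traversal of the schema is shared by both.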

The goal of all that is for clients wanting to project schemas onto their own target functor to focus only on the specifics of their functor and forget about how the whole schema's structure is handled.

Regarding the conversion of ProductTermId and SumTermId (which may get renamed at some point), the current implementation of Encoder indeed relies on a hacky toString, but that's a mere shortcut (please consider that Encoder is a toy example that will be replaced with something more "real-world" later).

GrafBlutwurst commented 5 years ago

LGTM :+1: