spartanz / schemaz

A purely-functional library for defining type-safe schemas for algebraic data types, providing free generators, SQL queries, JSON codecs, binary codecs, and migration from this schema definition
https://spartanz.github.io/schemaz
Apache License 2.0

Migrations #61

Closed vil1 closed 5 years ago

vil1 commented 5 years ago

I can't believe how long it took to come up with this +718-194 change!

This PR introduces a usable way to express schema evolution (migrations).

The tale of how this piece of code was written shall be sung elsewhere, but here are the details you might need to make sense of it.

Imagine you have a data type that looks like the following:

case class PersonV1(username: String, age: Int)

And suppose you've just added the age field. In other words, in the previous version of your code, you had:

case class PersonV0(username: String)

I've named them PersonV0 and PersonV1 for clarity, but they are really two versions of a single type Person.

The "evolution problem" can then be stated as follows: using only PersonV1, how can we build a reader (for any given format) that reads serialized forms of PersonV0 into PersonV1 instances (backward compatibility), and a writer that serializes instances of PersonV1 into messages compatible with the (old) PersonV0 format (forward compatibility)?
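
To make the two directions concrete, here is a hand-written sketch of what such a reader and writer would do. Everything in it is an assumption made for illustration: the Message type, the defaultAge value and the readV0/writeV0 names are hypothetical and are not part of schemaz. The point of the PR is to derive this kind of code from the schema instead of writing it by hand.

```scala
// Hypothetical illustration; none of these names come from schemaz.
object EvolutionSketch {
  // Plays the role of PersonV1, the only version present in the code base.
  final case class Person(username: String, age: Int)

  // A toy "serialized form": field name -> raw value.
  type Message = Map[String, String]

  // Assumed default for data written before the `age` field existed.
  val defaultAge: Int = 0

  // Backward compatibility: read a V0 message (no `age`) into the current type.
  def readV0(msg: Message): Option[Person] =
    msg.get("username").map(u => Person(u, defaultAge))

  // Forward compatibility: write the current type in the V0 shape (no `age`).
  def writeV0(p: Person): Message =
    Map("username" -> p.username)
}
```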

This PR solves this problem in two steps.

The SchemaZ phantom type

The SchemaZ[Repr, A] type alias is just a "self-reflexive fix-point over a higher-order functor", no big deal ;).

More seriously, the smart constructors in SchemaModule now return SchemaZ[Repr, A] instead of Schema[A]. The Repr phantom type is a type-level materialization of the structure of the underlying schema.

SchemaZ[Repr, A] reads as: "(a schema of) the type A represented as Repr".

Given this SchemaZ type, a migration of a schema for A is a function SchemaZ[R0, A] => SchemaZ[R1, A], that is, a function that changes only the way the type A is represented (for example by discarding a field and replacing it with a default value).

It is important to note that, in order to be provided as a library, these SchemaZ[R0, A] => SchemaZ[R1, A] functions must be polymorphic in R0 and A and, moreover, the R1 type depends on R0.
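
A minimal sketch of the idea, under loud assumptions: the SchemaZ case class below is a toy stand-in (a reader/writer pair over a Map) and not the library's fix-point encoding, and Username, Age and dropLast are hypothetical names. It only shows a phantom Repr that tracks the record's shape, and a migration that is polymorphic in its input and whose output representation is computed from the input one.

```scala
import scala.util.Try

// Toy model only; the real SchemaZ in schemaz is defined differently.
object PhantomSketch {
  type Message = Map[String, String]

  // Repr is phantom: it occurs in the type only and records the schema's shape.
  final case class SchemaZ[Repr, A](read: Message => Option[A], write: A => Message)

  // Hypothetical type-level tags naming the fields of the record.
  sealed trait Username
  sealed trait Age

  final case class Person(username: String, age: Int)

  // "(a schema of) the type Person represented as (Username, Age)".
  val personV1: SchemaZ[(Username, Age), Person] =
    SchemaZ[(Username, Age), Person](
      m =>
        for {
          u <- m.get("username")
          a <- m.get("age").flatMap(s => Try(s.toInt).toOption)
        } yield Person(u, a),
      p => Map("username" -> p.username, "age" -> p.age.toString)
    )

  // A migration: polymorphic in Head, Last and A. The output representation
  // (Head) is determined by the input representation ((Head, Last)).
  // The dropped field is filled with a default on read and removed on write.
  def dropLast[Head, Last, A](
      name: String,
      default: String
  ): SchemaZ[(Head, Last), A] => SchemaZ[Head, A] =
    (s: SchemaZ[(Head, Last), A]) =>
      SchemaZ[Head, A](
        m => s.read(m + (name -> default)),
        a => s.write(a) - name
      )

  // Same type Person, now represented without the age field.
  val personV0: SchemaZ[Username, Person] =
    dropLast[Username, Age, Person]("age", "0")(personV1)
}
```

In this toy version the dependency between R0 and R1 is simple enough to be written as ordinary type parameters; the difficulty described in the next section comes from migrations that target fields nested arbitrarily deep in the representation.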

Versioning

It turns out that this kind of "dependently typed function" is too complicated for scalac in general. It works only for migrations that add/remove/rename a field of a record (or a branch of a union), but not for migrations that affect arbitrarily nested fields (or branches).

In order to use migrations, one must therefore break their schema into smaller pieces and build bigger schemas out of the smaller ones.

The Versioning module helps with this, without having to rebuild the whole graph manually for each version.

Its schema method is used to register "schema constructors", that is, functions of zero up to five arguments of the shape (Schema[A0], ..., Schema[A5]) => SchemaZ[Repr, A].

Its migrate method applies a migration to a registered schema. This migration is automatically propagated to all the registered schemas that depend on (contain) the migrated schema.

Finally, its lookup method is used to retrieve a schema within a version, to be used for typeclass derivation.
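
The following is a deliberately untyped toy model of that workflow, not the Versioning module itself. Only the method names schema, migrate and lookup are taken from the description above; their signatures, the string keys, and the Schema, Entry and Version types are assumptions made for this sketch. It shows why registering constructors (rather than finished schemas) lets a migration propagate to dependents.

```scala
// Toy model; the real Versioning module is typed and works differently.
object VersioningSketch {
  // A toy "schema": just the list of field paths it describes.
  final case class Schema(fields: List[String])

  // A registered constructor: the names of the schemas it depends on,
  // and how to build the schema once those dependencies are available.
  final case class Entry(deps: List[String], build: List[Schema] => Schema)

  final case class Version(entries: Map[String, Entry]) {
    // Register a schema constructor under a name (hypothetical signature).
    def schema(name: String, deps: String*)(build: List[Schema] => Schema): Version =
      Version(entries + (name -> Entry(deps.toList, build)))

    // Apply a migration to one registered schema. Dependents are not touched
    // here; they pick up the change because lookup rebuilds them on demand.
    def migrate(name: String)(f: Schema => Schema): Version = {
      val e = entries(name)
      Version(entries + (name -> Entry(e.deps, e.build.andThen(f))))
    }

    // Build the named schema by first building everything it depends on.
    def lookup(name: String): Schema = {
      val e = entries(name)
      e.build(e.deps.map(lookup))
    }
  }

  // Current version: a team schema built from the person schema.
  val v1: Version = Version(Map.empty)
    .schema("person")(_ => Schema(List("username", "age")))
    .schema("team", "person")(ds => Schema("name" :: ds.head.fields.map("lead." + _)))

  // Previous version: only person is migrated, team follows automatically.
  val v0: Version =
    v1.migrate("person")(s => Schema(s.fields.filterNot(_ == "age")))
}
```

Here v0.lookup("team") no longer contains the lead.age path even though only person was migrated, which is the propagation behaviour described above.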

Caveat

codecov-io commented 5 years ago

Codecov Report

Merging #61 into prototyping will decrease coverage by 2.74%. The diff coverage is 77.65%.


@@               Coverage Diff               @@
##           prototyping      #61      +/-   ##
===============================================
- Coverage        73.48%   70.73%   -2.75%     
===============================================
  Files               15       18       +3     
  Lines              396      516     +120     
  Branches            12       11       -1     
===============================================
+ Hits               291      365      +74     
- Misses             105      151      +46

Impacted Files                                           Coverage Δ
...ules/play-json/src/main/scala/PlayJsonModule.scala    0% <ø> (ø)
modules/core/src/main/scala/Json.scala                   92.85% <100%> (-7.15%)
modules/core/src/main/scala/recursion.scala              100% <100%> (ø)
...dules/tests/src/main/scala/GenModuleExamples.scala    95.23% <100%> (ø)
modules/tests/src/main/scala/JsonExamples.scala          65.71% <100%> (-0.96%)
modules/tests/src/main/scala/ShowExamples.scala          75% <100%> (ø)
modules/tests/src/main/scala/Person.scala                100% <100%> (ø)
modules/tests/src/main/scala/Main.scala                  100% <100%> (ø)
modules/core/src/main/scala/migrations.scala             33.33% <33.33%> (ø)
modules/core/src/main/scala/Versioning.scala             73.91% <73.91%> (ø)
... and 10 more


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 1dfadb7...00ead51.

vil1 commented 5 years ago

@GrafBlutwurst this indeed deserves a lot of carefully crafted documentation. It also needs a way to make compilation error messages more informative (less freakin' scary if you prefer).

But I'd like to make sure that this approach is the right one before I commit to writing thorough documentation for it.

After this PR gets merged, my plan is to:

Depending on how it'll turn out, we might need to make drastic changes to the whole thing, making the documentation efforts undertaken in the meantime pretty useless.