AlgebraicJulia / Catlab.jl

A framework for applied category theory in the Julia language
https://www.algebraicjulia.org
MIT License

Breaking change to data migration API #792

Closed KevinDCarlson closed 1 year ago

KevinDCarlson commented 1 year ago

This PR separates DataMigrations from DataMigrationFunctors. The former is currently just a functor from a schema to another schema or to a category of diagrams in (diagrams in) a schema. For upcoming functionality, a DataMigration M also contains a dictionary, params, intended to hold Julia functions describing how the attributes of migrate(X,M) are computed from the attributes of X. The semantic reason for this change is that a data migration is not the same thing as a functor: the functor tells you how to migrate, but isn't identical to the migration. In practice, though, the change was forced by the need for those params. The existing QueryDiagrams handle the need for params just fine when we're doing, well, queries, but not when we're doing mutations, as I'm working on elsewhere right now.

In more detail, we have a new tree of types rooted at AbstractDataMigration, which always has a func method returning the underlying functor, with children AbstractContravariantMigration and AbstractCovariantMigration. Depending on the codomain of the underlying functor, an AbstractContravariantMigration might be (identical to, not a supertype of) a DeltaSchemaMigration, a ConjSchemaMigration, etc. The only struct subtyping AbstractContravariantMigration thus far is DataMigration, which contains a functor and the aforementioned dictionary of parameters. If the dictionary's value type is Union{}, so that it cannot hold any entries, you have a TotalDataMigration, which might in particular be a TotalDeltaMigration, and which can be constructed using just a functor.
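To make the shape of that hierarchy concrete, here is a minimal self-contained sketch. It borrows the type names from the description above, but everything else is an assumption for illustration: `ToyFunctor`, the field names `functor` and `params`, the `Symbol` key type, and the exact form of the `TotalDataMigration` alias are hypothetical, not Catlab's actual definitions.

```julia
# Toy stand-in for a functor between schemas (hypothetical, not Catlab's Functor).
struct ToyFunctor
  ob_map::Dict{Symbol,Symbol}
end

# Root of the tree, with one child per variance.
abstract type AbstractDataMigration{F} end
abstract type AbstractContravariantMigration{F} <: AbstractDataMigration{F} end
abstract type AbstractCovariantMigration{F} <: AbstractDataMigration{F} end

# A migration bundles a functor with a dictionary of parameters.
# Field names are assumptions made for this sketch.
struct DataMigration{F,Params<:AbstractDict} <: AbstractContravariantMigration{F}
  functor::F
  params::Params
end

# Accessor for the underlying functor.
func(M::DataMigration) = M.functor

# A dictionary whose value type is Union{} can never hold an entry,
# so a migration with such a dictionary has no parameters at all.
const TotalDataMigration{F} = DataMigration{F,Dict{Symbol,Union{}}}

# A total migration can be built from the functor alone.
DataMigration(F::ToyFunctor) = DataMigration(F, Dict{Symbol,Union{}}())

# Usage: one total migration, one carrying a parameter.
F = ToyFunctor(Dict(:V => :V))
M_total = DataMigration(F)
M_param = DataMigration(F, Dict{Symbol,Function}(:weight => x -> 2x))
```

Encoding "no parameters" in the dictionary's type rather than in a separate struct means totality can be checked by dispatch (`M isa TotalDataMigration`) without inspecting the dictionary at run time.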

There's also some stuff for SigmaMigrations and DataMigrationFunctors, where there's not too much going on of interest.

The @migration macros now always return a DataMigration, and all methods of migrate now expect an AbstractDataMigration rather than a Functor. It would be easy, and might be a good quality-of-life improvement, to add back methods accepting just the underlying functor for TotalDataMigrations, but that won't be a breaking change, so I've ignored it for now.
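As a rough illustration of the convenience method mentioned above (which this PR deliberately does not add), the following toy sketch shows a `migrate` that dispatches on an abstract migration type, plus a fallback that wraps a bare functor. All names here (`ToyFunctor`, `ToyMigration`, the dictionary-of-vectors "instance") are hypothetical stand-ins, and the migration only pulls back along the object map; it is not Catlab's implementation.

```julia
abstract type AbstractDataMigration end

# Toy functor: maps objects of the target schema to objects of the source schema.
struct ToyFunctor
  ob_map::Dict{Symbol,Symbol}
end

# Toy migration: a functor plus a (possibly empty) dictionary of parameters.
struct ToyMigration <: AbstractDataMigration
  functor::ToyFunctor
  params::Dict{Symbol,Function}
end
func(M::ToyMigration) = M.functor

# A toy "instance" is just a table of integer parts per object.
const ToyInstance = Dict{Symbol,Vector{Int}}

# Main method: expects a migration, not a functor.
# Delta-style pullback on objects only: the parts at c are the parts of X at F(c).
function migrate(X::ToyInstance, M::AbstractDataMigration)
  F = func(M)
  Dict(c => X[d] for (c, d) in F.ob_map)
end

# Hypothetical convenience method: wrap a bare functor in a parameter-free migration.
migrate(X::ToyInstance, F::ToyFunctor) =
  migrate(X, ToyMigration(F, Dict{Symbol,Function}()))

# Usage: both call styles give the same result.
X = ToyInstance(:E => [1, 2, 3], :V => [1, 2])
F = ToyFunctor(Dict(:Pt => :V))
Y_wrapped = migrate(X, ToyMigration(F, Dict{Symbol,Function}()))
Y_bare = migrate(X, F)
```

Because the fallback only adds a method, existing calls that pass a migration are unaffected, which is why adding it later would not be breaking.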

The trickiest thing to decide on here was how to handle the relationship between DataMigrations and QueryDiagrams, which are almost isomorphic structs. The only objective difference is that QueryDiagrams include a parameter identifying which diagram category they live in. We were originally planning to eliminate QueryDiagrams entirely and make even the output of DiagrammaticPrograms.make_query be a DataMigration, but that would require making DataMigrations carry around a variance parameter and some fiddling with implementing limit etc. for DataMigrations, so I've left QueryDiagrams alive. Another option would be to just add the variance parameter so that `DataMigration==QueryDiagram`, and probably avoid all the new named types, but that seems unaesthetic and a bit disorganized to me. Happy for thoughts on this issue, though.

A side note: this PR involved the first error message I've ever seen that overflowed my entire terminal! That is, I had to pipe it to a text file to be able to scroll all the way to the top. These unergonomic error messages are a pretty nontrivial time cost. I think we've discussed and agreed that only Julia can do much about this issue, just wanted to note it again.