Feat: Codecs refactor

Abstract

Codecs are a vital part of a database driver; it's critical that they work correctly and quickly. Previously, the codec structure of EdgeDB.Net was simple and limited. This PR reworks the codecs' functionality, performance, and extensibility.

Prompted by issue #27, I wanted to: (A) fix the underlying precision issue; (B) add custom types that explicitly define EdgeDB datetime types (useful for codegen, the query builder, etc.); and (C) add implicit conversion between .NET system temporal types and EdgeDB temporal types.

Changes overview

With the new changes, I've rendered a map of all the codec types:

New datatypes

New temporal types have been added to the `EdgeDB.DataTypes` namespace:

| .NET type | EdgeDB type |
| --- | --- |
| `DateDuration` | `cal::date_duration` |
| `DateTime` | `datetime` |
| `Duration` | `duration` |
| `LocalDate` | `cal::local_date` |
| `LocalDateTime` | `cal::local_datetime` |
| `LocalTime` | `cal::local_time` |
| `RelativeDuration` | `cal::relative_duration` |

These data types all follow EdgeDB's precision (usually microseconds). When converting from system temporal types, the driver will round to the nearest microsecond.
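As a rough illustration of the rounding described above: .NET `TimeSpan` values count in 100-nanosecond ticks, so ten ticks make one microsecond. The helper below is a hypothetical sketch (`MicrosecondRounding` is not a type in EdgeDB.Net), and its tie-breaking rule for values exactly halfway between two microseconds is an assumption; the driver may choose differently.

```csharp
using System;

// Hypothetical helper for illustration only; not part of EdgeDB.Net.
public static class MicrosecondRounding
{
    // A .NET tick is 100 ns, so 10 ticks equal one microsecond.
    private const long TicksPerMicrosecond = 10;

    // Converts a TimeSpan to a whole number of microseconds,
    // rounding to the nearest microsecond.
    public static long ToMicroseconds(TimeSpan value)
    {
        // Use decimal so the division is exact for every tick count.
        decimal exact = value.Ticks / (decimal)TicksPerMicrosecond;

        // Assumption: exact halves round to the even neighbour.
        return (long)Math.Round(exact, MidpointRounding.ToEven);
    }
}
```

With this sketch, `MicrosecondRounding.ToMicroseconds(TimeSpan.FromTicks(16))` gives 2 and `TimeSpan.FromTicks(14)` gives 1.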

Complex codecs

Complex codecs are codecs that can define runtime codecs which serialize and deserialize a transient representation of a data type. The temporal codecs are a good example of complex codecs.

Structure

A complex codec is defined with a root converting type and a transient model. The transient model and the runtime converting types must be defined as structs, so that the transient model can hold information about the model as bytes. The complex codec exposes ways to convert:

The root converting type to the transient form.
The transient form to the runtime converting type.

With complex codecs, the codec can be safely stored in a cache keyed by the root converting type's EdgeDB type ID, and it acts as a broker for any supporting types the codec allows.

With the addition of complex codecs, the `System.Range` type can now be used with the EdgeDB data type `range<int32>`.
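The structure described above can be sketched roughly as follows. Every name here (`SketchDuration`, `DurationModel`, `SketchDurationCodec`) is hypothetical and chosen for illustration; the real codec API in this PR may look quite different. The sketch assumes a duration whose transient model is a microsecond count, with `TimeSpan` as one additional runtime converting type.

```csharp
using System;

// Transient model (assumed layout): the value as a raw microsecond count.
public readonly struct DurationModel
{
    public readonly long Microseconds;
    public DurationModel(long microseconds) { Microseconds = microseconds; }
}

// Hypothetical stand-in for the root converting type.
public readonly struct SketchDuration
{
    public readonly long Microseconds;
    public SketchDuration(long microseconds) { Microseconds = microseconds; }
}

// Hypothetical complex codec: brokers between the transient model
// and each runtime converting type it supports.
public sealed class SketchDurationCodec
{
    // Root converting type -> transient form.
    public DurationModel ToTransient(SketchDuration value)
    {
        return new DurationModel(value.Microseconds);
    }

    // Transient form -> a runtime converting type chosen by the caller.
    public T FromTransient<T>(DurationModel model) where T : struct
    {
        if (typeof(T) == typeof(SketchDuration))
            return (T)(object)new SketchDuration(model.Microseconds);

        // One microsecond is 10 .NET ticks (100 ns each).
        if (typeof(T) == typeof(TimeSpan))
            return (T)(object)TimeSpan.FromTicks(model.Microseconds * 10);

        throw new NotSupportedException("No converter for " + typeof(T));
    }
}
```

Because the codec itself holds no per-value state, a single instance could serve every supported runtime type, which is what makes caching it by type ID plausible.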

Compilable codecs

Compilable codecs are codecs that require contextual information (like .NET type information of values supplied to or deserialized from the query) to build. The main reason for this type of codec is to merge the step of 'walking' a codec tree for type results[^1] with the step of determining the correct generic arguments for generic codecs that need that contextual information.

A crucial use case that led to the creation of this type of codec was the need for a 'create now, configure later' codec. This allows codecs like `Array<T>` to be created with an inner complex codec that can be configured for any supporting inner type without recreating the generic definition of the codec: the codec is compiled once the inner complex codec is configured correctly.

This does mean that codecs now have to be stateless (which is ideal overall). This allows the compilable codec to be safely cached and reused.
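A minimal sketch of the 'create now, configure later' idea, under the assumption that compiling means closing a generic codec type over an element type that is only known at runtime. All names (`ISketchCodec`, `SketchScalarCodec<T>`, `SketchArrayCodec<T>`, `SketchCompilableArrayCodec`) are hypothetical and not taken from the PR.

```csharp
using System;

public interface ISketchCodec
{
    Type ConvertingType { get; }
}

// Hypothetical inner codec for a single element type.
public sealed class SketchScalarCodec<T> : ISketchCodec
{
    public Type ConvertingType { get { return typeof(T); } }
}

// Hypothetical generic array codec wrapping an inner codec.
public sealed class SketchArrayCodec<T> : ISketchCodec
{
    public SketchArrayCodec(SketchScalarCodec<T> inner) { Inner = inner; }
    public SketchScalarCodec<T> Inner { get; }
    public Type ConvertingType { get { return typeof(T[]); } }
}

// Created up front without knowing T. It is stateless, so one instance
// can be cached and compiled for as many element types as needed.
public sealed class SketchCompilableArrayCodec
{
    public ISketchCodec Compile(Type elementType)
    {
        // Close the open generic definitions over the runtime type.
        Type innerType = typeof(SketchScalarCodec<>).MakeGenericType(elementType);
        Type arrayType = typeof(SketchArrayCodec<>).MakeGenericType(elementType);

        object inner = Activator.CreateInstance(innerType);
        return (ISketchCodec)Activator.CreateInstance(arrayType, inner);
    }
}
```

Calling `Compile(typeof(int))` and `Compile(typeof(string))` on the same instance produces two independent codecs, and the compilable codec itself is never modified.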

Codec visitors

As expressed above, there is a need to 'walk' the codec tree to correct it based on type context. For this, I wrote a CodecVisitor to walk the tree and mutate the codecs for different contextual needs.

The TypeVisitor walks a given codec tree. It's responsible for:

Compiling CompilableCodecs to their corrected form for a given type.
Initializing object codecs with result type information for deserialization.
Getting the underlying codec for the current type from a complex codec.

Thus, the codec visitor unifies all operations that modify the codec tree into a single pass over it.
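The single-pass idea can be sketched as below. This is not the PR's `TypeVisitor`; the node classes and `SketchTypeVisitor` are hypothetical, and the sketch covers only the first two responsibilities listed above (compiling compilable codecs and initializing object codecs with a result type). It assumes object properties are matched to .NET properties by name.

```csharp
using System;
using System.Collections.Generic;

public abstract class SketchNode { }

public sealed class ScalarNode : SketchNode
{
    public ScalarNode(Type type) { Type = type; }
    public Type Type { get; }
}

// Placeholder for an array codec whose element type is not yet known.
public sealed class CompilableArrayNode : SketchNode { }

public sealed class ArrayNode : SketchNode
{
    public ArrayNode(Type elementType) { ElementType = elementType; }
    public Type ElementType { get; }
}

public sealed class ObjectNode : SketchNode
{
    public ObjectNode(Dictionary<string, SketchNode> properties) { Properties = properties; }
    public Dictionary<string, SketchNode> Properties { get; }
    public Type ResultType { get; set; }
}

// Example result type used to drive the visitor.
public class SketchPerson
{
    public string Name { get; set; }
    public int[] Scores { get; set; }
}

public static class SketchTypeVisitor
{
    // Walks the tree once, correcting each node for the target type.
    public static SketchNode Visit(SketchNode node, Type target)
    {
        switch (node)
        {
            case CompilableArrayNode _ when target.IsArray:
                // "Compile" the placeholder now that the element type is known.
                return new ArrayNode(target.GetElementType());
            case ObjectNode obj:
                // Initialize the object node with its result type, then recurse.
                obj.ResultType = target;
                foreach (string name in new List<string>(obj.Properties.Keys))
                {
                    var property = target.GetProperty(name);
                    if (property != null)
                        obj.Properties[name] = Visit(obj.Properties[name], property.PropertyType);
                }
                return obj;
            default:
                return node;
        }
    }
}
```

Visiting a tree with a `Name` scalar and a `Scores` placeholder against `typeof(SketchPerson)` sets the object's result type and replaces the placeholder with an `ArrayNode` for `int` in the same traversal.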

[^1]: See the old ObjectEnumerator implementation for initializing object codecs with result types.

edgedb / edgedb-net

Feat: Codecs refactor #28