Model Evolution - Githubissues

The current approach in yardl for supporting model evolution involves:

- Comparing the current model to each previous version
- Annotating the current model with each detected change
- Processing the annotations to emit user warnings/errors
- In codegen, inspecting annotations to:
  1. Serialize previous versions of TypeDefinitions and Protocols
  2. Convert between types where necessary
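
To make the compare step concrete, here is a minimal sketch of diffing one record against a previous version. It is illustrative only: the real logic lives in evolution.go and works on the yardl DSL types, while Record, ChangeKind, Change and DiffRecords are hypothetical names made up for this example, and a record is reduced to a map from field name to type name.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified model: a record maps field names to type names.
using Record = std::map<std::string, std::string>;

enum class ChangeKind { FieldAdded, FieldRemoved, TypeChanged };

// One detected difference between the current model and a previous version.
struct Change {
  ChangeKind kind;
  std::string field;
};

// Compare the current record against one previous version and list the changes.
std::vector<Change> DiffRecords(const Record& current, const Record& previous) {
  std::vector<Change> changes;
  // Fields present now: either new, or possibly of a different type.
  for (const auto& entry : current) {
    auto it = previous.find(entry.first);
    if (it == previous.end()) {
      changes.push_back({ChangeKind::FieldAdded, entry.first});
    } else if (it->second != entry.second) {
      changes.push_back({ChangeKind::TypeChanged, entry.first});
    }
  }
  // Fields that existed before but are gone now.
  for (const auto& entry : previous) {
    if (current.find(entry.first) == current.end()) {
      changes.push_back({ChangeKind::FieldRemoved, entry.first});
    }
  }
  return changes;
}
```

A later pass could then walk the returned changes to decide which ones are warnings, which are errors, and which need a generated conversion.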

Details on this rough draft PR:

The relevant changes are in:
1. tooling/pkg/dsl/evolution.go
2. tooling/internal/cpp/include/detail/binary/header.h
3. tooling/internal/cpp/include/detail/binary/reader_writer.h
4. tooling/internal/cpp/protocols/protocols.go
5. tooling/internal/cpp/binary/binary.go
There is still much to be done in evolution.go but I have a good handle on that. Examples:
- Consider NOT using Annotations to capture schema changes (it works fine, but it's verbose and error-prone)
- Correct conversions for scalar <-> Union type changes
- Handle Union <-> Union type changes (e.g. adding/removing a Type)
- Capturing TypeDefinition changes, e.g. to warn about added/removed non-Optional Record fields
- Detect changes to TypeArguments
- Other TODOs in code
The changes to the included C++ binary headers distinguish between schema_ and previous_schemas_ only to avoid breaking the NDJson and HDF5 code. This will be cleaned up, probably by using just a single vector of schemas.
Codegen is not using the version label specified in the package file. Once the schema is known by the Protocol Reader/Writer, it just uses the schema index to determine which serializers to call.
We need to determine the best way for a User to instantiate a Protocol Writer for an older version of a Protocol. Currently, the User must first instantiate a Protocol Reader r using an older schema, then write MyProtocolWriter w(stream, r.GetSchema()).
- We could generate unique constructors for each version, thereby utilizing the version label specified in the package file.
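
As a rough sketch of the two options for selecting a version, the lookup below resolves either a schema string (what a writer receives from r.GetSchema() today) or a version label to a schema index. Everything here is an assumption for discussion: VersionedSchema, SchemaIndexFromSchema and SchemaIndexFromLabel are hypothetical names, not part of the generated code, and the list is assumed to hold the current schema at index 0.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical pairing of a package-file version label with its schema string.
struct VersionedSchema {
  std::string label;
  std::string schema;
};

// Resolve a schema string to its index in the list of known schemas.
std::size_t SchemaIndexFromSchema(const std::vector<VersionedSchema>& known,
                                  const std::string& schema) {
  for (std::size_t i = 0; i < known.size(); ++i) {
    if (known[i].schema == schema) return i;
  }
  throw std::invalid_argument("unrecognized schema");
}

// Alternative: resolve a version label, so no Reader has to be created first.
std::size_t SchemaIndexFromLabel(const std::vector<VersionedSchema>& known,
                                 const std::string& label) {
  for (std::size_t i = 0; i < known.size(); ++i) {
    if (known[i].label == label) return i;
  }
  throw std::invalid_argument("unknown version label: " + label);
}
```

Per-version constructors (or a factory taking the label) could call the label-based lookup internally, while the existing schema-based constructor keeps working for the Reader-to-Writer round trip.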
Binary codegen needs a bit more cleanup to remove duplicate code for type conversions. Thoughts on the switch(schema_index_) {...} approach?
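
For reference when discussing the switch(schema_index_) approach, here is a stripped-down sketch of its shape. It assumes a single field that was an int32 in the previous version and is a double in the current one. SampleReaderSketch and ReadRaw are hypothetical names, and the raw fixed-width reads are a simplification that does not reflect the actual binary encoding.

```cpp
#include <cstdint>
#include <istream>
#include <sstream>
#include <stdexcept>

// Simplified helper: read a fixed-width value straight from the stream.
template <typename T>
T ReadRaw(std::istream& in) {
  T value;
  in.read(reinterpret_cast<char*>(&value), sizeof(T));
  if (!in) throw std::runtime_error("unexpected end of stream");
  return value;
}

// Hypothetical reader for a protocol whose only field changed type.
class SampleReaderSketch {
 public:
  SampleReaderSketch(std::istream& in, int schema_index)
      : in_(in), schema_index_(schema_index) {}

  // Always returns the current type, whichever version is on the wire.
  double ReadValue() {
    switch (schema_index_) {
      case 0:  // current schema: stored as double
        return ReadRaw<double>(in_);
      case 1:  // previous schema: stored as int32, converted on read
        return static_cast<double>(ReadRaw<std::int32_t>(in_));
      default:
        throw std::runtime_error("unknown schema index");
    }
  }

 private:
  std::istream& in_;
  int schema_index_;
};
```

One tradeoff to weigh: the switch keeps every version's read path next to the current one, which is easy to follow but repeats the conversion at each call site. A table of per-version serializers indexed by schema_index_ might remove some of that duplication.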
The example models and C++ code (within evolution/) are just a starting point for integration tests.

microsoft / yardl

Model Evolution #121