There is still much to be done in evolution.go, but I have a good handle on that. Examples:
Consider NOT using Annotations to capture schema changes (it works fine, but it's verbose and error-prone)
Correct conversions for scalar <-> Union type changes
Handle Union <-> Union type changes (e.g. adding/removing a Type)
Capture TypeDefinition changes, e.g. to warn about added/removed non-Optional Record fields
Detect changes to TypeArguments
Other TODOs in code
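To make the scalar <-> Union item concrete, here is a minimal sketch of the two conversion directions. It assumes a Union maps to std::variant on the C++ side; OldField, NewField, ScalarToUnion and UnionToScalar are hypothetical names for illustration, not generated code.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <variant>

// Hypothetical field types: the old schema stored a plain int,
// the new schema stores a union of int and string.
using OldField = int;
using NewField = std::variant<int, std::string>;

// Scalar -> Union: always succeeds, because the scalar type is one of the
// union's alternatives.
inline NewField ScalarToUnion(OldField const& value) {
  return NewField{std::in_place_type<int>, value};
}

// Union -> scalar: only succeeds when the union currently holds the scalar
// type. The caller decides how to report the failure case.
inline std::optional<OldField> UnionToScalar(NewField const& value) {
  if (auto const* held = std::get_if<int>(&value)) {
    return *held;
  }
  return std::nullopt;
}
```

The asymmetry is the point: widening to a Union is lossless, while narrowing back to a scalar can fail at runtime and needs an explicit error path.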
The changes to the included C++ binary headers distinguish between schema_ and previous_schemas_ only to avoid breaking the NDJson and HDF5 code. This will be cleaned up, probably by using a single vector of schemas.
Codegen is not using the version label specified in the package file. Once the schema is known by the Protocol Reader/Writer, it just uses the schema index to determine which serializers to call.
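As a rough sketch of how a single vector of schemas could yield the schema index, the lookup below compares the schema found in the stream against each known schema. FindSchemaIndex is a hypothetical helper, and comparing schemas as plain strings is an assumption made for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper: known_schemas holds every schema of a protocol,
// oldest first and current last. Returns the position of the schema read
// from the stream, or nullopt if the stream's schema is not recognized.
inline std::optional<std::size_t> FindSchemaIndex(
    std::vector<std::string> const& known_schemas,
    std::string const& stream_schema) {
  for (std::size_t i = 0; i < known_schemas.size(); ++i) {
    if (known_schemas[i] == stream_schema) {
      return i;
    }
  }
  return std::nullopt;
}
```

With this shape the current schema is simply the last element, so the schema_ versus previous_schemas_ distinction is no longer needed.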
Need to determine the best way for a User to instantiate a Protocol Writer for an older version of a Protocol.
Currently, the User must first instantiate a Protocol Reader r using an older schema, then write MyProtocolWriter w(stream, r.GetSchema()).
We could generate unique constructors for each version, thereby utilizing the version label specified in the package file.
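One possible shape for this, sketched under assumptions: a generated enum carries the version labels and the writer constructor takes one of them. This is a variation on per-version constructors, not the current codegen. MyProtocolVersion and MyProtocolWriterSketch are hypothetical names, the schema strings are placeholders, and the stream argument is omitted to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical version labels, as they might be generated from the package file.
enum class MyProtocolVersion : std::size_t { v0 = 0, v1 = 1, Current = v1 };

// Hypothetical stand-in for a generated writer. It only records which schema
// it would emit; a real writer would also take the output stream.
class MyProtocolWriterSketch {
 public:
  explicit MyProtocolWriterSketch(
      MyProtocolVersion version = MyProtocolVersion::Current)
      : schema_index_(static_cast<std::size_t>(version)) {
    assert(schema_index_ < KnownSchemas().size());
  }

  std::string const& GetSchema() const { return KnownSchemas()[schema_index_]; }

 private:
  // Placeholder schema strings, oldest first.
  static std::vector<std::string> const& KnownSchemas() {
    static std::vector<std::string> const schemas{"schema-for-v0",
                                                  "schema-for-v1"};
    return schemas;
  }

  std::size_t schema_index_;
};
```

A User could then write MyProtocolWriterSketch w(MyProtocolVersion::v0) without first opening a Reader on an old file.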
Binary codegen needs a bit more cleanup to remove duplicate code for type conversions. Thoughts on the switch(schema_index_) {...} approach?
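For anyone weighing in, the switch(schema_index_) approach amounts to something like the sketch below. ReaderSketch, ReadFieldV0 and ReadFieldV1 are hypothetical stand-ins, and the int-to-double change is an invented example; the generated code differs in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical per-version deserializers. In version 0 the field was a
// 32-bit int; in version 1 (current) it is a double.
inline double ReadFieldV0(std::int32_t raw) { return static_cast<double>(raw); }
inline double ReadFieldV1(double raw) { return raw; }

struct ReaderSketch {
  std::size_t schema_index_;

  // Dispatch on the schema index to pick the matching deserializer, so the
  // caller always receives the current in-memory type.
  double ReadField(std::int32_t raw_v0, double raw_v1) const {
    switch (schema_index_) {
      case 0:
        return ReadFieldV0(raw_v0);
      case 1:
        return ReadFieldV1(raw_v1);
      default:
        throw std::runtime_error("unknown schema index");
    }
  }
};
```

Each changed field needs one such switch, which is where the duplicated conversion code comes from; a per-version table of function pointers would be one alternative to compare against.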
The example models and C++ code (within evolution/) are just a starting point for integration tests.
The current approach in yardl for supporting model evolution involves:
Details on this rough draft PR:
The relevant changes are in: