Open skunkiferous opened 11 years ago
The generic object model, if implemented, should probably be based on Strings rather than IDs, to solve the issue of ID mappings differing between schemas. The generic object should use the fully qualified names of the properties, to avoid conflicts when multiple properties with the same name are defined through inheritance. The class name of the type should be stored in the "class" property (which is safe, since "class" is an invalid property name). Arrays, except maybe primitive arrays, must also be represented as generic objects, since we want to keep track of the array "class" too.
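A minimal sketch of what such a String-keyed generic object could look like, assuming Java. `GenericObject`, `set`, `get`, `ofArray` and `element` are hypothetical names for illustration, not an existing API of this project, and storing array elements under their index is just one possible choice:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical generic object: properties are keyed by fully qualified name. */
final class GenericObject {
    /** Reserved key for the type name; it cannot clash with a real property. */
    static final String CLASS_KEY = "class";

    private final Map<String, Object> properties = new LinkedHashMap<>();

    GenericObject(String className) {
        properties.put(CLASS_KEY, className);
    }

    String className() {
        return (String) properties.get(CLASS_KEY);
    }

    /** Stores a value under "declaringClass.property". */
    GenericObject set(String declaringClass, String property, Object value) {
        properties.put(declaringClass + "." + property, value);
        return this;
    }

    Object get(String declaringClass, String property) {
        return properties.get(declaringClass + "." + property);
    }

    /** Wraps an object array as a generic object, so the array class is kept. */
    static GenericObject ofArray(String arrayClassName, Object... elements) {
        GenericObject array = new GenericObject(arrayClassName);
        for (int i = 0; i < elements.length; i++) {
            array.properties.put(Integer.toString(i), elements[i]);
        }
        return array;
    }

    /** Reads an element of a generic object created by ofArray. */
    Object element(int index) {
        return properties.get(Integer.toString(index));
    }
}
```

Because the key includes the declaring class, a property `id` declared in both a parent and a child class is stored twice without conflict, for example under `com.example.Parent.id` and `com.example.Child.id`.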
The advanced/automatic migration scenarios have been documented in the wiki. One important point is that the "streaming migration" only works if there is a single version difference in the schema, because there is no way to combine multiple streaming migrators. Therefore it seems worthwhile to concentrate on the object-model migrations.
This issue can be closed when #40 and #41 are closed.
There are multiple possible levels of data-migration support. The lowest level is manual migration, so that is the one we should aim for first.
This means the code must know the schema version of the stream (using a context), and it must be possible to check it while de-serializing. One way is to check the version in each template. Another would be to have alternative template implementations, based on the schema version. The following facts must be taken into account:
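The two options for manual migration could be sketched as follows. This is only an illustration under assumptions: `ReadContext`, `Template`, `NameTemplate` and `VersionedTemplate` are hypothetical names, and the "first"/"last"/"fullName" fields are an invented example of a schema change.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical context carrying the schema version of the stream being read. */
final class ReadContext {
    final int schemaVersion;

    ReadContext(int schemaVersion) {
        this.schemaVersion = schemaVersion;
    }
}

/** Hypothetical template: turns raw field values into a domain value. */
interface Template<T> {
    T read(ReadContext context, Map<String, Object> fields);
}

/** Option 1: a single template that checks the schema version itself. */
final class NameTemplate implements Template<String> {
    @Override
    public String read(ReadContext context, Map<String, Object> fields) {
        if (context.schemaVersion < 2) {
            // Assumed old layout: first and last name stored separately.
            return fields.get("first") + " " + fields.get("last");
        }
        return (String) fields.get("fullName");
    }
}

/** Option 2: alternative template implementations, selected by schema version. */
final class VersionedTemplate<T> implements Template<T> {
    private final Map<Integer, Template<T>> byVersion = new HashMap<>();

    VersionedTemplate<T> register(int version, Template<T> template) {
        byVersion.put(version, template);
        return this;
    }

    @Override
    public T read(ReadContext context, Map<String, Object> fields) {
        Template<T> template = byVersion.get(context.schemaVersion);
        if (template == null) {
            throw new IllegalStateException(
                    "No template for schema version " + context.schemaVersion);
        }
        return template.read(context, fields);
    }
}
```

Option 1 keeps all versions of one type in a single place but grows with every schema change; option 2 keeps each implementation simple at the cost of one class (or lambda) per version.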
1) The IDs change between format versions.
2) While new classes are not an issue, classes that existed in the old format might be gone.
3) There are two major approaches, both with pros and cons: inline (streaming) migration, like ASM's streaming model or StAX, and the generic model, like ASM's object model or DOM.

The streaming migration is limited in capabilities, but is generally fast and has little memory overhead. The generic approach is the most powerful, but expensive to program and use. Code generation might help for both approaches.
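To make the trade-off between the two approaches concrete, here is a small sketch, assuming Java. `PropertySink`, `RenamingSink` and `ModelMigration` are hypothetical names, and the rename and merge operations are invented examples rather than migrations this project defines.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical event sink, in the spirit of a streaming (visitor-style) API. */
interface PropertySink {
    void property(String name, Object value);
}

/** Streaming migration: rewrites each event as it passes and buffers nothing. */
final class RenamingSink implements PropertySink {
    private final PropertySink next;
    private final String oldName;
    private final String newName;

    RenamingSink(PropertySink next, String oldName, String newName) {
        this.next = next;
        this.oldName = oldName;
        this.newName = newName;
    }

    @Override
    public void property(String name, Object value) {
        // Only the current event is visible, so only local changes are possible.
        next.property(name.equals(oldName) ? newName : name, value);
    }
}

/** Generic-model migration: the whole object is in memory and can be restructured. */
final class ModelMigration {
    static Map<String, Object> mergeNames(Map<String, Object> old) {
        Map<String, Object> migrated = new LinkedHashMap<>(old);
        // Combining two properties needs both values at once.
        Object first = migrated.remove("first");
        Object last = migrated.remove("last");
        migrated.put("fullName", first + " " + last);
        return migrated;
    }
}
```

A rename fits the streaming style because each event can be handled in isolation. Merging two properties is easier on the generic model, since it needs both values available at the same time.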