Open OleksandrKvl opened 2 years ago
A decoder generated from an earlier version of a message schema cannot decode the new group anyway. The downside is that it cannot correctly determine the message length by walking the blocks. For that purpose, a framing header such as Simple Open Framing Header or WebSocket Protocol is recommended.
The extension mechanism section already states the constraint: "Message header encoding cannot change." This should be amended to say "Message header encoding and group dimension of existing repeating groups cannot change."
@donmendelson This section says:
Correct me if I'm wrong, the above is only partially true:
A decoder generated from an earlier version of a message schema cannot decode the new group anyway.
Well, if group header and var-data length
field types are known (for example defaulted) and message/group header contains numGroups
and numVarDataFields
, it should be possible because you would have all the information to calculate the size of each message level. I thought that was the original intent... Then what's the purpose of those numGroups
and numVarDataFields
on the wire?
Actually, here's the direct quote from here:
Number of repeating groups and variable data. Message headers and repeating group dimensions carry a count of the number of repeating groups and a count of variable-length data fields on the wire. This supports a walk by a decoder of all the elements of a message, even when the decoder was built with an older version of a schema. As for added fixed-length fields, new repeating groups cannot be interpreted by the decoder, but it still can process the ones it knows, and it can correctly reach the end of a message.
@OleksandrKvl yes, it was the intent of the extension mechanism to be able to walk the blocks of a message even if the design of the blocks is unknown to the decoder. Previous proposals for version 2.0 did not anticipate that the repeating group dimension and length of variable-length fields could change for new blocks.
You proposed the possibility of a default dimension to be used for new blocks.
Another possibility would be that new blocks use the same dimension previous groups within a message. However, that would not cover the case of adding the first repeating group or variable-length field to a message. In that case there would have to some kind of default anyway.
I open to any other proposal that anyone would like to make.
@donmendelson I thought that SBE V1 is kinda "final" and complete but it contains exactly the same extension logic which seems not possible to implement (I'm actually writing SBE parser, that's why I want to know the details).
Another possibility would be that new blocks use the same dimension previous groups within a message.
This also doesn't cover a (rare) case where groups have different dimension types. There's already messageSchema.headerType
, possible solution might be to add messageSchema.groupSizeEncoding
and messageSchema.varDataEncoding
with the same defaults as their names. Similar attributes can be added to the <message>
element to allow per-message customization. Then the only limitation to extension would be that new groups/var-data fields should use default types and of course numGroups/numVarDataFields
must exist.
At high level, the intention of the extension mechanism was to support simple addition of new fields to a block without breaking old parsers. The proposal to allow addition of new repeating groups was initially rejected by the working group, and it was not included in SBE v1.0. The thought was that if a message is to be radically changed, then you should just issue a new template.
Since it was requested again by some users, the proposal was revived as SBE 2.x release candidate. I want to emphasize that a release candidate is provisional and subject to improvement. I would encourage @OleksandrKvl or any other interested parties to make proposals to resolve ambiguities in the spec. Any proposal entered as PR will be considered.
@donmendelson before making such a proposal, I'd like to understand whether it's possible to change <data>
attributes in a way I proposed in #148 (in particular providing just length type instead of composite) since proposal directly depends on it.
Extension mechanism is based on required
blockLength
and optionalnumGroups
andnumVarDataFields
. The latter two tell how many groups/var-data fields we have but they tell nothing about their dimension types so I don't understand how such extension can be handled. Consider the first version of the message:And the second version which adds another group:
The reader of the new version will see
numGroups
set to2
but that group has different dimension type and it's not possible to correctly handle it without its definition. The same happens with var-data fields. New ones can have differenttype
which means differentlength
type so it's not possible to correctly handle them.For groups I can imagine potential "fix" in form of saying that since group dimension is unknown it should be treated as default
groupSizeEncoding
so only groups with thisdimensionType
can be safely added to the message. But the standard says nothing about default type for var-data encoding. Can you please clarify how it's supposed to work?