composite types - Githubissues

QIvan commented 7 years ago

Hello! I have a question about the difference between a composite type, a group and a message. I already asked here, and Martin recommended that I recreate the issue in this project.

Can I shorten a schema that contains an entity which I want to be able to send either as a message, as part of another message, or as a group? For example, I have an asset. In Java classes it would be something like this:

public class Asset {
    char[] isin;
    char[] name;
}

Obviously I should be able to send an asset as a message. But an asset can also be part of a quotation:

public class Quotation {
    double quotation;
    Asset asset;
}

and a client's portfolio contains a list of assets:

public class Portfolio {
    char[] portrolioName;
    Collection<Asset> assets;
}

If I understood correctly, my schema would be something like this:

    <types>
        <composite name="groupSizeEncoding" description="Repeating group dimensions">
            <type name="blockLength" primitiveType="uint16"/>
            <type name="numInGroup" primitiveType="uint16"/>
        </composite>
    </types>
    <types>
        <type name="str" primitiveType="char" length="12"/>
        <type name="quote" primitiveType="double"/>
        <composite name="AssetType">
            <type name="isin" primitiveType="char" length="3"/>
            <type name="name" primitiveType="char" length="3"/>
        </composite>
    </types>

    <sbe:message name="Asset" id="1">
        <field name="isin" id="1" type="str"/>
        <field name="name" id="2" type="str"/>
    </sbe:message>
    <sbe:message name="Quotation" id="2">
        <field name="quotation" id="1" type="quote"/>
        <field name="asset" id="2" type="AssetType"/>
    </sbe:message>
    <sbe:message name="Portfolio" id="3">
        <field name="portrolioName" id="1" type="str"/>
        <group name="assets" id="2" dimensionType="groupSizeEncoding">
            <field name="isin" id="1" type="str"/>
            <field name="name" id="2" type="str"/>
        </group>
    </sbe:message>

In this schema the fields isin and name are repeated three times. Can I make the schema a little shorter?

Note that this is only an example; I know there are better ways to do this particular task.

donmendelson commented 7 years ago

I proposed a solution in this issue: "Reuse a common block of fields in multiple messages" #13. It was not accepted for SBE version 1.0, but we can reconsider it or adopt a better solution for version 2.0.

Let me give my view of the definition of the elements you asked about plus some more.

Message: the encoding of a unit of work at the application layer. It could be a request, a response, or an unsolicited event. A message is composed of fields and groups of fields.

Field: the smallest unit of semantics (business meaning). Semantics is abstract, but it is made concrete in SBE or other encoding. A field is associated with a wire format in SBE through a type.

Datatype: a context-free classification of data. Datatypes are reusable but carry little or no semantics. The standard ISO 11404 General Purpose Datatypes contains an excellent taxonomy. It is intended to be independent of encoding, platform and programming language. Datatypes may be primitive or derived from primitive types. In SBE schema, we have <type> for simple types and <composite> composed of two or more primitives.

I make a distinction between a composite datatype and a group of fields. A composite is still a datatype and thus is lean on semantics, while fields do carry business semantics. For example, a scaled number (one of the ISO 11404 types) is composed of factor and radix. But it carries no business meaning on its own. It could be used to represent a price, quantity, or monetary amount.
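To make the distinction concrete, here is a minimal Java sketch. The class names `Decimal` and `OrderFields`, the `mantissa`/`exponent` naming, and the base-10 radix are assumptions for illustration only; they are not taken from the SBE specification or from this thread. The same context-free datatype is reused by two fields, and only the fields carry business meaning.

```java
import java.math.BigDecimal;

// Hypothetical composite datatype: a scaled number, assuming radix 10.
// It says nothing about what the number means in the business domain.
final class Decimal {
    final long mantissa;
    final int exponent; // power of ten applied to the mantissa

    Decimal(long mantissa, int exponent) {
        this.mantissa = mantissa;
        this.exponent = exponent;
    }

    // Value = mantissa * 10^exponent
    BigDecimal toBigDecimal() {
        return BigDecimal.valueOf(mantissa).scaleByPowerOfTen(exponent);
    }
}

// Hypothetical group of fields: the field names (price, quantity)
// supply the semantics; both reuse the same datatype.
final class OrderFields {
    final Decimal price;
    final Decimal quantity;

    OrderFields(Decimal price, Decimal quantity) {
        this.price = price;
        this.quantity = quantity;
    }
}
```

For instance, `new OrderFields(new Decimal(12345, -2), new Decimal(100, 0))` would use one datatype for both a price and a quantity.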

In the current SBE schema we do not have a structure for a group of fields. (Called a record in ISO 11404.) That is what you are asking for, and what I proposed in issue #13.

We do, however, have a structure for an array of blocks of fields. That is called a repeating group in FIX or just <group> in the schema.
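As a rough illustration of what such an array of blocks looks like on the wire, the sketch below writes the `assets` group from the schema earlier in this thread by hand: a dimension header (`blockLength`, `numInGroup`, both uint16 as in `groupSizeEncoding`) followed by fixed-size entries. The class `AssetGroupSketch` and its methods are hypothetical, and little-endian byte order, ASCII text and NUL padding are assumptions; a real application would use code generated from the schema rather than this hand-written layout.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical hand-rolled encoder for the "assets" repeating group.
final class AssetGroupSketch {
    static final int STR_LENGTH = 12;               // type "str": char, length 12
    static final int BLOCK_LENGTH = 2 * STR_LENGTH; // isin + name per entry
    static final int HEADER_LENGTH = 4;             // two uint16 values

    // Writes a fixed-length char field, padded with NUL bytes (assumed padding).
    static void putFixed(ByteBuffer buf, String value) {
        byte[] bytes = value.getBytes(StandardCharsets.US_ASCII);
        for (int i = 0; i < STR_LENGTH; i++) {
            buf.put(i < bytes.length ? bytes[i] : (byte) 0);
        }
    }

    // Each asset is given as {isin, name}.
    static ByteBuffer encode(String[][] assets) {
        ByteBuffer buf = ByteBuffer
                .allocate(HEADER_LENGTH + assets.length * BLOCK_LENGTH)
                .order(ByteOrder.LITTLE_ENDIAN); // assumed byte order
        buf.putShort((short) BLOCK_LENGTH);  // blockLength
        buf.putShort((short) assets.length); // numInGroup
        for (String[] asset : assets) {
            putFixed(buf, asset[0]); // isin
            putFixed(buf, asset[1]); // name
        }
        buf.flip(); // make the written bytes readable from position 0
        return buf;
    }
}
```

The header is what lets a decoder step through the entries without knowing their fields: it reads `numInGroup` blocks of `blockLength` bytes each.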

To wrap up, let me introduce another concept and possible solution. All of the definitions of FIX messages, fields and other elements are contained in the FIX Repository. The FIX specifications are generated from the Repository, and a number of tools, such as FIXimate, consume it. The Repository schema does have an element for a reusable block of fields, called a component. (A repeating group is a specialization of a component.) In short, FIX already has a schema with what you are looking for, and it could potentially be adapted to SBE.

There is a project underway to enhance the Repository, called FIX Orchestra. Its motto is "machine readable rules of engagement." Among its features are richer mapping of FIX datatypes to various encodings, including SBE, capture of workflow as well as message structure, and machine-parseable conditions for conditionally required fields and the like.

QIvan commented 7 years ago

Thank you very much for your rich answer! Sorry for the long delay; it took me a while to think and investigate so that my answer would be as good as yours. =)

Datatype: a context-free classification of data. Datatypes are reusable but carry little or no semantics.

You are absolutely right about that. In fact this is the main reason why I created this issue: I do not want to create a type with business semantics. But from time to time one message should be part of another one, as a single field or as a collection. I think it is something like the one-to-one and one-to-many data models.

Even Martin Thompson's sample has an engine as part of a car. My main question is: what should I do if I need to be able to respond about a car and about an engine in the same system? I see three ways. The first is normalization: I can create an id for each engine, store that id in the car, and ask the user to make a request by this id, or even do something like HATEOAS. I think this is a good way in theory, but in practice it can be a bad idea if you care about performance. The second is copy and paste; I think its advantages and disadvantages are obvious. The third is to create a complex type engine, so that the car message stays as in Martin's sample and the engine message has only one field of that complex type. I think this way goes against the idea that a datatype is a context-free classification of data.

I think the best way would be to allow a reference in a schema from one message to another. What do you think about that? Maybe I am wrong somewhere or missing something.

donmendelson commented 7 years ago

I agree with the idea of a reference to a common element, and that is the way the FIX Repository schema works. A common block of fields is defined by element <component> in that schema. (A shared repeating group is a subclass of component.) For example, the Instrument block contains fields for security ID, trading symbol, instrument type, and so forth. Approximately 20 different FIX messages use this component by reference. The component is edited in one place so there will be no copy and paste errors. If a new field is added to Instrument, it is available to all 20 messages. A component is imported into a message by using XML tag <componentRef>. The SBE implementation may explode the component into its fields at the time the message schema is processed.
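A minimal sketch of what "exploding" a component could look like at schema-processing time is shown below. Everything in it is hypothetical: the class `ComponentExpansionSketch`, the `@` prefix used to mark a reference, and the representation of a message as a list of field names are illustration-only assumptions, not how the FIX Repository or any SBE tool actually models this.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical schema pre-processing step: replaces each component
// reference with the fields of the referenced component.
final class ComponentExpansionSketch {

    // A member is either a plain field name or "@ComponentName" (assumed notation).
    static List<String> expand(List<String> members,
                               Map<String, List<String>> components) {
        List<String> fields = new ArrayList<>();
        for (String member : members) {
            if (member.startsWith("@")) {
                List<String> body = components.get(member.substring(1));
                if (body == null) {
                    throw new IllegalArgumentException("Unknown component: " + member);
                }
                // Components may reference other components, so expand recursively.
                // Note: this sketch does not detect circular references.
                fields.addAll(expand(body, components));
            } else {
                fields.add(member);
            }
        }
        return fields;
    }
}
```

With a component `Asset` defined once as `[isin, name]`, a message declared as `[quotation, @Asset]` would expand to `[quotation, isin, name]`, so the flat layout is unchanged while the definition lives in one place.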

QIvan commented 7 years ago

Thank you @donmendelson for this discussion. I hope <componentRef> will soon be in the SBE standard. Would you consider something like a <messageRef> tag for the situation I described above (when we need to send an engine as a single message and at the same time as part of a car)?

FIXTradingCommunity / fix-simple-binary-encoding

composite types #47