Closed ujos closed 3 months ago
For the future, it's better to provide XML for mentioned message because it's not always obvious. In this case, I guess your group is what I call "flat", meaning that it doesn't have any nested groups or data fields.
Here's the message:
<sbe:message name="msg1" id="1">
<field name="field1" id="1" type="uint32"/>
<group name="group1" id="30">
<field name="field1" id="1" type="uint32"/>
<group name="group2" id="20">
<field name="field1" id="1" type="uint32"/>
<group name="group3" id="20">
<field name="field1" id="1" type="uint32"/>
<data name="data" id="6" type="varDataEncoding"/>
</group>
<data name="data" id="6" type="varDataEncoding"/>
</group>
<data name="data" id="6" type="varDataEncoding"/>
</group>
<data name="data" id="6" type="varDataEncoding"/>
</sbe:message>
What should be the signature of the function? Also keep in mind that group/data size is not always known at compile-time so this function should also work at run-time.
In my practice it's just easier to use memory pool with slots large enough to hold any of your messages. For example, Euronext documentation specifies maximum size of their messages.
In my case the messages are really "flat" (as of today).
<type name="Exchange" primitiveType="char" length="10"/>
<sbe:message id="10" name="AddSubscription">
<field id="0" name="requestId" type="uint32"/>
<field id="1" name="instrument" type="Instrument"/>
<group id="2" name="exchanges">
<field id="4" name="name" type="Exchange"/>
</group>
</sbe:message>
In this case the expected function interface is:
struct AddSubscription {
static constexpr size_t getMinSize(size_t exchangesGroupSize);
};
Ok but I asked about signature not for your message, which is trivial, but for a more complicated one.
Ah... my vision is that function signature for the XML above should be following:
size_t getMinSize(
size_t group1TotalEntriesCount,
size_t group2TotalEntriesCount,
size_t group3TotalEntriesCount,
size_t group3DataTotalLength,
size_t group2DataTotalLength,
size_t group1DataTotalLength,
size_t dataLength);
Another option is to generate that function for the trivial cases only.
I think it can be simplified, instead of separate groupNDataTotalLength
we can have a single parameter totalDataLength
. This works because from groupNTotalEntriesCount
we know how many data
fields will be in memory, their total size is data1LengthTypeSize + ... + dataNLengthTypeSize + totalDataLength
, what do you think? Although I can't say I love this interface, it's quite easy to say in documentation that group sizes should appear in their XML order from top to bottom of the message.
instead of separate groupNDataTotalLength we can have a single parameter totalDataLength
Sounds reasonable.
Although I can't say I love this interface, it's quite easy to say in documentation that group sizes should appear in their XML order from top to bottom of the message.
I'm not sure I've got it. What do you mean?
BTW, what if instead of parameters we have a struct?
struct SomeGoodNameForTheStruct
{
size_t group1_total_entries_count;
size_t group2_total_entries_count;
size_t group3_total_entries_count;
size_t var_length_data_total_length;
};
In case if some group is removed or reordered, user gets a compile time error?
BTW, what if instead of parameters we have a struct?
There's no good place to define this struct. Its usage will be a pain, e.g. estimate_size_bytes(schema::messages::msgName<byte_type>::some_good_name{1,2,3})
. The only protection it can give is when something is removed from schema. When something is added, the best you can get is warning from compiler about uninitialized member (and not in all cases).
In case if some group is removed or reordered
Let's say that SBE doesn't allow to remove anything from schema and the only extension is supports nicely is adding field
s at the end of the block. Adding group
s and data
s works only for the top-level block and only at its end, meaning that if you have any data
, you can only add new data
, not group
. Reordering is not allowed at all.
The only "nice" way to have that protection is to tag each size value like estimate_size_bytes<msg_type>(sbepp::group_size<schema::messages::msg::group1>(1), sbepp::group_size<schema::messages::msg::group1::group2>(2), ...)
and it's very verbose. Too much for a functionality that's questionable on its own (because it works only if you pass correct sizes, i.e., relies on human factor).
Btw, what's your real-world case for it(the one from this issue seems a bit artificial)? The only one I can imagine is to pass max number of each group entries and data at compile-time and then use buffer of such size to create a message which is either: always the same, not always the same (in this case you need to have additional checks to verify that max values used for size estimation are correct).
When you want to do estimation at run-time, it's pretty questionable that implied overhead really worth it. In all these cases, why can't you just hardcode any large-enough buffer size and then run unit tests to always be sure that it's enough? I mean, this strategy of calculating buffer size each time doesn't look good, especially when you schema contains many messages. It's much more efficient to have a memory pool with slots large enough for all the messages instead of calculating and allocating a buffer of different size each time.
I think for the task I'm working on now I'll allocate 65KB on the stack. This storage must fit all the messages involved in communication.
In case if message is really big (there is one with the var-length array) and for some reason it does not fit the 65K, then as a fallback I will allocate the memory on the heap.
In order to implement this logic I need estimate_size_bytes()
function.
When you want to do estimation at run-time, it's pretty questionable that implied overhead really worth it.
Part of the math will be done by compiler at compile time.
OK, I'll try to implement it when I have time and let you know
@ujos you can check #43 for the current implementation. I'll keep it open for a couple of days waiting for your reply.
I need to finish one thing I'm working on now. It will take a week probably. Then I can upgrade to the latest version of the sbepp and test it. If you don't mind please keep it open for a week. I will do my best to check it ASAP.
OK, no problem
I've checked the code in the PR. It works great. I believe the ticket can be closed now. Thank you very much.
Could you implement the
constexpr
function to calculate the buffer size needed to encode the message.At this moment I calculate it manually like following:
It works, however I believe the
sbepp
library would benefit from having this function generated inside the parser header.While it is simple case with just one group, there are a couple of ideas from my side: