real-logic / simple-binary-encoding

Simple Binary Encoding (SBE) - High Performance Message Codec
Apache License 2.0

[C++] Can't encode variable length string as per the samples #818

Closed: jviotti closed this issue 3 years ago

jviotti commented 3 years ago

I have the following schema that describes a user as per the GitHub API:

<?xml version="1.0" encoding="UTF-8"?>
<sbe:messageSchema xmlns:sbe="http://fixprotocol.io/2016/sbe"
                   xmlns:xi="http://www.w3.org/2001/XInclude"
                   package="baseline"
                   id="1"
                   version="0"
                   semanticVersion="5.2"
                   description="Simple Number"
                   byteOrder="littleEndian">
    <types>
        <composite name="messageHeader" description="Message identifiers and length of message root.">
            <type name="blockLength" primitiveType="uint16"/>
            <type name="templateId" primitiveType="uint16"/>
            <type name="schemaId" primitiveType="uint16"/>
            <type name="version" primitiveType="uint16"/>
        </composite>

        <composite name="varAsciiEncoding" description="Variable length ASCII String.">
            <type name="length" primitiveType="uint32" maxValue="1073741824"/>
            <type name="varData" primitiveType="uint8" length="0" characterEncoding="ASCII"/>
        </composite>

        <enum name="BooleanType" encodingType="uint8" description="Boolean Type.">
            <validValue name="F" description="False value representation.">0</validValue>
            <validValue name="T" description="True value representation.">1</validValue>
        </enum>
    </types>
    <sbe:message name="GitHubUser" id="1" description="GitHub User">
        <field name="login" id="1" type="varAsciiEncoding"/>
        <field name="id" id="2" type="uint32"/>
        <field name="node_id" id="3" type="varAsciiEncoding"/>
        <field name="avatar_url" id="4" type="varAsciiEncoding"/>
        <field name="gravatar_id" id="5" type="varAsciiEncoding"/>
        <field name="url" id="6" type="varAsciiEncoding"/>
        <field name="html_url" id="7" type="varAsciiEncoding"/>
        <field name="followers_url" id="8" type="varAsciiEncoding"/>
        <field name="following_url" id="9" type="varAsciiEncoding"/>
        <field name="gists_url" id="10" type="varAsciiEncoding"/>
        <field name="starred_url" id="11" type="varAsciiEncoding"/>
        <field name="subscriptions_url" id="12" type="varAsciiEncoding"/>
        <field name="organizations_url" id="13" type="varAsciiEncoding"/>
        <field name="repos_url" id="14" type="varAsciiEncoding"/>
        <field name="events_url" id="15" type="varAsciiEncoding"/>
        <field name="received_events_url" id="16" type="varAsciiEncoding"/>
        <field name="type" id="17" type="varAsciiEncoding"/>
        <field name="site_admin" id="18" type="BooleanType"/>
    </sbe:message>
</sbe:messageSchema>

Notice that some of the fields use the varAsciiEncoding type as exemplified in the sbe-samples directory: https://github.com/real-logic/simple-binary-encoding/blob/master/sbe-samples/src/main/resources/common-types.xml.

I'm generating C++ code out of the schema like this:

java -Dsbe.target.language=CPP -Dsbe.output.dir=out -jar sbe-all/build/libs/sbe-all-1.20.3-SNAPSHOT.jar schema.xml

According to the samples (https://github.com/real-logic/simple-binary-encoding/blob/master/sbe-samples/src/main/cpp/GeneratedStubExample.cpp), I need to encode the ASCII string members using .putXXX() methods.

I'm expecting something like this to work, based on the docs & samples:

GitHubUser user;
...
user.wrapForEncode(buffer, offset, bufferLength)
  .putLogin(document["login"], document["login"].size());

However, .putLogin() doesn't seem to exist, and in fact no put-prefixed methods seem to be generated at all:

sbe/main.cpp:48:8: error: no member named 'putLogin' in
      'baseline::GitHubUser'
      .putLogin(document["login"], document["login"].size());
       ^
1 error generated.

These are all the case-insensitive occurrences of login in the generated GitHubUser.h file:

$ ack login
GitHubUser.h
285:    SBE_NODISCARD static const char *loginMetaAttribute(const MetaAttribute metaAttribute) SBE_NOEXCEPT
294:    static SBE_CONSTEXPR std::uint16_t loginId() SBE_NOEXCEPT
299:    SBE_NODISCARD static SBE_CONSTEXPR std::uint64_t loginSinceVersion() SBE_NOEXCEPT
304:    SBE_NODISCARD bool loginInActingVersion() SBE_NOEXCEPT
310:        return m_actingVersion >= loginSinceVersion();
316:    SBE_NODISCARD static SBE_CONSTEXPR std::size_t loginEncodingOffset() SBE_NOEXCEPT
322:    VarAsciiEncoding m_login;
325:    SBE_NODISCARD VarAsciiEncoding &login()
327:        m_login.wrap(m_buffer, m_offset + 0, m_actingVersion, m_bufferLength);
328:        return m_login;
1158:    builder << R"("login": )";
1159:    builder << writer.login();

The one at line 325 seems to be just a getter:

SBE_NODISCARD VarAsciiEncoding &login()
{
    m_login.wrap(m_buffer, m_offset + 0, m_actingVersion, m_bufferLength);
    return m_login;
}

Am I missing something?

mjpt777 commented 3 years ago

Variable-length strings/data are not fields; according to the SBE specification they are data. They must also respect the order of fields, then groups, then data. Look at the Car example. To better understand SBE, it is best to start with the specification.

https://github.com/FIXTradingCommunity/fix-simple-binary-encoding/tree/master/v1-0-STANDARD/doc
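
In schema terms, this would mean declaring the string members with <data> elements instead of <field> elements, and placing them after all fixed-size fields (here id and site_admin), as the Car example does. With that change, the generator should be expected to emit the put-prefixed methods shown in the samples. To illustrate why the order matters, the following is a minimal hand-written sketch of the resulting layout, not SBE-generated code: a fixed-size block followed by variable-length items, each prefixed by a little-endian uint32 length as in the varAsciiEncoding composite above. The message header is omitted, only two of the string members are shown, and the helper names (putUint32LE, getUint32LE, putVarAscii, getVarAscii, encodeUserSketch) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hand-written illustration of the layout only; NOT SBE-generated code.
// The message header is omitted. Integers are little-endian, as in the schema.

// Append an unsigned 32-bit integer in little-endian byte order.
inline void putUint32LE(std::vector<std::uint8_t> &out, std::uint32_t v)
{
    for (int shift = 0; shift < 32; shift += 8)
    {
        out.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFFu));
    }
}

// Read an unsigned 32-bit little-endian integer starting at `offset`.
inline std::uint32_t getUint32LE(const std::vector<std::uint8_t> &in, std::size_t offset)
{
    assert(offset + 4 <= in.size());
    std::uint32_t v = 0;
    for (std::size_t i = 0; i < 4; ++i)
    {
        v |= static_cast<std::uint32_t>(in[offset + i]) << (8 * i);
    }
    return v;
}

// Append one variable-length item: uint32 length prefix, then the raw bytes.
inline void putVarAscii(std::vector<std::uint8_t> &out, const std::string &s)
{
    putUint32LE(out, static_cast<std::uint32_t>(s.size()));
    out.insert(out.end(), s.begin(), s.end());
}

// Read one variable-length item at `offset` and advance `offset` past it.
inline std::string getVarAscii(const std::vector<std::uint8_t> &in, std::size_t &offset)
{
    const std::uint32_t length = getUint32LE(in, offset);
    offset += 4;
    assert(offset + length <= in.size());
    std::string s(reinterpret_cast<const char *>(in.data()) + offset, length);
    offset += length;
    return s;
}

// Hypothetical helper: the fixed-size fields (id, site_admin) form the root
// block, and the variable-length data (login, type) follows it in order.
inline std::vector<std::uint8_t> encodeUserSketch(
    std::uint32_t id, bool siteAdmin, const std::string &login, const std::string &type)
{
    std::vector<std::uint8_t> out;
    putUint32LE(out, id);                                        // field: id (uint32)
    out.push_back(static_cast<std::uint8_t>(siteAdmin ? 1 : 0)); // field: site_admin (uint8)
    putVarAscii(out, login);                                     // data: login
    putVarAscii(out, type);                                      // data: type
    return out;
}
```

Because the fixed-size block always occupies the same 5 bytes in this sketch, a reader can locate id and site_admin at constant offsets, while the variable-length items can only be reached by walking them in sequence. For example, encoding the illustrative values login "octocat" and type "User" gives 5 + (4 + 7) + (4 + 4) = 24 bytes.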

jviotti commented 3 years ago

Ah, I see. It works now. Thanks for the super fast response!