OleksandrKvl / sbepp

C++ implementation of the FIX Simple Binary Encoding
https://oleksandrkvl.github.io/sbepp/
MIT License
39 stars 4 forks source link

Misleading operator* for the required fields #23

Closed ujos closed 4 months ago

ujos commented 4 months ago

What is the intent for the operator* in the sbepp::uint8_t and all other base types? From my perspective it is misleading. Every time when I do cout << *msg.field(), I think I did not check for null.

OleksandrKvl commented 4 months ago

Every time when I do cout << *msg.field(), I think I did not check for null.

Well, then do cout << msg.field().value();. Optional/required representation classes are wrappers and they should provide access to the underlying value. operator* is just a short way to access that value. Optional wrapper was strongly influenced by std::optional interface that provides value()/operator*() for the same purpose and required wrapper followed the same approach because value() is needed anyway and operator*() is just a nice shorthand for it.

ujos commented 4 months ago

For the required fields instead of operator* you can have operator T and value() function.

However, there is one case for the operator* in scope of the required fields. In case if you have two versions of the schema, and the message X v2 had got one more required field AAA, and you receive the X message v1, then required field AAA is absent.

OleksandrKvl commented 4 months ago

For the required fields instead of operator* you can have operator T and value() function.

Nope, operator T is a nightmare, I'm strongly against it, can't remember last time I used it. In theory, required wrappers can have only value() that returns mutable reference but I think it's not worth it. It's longer to type and will introduce some divergence between optional and required wrappers.

ujos commented 4 months ago

In this case your type wrapper should mimic the behaviour of the C++ base types.

ujos commented 4 months ago

Or return uintX_t directly and do not use your types.

OleksandrKvl commented 4 months ago

With naked types you lose the ability to get min/max values, in_range check and just general strong-type benefits (each SBE type is represented by a distinct C++ type).

ujos commented 4 months ago

Nope, operator T is a nightmare,

Why is that?

OleksandrKvl commented 4 months ago

I'm lazy to google examples for you. The one I can give you right now is when you have operator int, your class becomes implicitly comparable to any ints and other built-in types that are convertible to/from it. It basically breaks incapsulation.

The approach sbepp uses is strong types, wrappers that provide restricted interface to the underlying value without implicit conversions to make things compile-time safe. I don't want speed(int16) field be comparable with length(uint32) one.

Misleading operator* for the required fields

It's not misleading if you read the docs. Wrapper is just a container for a value, and operator* is perfectly fine (and short) to access it.

no asserts while reading the optional fields if they are empty/null. Library should follow the std::optional approach. If field is optional, then force user to check for NULL before you read the value.

Again, optionals in SBE are not the same as std::optional. The latter physically doesn't have a value to return when empty, the former always has a value, even if empty, and it's not an error to get it. Here, semantics is up to client, don't want empty values? Then write if-statement or use value_or. In my experience, there were cases when I need to work directly with underlying values so one of sbepp goals was to not impose any semantic restrictions on the client.

ujos commented 4 months ago

The approach sbepp uses is strong types, wrappers that provide restricted interface to the underlying value without implicit conversions to make things compile-time safe. I don't want speed(int16) field be comparable with length(uint32) one. ... Again, optionals in SBE are not the same as std::optional.

I believe that approach overcomplicates the SBE parser implementation and makes the users' experience not as smooth as it could be. I believe the more things you reuse from the standard C++, the easier is the entrance for the user. Users know what is std::optional and what is int32_t. Users know the std::nullopt variable. The C++17 (so C++20) has many good primitives.

From other side it may be painful to learn new types. Especially if they do not behave as expected and the users have to read the documentation. Believe me, I'm on their side now :)

As an input users have int/double/string. So as an output they expect is int/double/string too.

  int -> SBE -> int

In case if an output is something else, then you force them to learn the things they probably do not have to learn. Again, what user wants is to put int and get int. They do not need an intermediate value.

You said sbepp::uintX_t defines min, max and other things. Ask the question to yourself, does the user really need them? Those values are needed for the validation only. So, instead of min/max they would probably want to have a message1::validate() member function instead.

ujos commented 4 months ago

If I may you give a hint, as an user, following is my expectation for the encoder:

// Figure out what is the minimum buffer size needed to encode the message with 3 entries in the group
// In case of nested groups, the total amount of nested group items must be passed to the function
auto buffer = std::array<char, Message1Encoder::calcSize(3)>{};

auto msg = Message1Encoder{buffer};
msg.optional(123);
msg.madatory(123);

auto exchanges = msg.exchanges(3);
exchanges[0].name("CME");
exchanges[1].name("NYSE");
exchanges[2].name("CBOE");

socket.send(data(msg), size(msg));

And following is my expectation for the decoder:

void onMessage(Header hdr)
{
  switch(hdr.templateId())
  {
    case Message1::TemplateId:
    {
        auto msg = Message1{hdr};
        std::cout << (msg.mandatory() != 10 ? "OK" : "10 is not supported") << "\n";
        if(auto opt = msg.optional()) std::cout << *opt << "\n";
        for(auto&& exch: msg.exchanges()) {
            std::cout << exch.name() << "\n";
        }
    }
    break;
  }
}
OleksandrKvl commented 4 months ago

I believe that approach overcomplicates the SBE parser implementation and makes the users' experience not as smooth as it could be.

Again, the goal of this project is not to provide "smooth user experience" but to implement SBE standard as close as possible to C++. I personally believe that SBE should be only about raw data without min/max/null values, so what? I don't have a control over SBE specification so I just obey their rules (mostly). All your words about what users need should be addressed to SBE, not to its implementation. Btw, another goal of sbepp was not to restrict users with semantics whenever possible but to provide a basic interface that allows them to use any strategy they need. Want to throw on empty optional? Write a trivial function that does that. Don't need min/max? Just add * and work with naked values.

I believe the more things you reuse from the standard C++, the easier is the entrance for the user. Users know what is std::optional and what is int32_t. Users know the std::nullopt variable. The C++17 (so C++20) has many good primitives.

I already told you that std::optional is different, not mentioning that it has freedom to use exceptions. Yeah, users know int and other users also know that it causes tons of problems. That's why strong types were invented.

From other side it may be painful to learn new types. Especially if they do not behave as expected and the users have to read the documentation.

Learning something new can be different, it's just life. TBH, I don't care about people that don't read the documentation, you wanna shoot yourself in a foot, that's your choice.

Figure out what is the minimum buffer size needed to encode the message with 3 entries in the group. In case of nested groups, the total amount of nested group items must be passed to the function.

If you want some feature, create a separate issue, don't mix everything together. Btw, such function was initially considered and discarded due to number of reasons, that doesn't mean it can't be considered again with a more detailed investigation.

ujos commented 4 months ago

From other side it may be painful to learn new types. Especially if they do not behave as expected and the users have to read the documentation.

Learning something new can be different, it's just life. TBH, I don't care about people that don't read the documentation, you wanna shoot yourself in a foot, that's your choice.

I appreciate you did read all the details of the SBE specification and asked the questions to the authors of that specification. Just do not force your users to take that path. You do not learn ASM to write the code in C++. You do not learn the theory of how quick sort is implemented in order to call the std::sort() function.

Let me tell the simple thing, you would agree I believe. It is tremendous hard to present something complicated in the way so kids can understand it. It is hard to come with the simple interface for the thing which is not trivial. I heard the story that the vendor of the Saab car had crashed the business just because their cars were way too complicated to use. Even though they were really great. I think there are even more examples like that.

The C++ API interface (so that other interfaces) IS already a documentation. You come with the names for the functions/classes/variables. When one reads your code, they read the story. If user must read something else (even comments), I believe with all my heart, the author failed to do the proper interface. The interface must be designed in such way so it is easy to use it and is hard to misuse.

If you're succeed in making good interface, your users will never suffer and they will use your product with the pleasure. They will feel you are taking care of them.

I'd say even more. I worked on the CME MDP3/iLink project for a long time both on the exchange side and on the trader side. I used real-logic. It was terrible. So I've created my own parser like sbepp. I just cannot use it on the project I'm working now. So, after a 2 years I left that project and now I do not remember SBE protocol details. I'm making stupid mistakes. I know SBE but still I make mistakes.

You say, they have to learn. Do they? Do they need that knowledge to do their business? I learnt SBE and after 2 years I do not remember it just because that knowledge is somewhere very deep in my brain.