redboltz / mqtt_cpp

Boost Software License 1.0
432 stars 107 forks source link

Property cursor #457

Open redboltz opened 4 years ago

redboltz commented 4 years ago

It is from https://github.com/redboltz/mqtt_cpp/issues/297#issuecomment-527242751.

@jonesmz wrote:

Now that mqtt_cpp is using MQTT_NS::buffer to store all of the items in the properties, maybe we can revisit this discussion.

What do you think about doing just-in-time parsing?

When the packet is received, we can store the data for the properties in an MQTT_NS::buffer the same way we currently do.

That MQTT_NS::buffer can be owned by an MQTT_NS::property_cursor object (In this situation, when I say cursor i mean the word in the same sense that some database engines use it. As a handle to data that will be fetched and/or parsed on demand, but until the parsing is triggered, the data is in an unspecified state).

The MQTT_NS::property_cursor object provides functions like:

struct property_cursor
{
    std::vector<optional<v5::property_varient>> parse_all_properties() { ... };
    optional<v5::property_varient> parse_next() { ... };
    struct iterator
    {
        // Some implementation that parses the property data only when the value is accessed.
    };
    iterator begin();
    iterator end();
    buffer data_;
}

This allows us to completely avoid needing to allocate storage space for the std::vector object.

For the test_broker, it also means that the broker never needs to parse the properties for publish messages, and can just send them to the subscribers directly. No deserialization / parsing, but also no serialization. Just moving blobs of data around (Much more efficient!!)

It also means that code that uses mqtt_cpp can decide if they want to parse the properties or not, on their own.

For example: I have several MQTTv5 messages that do transmit properties (because all of my message send the same list of properties, always), but the code that receives the properties never use them. Right now, mqtt_cpp is parsing the properties, and allocating a std::vector, even though I don't want it to, and don't use the result.

It also means that I would be able to do something like this (note, c++17 syntax, but can achieve the same result in C++14, too):

v5::property_cursor cursor(data from boost::asio);
std::optional<v5::correlation_data> corrData;
for(auto property : cursor)
{
    corrData = std::visit[](auto & prop) -> std::optional<v5::correlation_data>
    {
        if constexpr(std::is_same_v<decltype(prop), v5::correlation_data>)
        { return prop; }
        else
        { return std::nullopt(); }
    }
    if(corrData)
    {
        // We're only looking for one specific property. Exit early.
        break;
    }
}

In the above code, parsing only happens when the property is evaluated, or the iterator is dereferenced (and returns std::nullopt, if it can't be parsed, to indicate failure).

If I'm only interested in a specific property, I don't need to pay the runtime cost of 1) Allocating storage for std::vector, or 2) Parsing all properties. I only have to pay for parsing the properties until I find the one I want.


And, of course, it also means that the debug function signatures is much much smaller.

redboltz commented 4 years ago

I will take a look this. Please wait a moment.

redboltz commented 4 years ago

I think that it is nice to have. Avoiding parse conditionally is valuable.

I'm not sure how this mechanism work with buffer life time management.

Packet format is like as follows:

+-----------+---------------+---+--------------+--------+---
|FixedHeader|RemainingLength|...|PropertyLength|Props...|...
+-----------+---------------+---+--------------+--------+---

If RemainingLength is smaller than packet_bulk_readlimit, then buffer is allocated once. And properties are refer to the buffer. Each property might have buffers, it is sharing the lifetime of the buffer.

If RemainingLength is NOT smaller than packet_bulk_readlimit, PropertyLength is compared to props_bulk_readlimit. If PropertyLength is smaller than props_bulk_readlimit, then buffer is allocated once. If PropertyLength is NOT smaller than props_bulk_readlimit, then use buf_ for property_id. When property_id is read, then read propertybody using buf or newly allocated buffer . It depends on property_id. The mechanism should work well with this process.

redboltz commented 4 years ago

I think that there are three cases:

Case Condition Property Cursor Friendly
1 Read whole packet all at once yes
2 Read all properties all at once yes
3 Read individual property step by step no

I think that most of clients need to parse all properties but broker sometimes doesn't need to parse (e.g. publish message delivery).

For broker, we can introduce force avoid case 3 option for endpoint.

class endpoint {
    void set_property_read_all_at_once(bool b = true);

If the flag is set, then enforce reading properties all at once even if PropertyLength is very big.

It is just an idea.

jonesmz commented 4 years ago

If RemainingLength is smaller than packet_bulk_readlimit, then buffer is allocated once. And properties are refer to the buffer. Each property might have buffers, it is sharing the lifetime of the buffer.

If RemainingLength is NOT smaller than packet_bulk_readlimit, PropertyLength is compared to props_bulk_readlimit. If PropertyLength is smaller than props_bulk_readlimit, then buffer is allocated once. If PropertyLength is NOT smaller than props_bulk_readlimit, then use buf_ for property_id. When property_id is read, then read propertybody using buf or newly allocated buffer . It depends on property_id. The mechanism should work well with this process.

The case of PropertyLength > props_bulk_readlimit is complicated.

This situation requires that we allocate multiple mqtt::buffer objects, which partially defeats the purpose of using the concept of a PropertyCursor.

It's still an improvement because 1) No need to parse / deserialize unless requested by user. 2) smaller stack traces, which is good for debug experiences.

Case Condition Property Cursor Friendly
1 Read whole packet all at once yes
2 Read all properties all at once yes
3 Read individual property step by step no

Another data structure that we can consider is a memory list.

Here's a simplified version of it.

template<typename Alloc>
struct memory_list
{
    using list_t = std::list<mqtt::buffer, Alloc>;
    size_t read_at(size_t idx, size_t count, std::uint8_t * pOutput);
    size_t write_at(size_t idx, size_t count, std::uint8_t * pOutput);

    struct read_cursor
    {
        list_t::iterator currentBuffer;
        size_t currBufferIdx;
        mqtt::buffer::iterator begin();
        mqtt::buffer::iterator end();
    };
    read_cursor
    std::list<mqtt::buffer, Alloc> storage_;
}

With a memory pool of mqtt:buffer objects that stores the mqtt::buffers that are no longer in use by user code, this implementation needs to allocate very rarely, and allows for mqtt_cpp::endpoint to do case 3 every time without any performance concerns.

redboltz commented 4 years ago

I'm not sure where is the goal of your idea. But I want to keep the mechanism v5::property_variant. I think that your idea is NOT removing it.

1) No need to parse / deserialize unless requested by user.

It has an advantage for implementing broker but I guess that most of clients want to parsed properties.

Another point, if properties don't contain the property that have string/binary, then endpoint doesn't need to any memory allocation for property buffer. endpoint (re)uses pre-allocated buf_. *1. Of course, std::vector<v5::property_variant> allocation is required.

I'm not sure your approach keeps *1 advantage ? Or even if the property doesn't contain string/binary, then allocate buffer? If it does, I think that it is not good. I think that I don't understand your idea well yet.

2) smaller stack traces, which is good for debug experiences.

If you are using gdb, then something like https://github.com/ruediger/Boost-Pretty-Printer could help. The current C++ is going to use std::variant more because it is standardized. If std::variant has many entries, the problem is happened. I think that tool help is natural approach. I want to discuss two goals separately. Let's focus on the performance improvement (conditional property parse) here.

redboltz commented 4 years ago

I'd like to know the impact of existing APIs. e.g. publish() family, handlers.

jonesmz commented 4 years ago

Apologies for short answer with bad formatting. Am typing on phone.

The property cursor can have an MQTT::variant.

If case 1 then v5::properties (vector of variant) If case 2 then MQTT::buffer If case 3 then vector

If the entire packet is so small that it fits into buf_ then parsing the data will be very very low cost, anyway.

jonesmz commented 4 years ago

Functions called by end user to send message with properties would take a variant. Instead of v5::properties, they would take variant<v5::properties, v5::property_cursor>

But if v5::property_cursor can be implicitly constructed from v5::properties, then instead it can just take v5::property_cursor. End user won't see a difference.

Handlers would need to change to accept a v5::property_cursor instead of v5::properties.

redboltz commented 4 years ago

Indeed, I understand your approach. Keeping the interface of existing APIs is nice. And breaking change of handlers is acceptable.

If the entire packet is so small that it fits into buf_ then parsing the data will be very very low cost, anyway.

I don't understand this part yet. What happens the case as follows:

The current implementation is allocating one buffer for A and (re)use internal buf (10bytes) for B. I'm afraid that allocating buffers for B instead of using buf. buf is reused again and again. Because if each B is parsed, then the buf is no longer needed. How to keep this mechanism?