Message writing order - Githubissues

benstadin commented 5 years ago

ws_session_base is using a std::vector based message queue defined as:

std::vector<message> mq_;

I understand the sequence as follows:

do_send() adds a message to the back of the queue
do_write() writes the back of the queue
on_write() swaps back and front items of the vector, so that the oldest item is written next and calls another do_write() if the queue is not empty

But what happens in case another send() is called in between (while writing async, before on_write is triggered)? Won't the just added item get removed when the vector is resized in on_write, and the current message send again?

I've changed the queue to use std::deque and change the sequence to this:

do_send() adds a message to the back of the queue
do_write() writes the front of the queue (previously wrote the back)
on_write() simply pops the front item

Is this ok?

In code:

std::deque<Message> _queue;

void doSend(Message m) {
    if(! beast::get_lowest_layer(derived().ws()).socket().is_open())
        return;
    _queue.emplace_back(std::move(m));
    if(_queue.size() == 1)
        doWrite();
}

void doWrite() {
    BOOST_ASSERT(! _queue.empty());
    derived().ws().async_write(_queue.front(),
                               beast::bind_front_handler(&WebsocketSession::onWrite,
                                                         sharedFromBase(),
                                                         _queue.size() - 1));

}

void onWrite(std::size_t idx,
             beast::error_code ec,
             std::size_t) {
    BOOST_ASSERT(! _queue.empty());
    if(ec)
        return fail(ec, "onWrite");

    _queue.pop_front();

    // write next
    if(! _queue.empty())
        doWrite();
}

vinniefalco commented 5 years ago

Won't the just added item get removed when the vector is resized in on_write, and the current message send again?

No, because the code checks for this condition and moves the message that was just sent to the back:

        if(idx != last)
            swap(mq_[idx], mq_[last]);
        mq_.resize(last);

This can cause messages to be delivered out of order sometimes, but order of delivery is not guaranteed or needed for this particular program. I would not use a deque, it has a significant memory cost which is wasteful if the size of the queue is small. The check for is_open is unnecessary, as is_open could transition to false even after the check is performed, with items in the queue.

Just wondering why all the renaming, and why make a separate function out of boost::shared_from(this) (e.g. sharedFromBase)?

benstadin commented 5 years ago

No, because the code checks for this condition and moves the message that was just sent to the back:

Won't the index calculated in do_write - which points to the back index - be invalidated in on_write, where the back index is swapped and now points to the previous item in that situation? Does it make sense to generate a an atomic sequence id when the message is constructed, and use that to find the item to be removed (would also improve ordering as a side effect, even though I understand it is not relevant for this application).

This can cause messages to be delivered out of order sometimes, but order of delivery is not guaranteed or needed for this particular program. I would not use a deque, it has a significant memory cost which is wasteful if the size of the queue is small.

Thanks for the clarification, makes sense. Arguably a chat application should see chat messages in the correct order, but probably based on client side ordering using a timestamp.

In my application message ordering is preferred, but also not guaranteed (sending messages where only the last message with most recent timestamp is relevant, other messages with older timestamps will be simply dismissed).

The check for is_open is unnecessary, as is_open could transition to false even after the check is performed, with items in the queue.

I took the code from the example: https://github.com/vinniefalco/BeastLounge/blob/2b6d5b271e4bf55530d1a16478915f53a855091a/server/ws_user.cpp#L205

Just wondering why all the renaming, and why make a separate function out of boost::shared_from(this) (e.g. sharedFromBase)?

Just to adapt to code style used in my application. All my classes have capital letters and methods are camel case. I do not use much of boost in my existing application and for example I want to avoid to not mix boost boost shared_ptr with std_shared_ptr. When using std::shared_ptr I need for example the following method in the CRTP web socket base class:

        std::shared_ptr<Derived> sharedFromBase() {
            return std::static_pointer_cast<Derived>(derived().getPtr());
        }

vinniefalco commented 5 years ago

Won't the index calculated in do_write - which points to the back index - be invalidated in on_write

The index passed to on_write represents the index of the message that has completed sending. We swap that message with the last message in the queue if it is not already the last one, in order to remove it from the vector without incurring a memmove. This is such a minor optimization and it complicates the code, so really I should probably take it out and just live with mq_.erase(mq_.begin()) since this code is meant to be a tutorial. memmoves are very fast anyway compared to network I/O and the number of messages in the queue will be small (for this program).

I do not use much of boost in my existing application and for example I want to avoid to not mix boost boost shared_ptr with std_shared_ptr.

boost::shared_ptr is striclty better then std::shared_ptr. It has more features, and if your project only requires C++11 (as BeastLounge does) then boost::shared_ptr provides the missing C++17 functionality of std::shared_ptr. Personally, I feel that sharedFromBase() is less readable than boost::shared_from(this), because someone has to look up sharedFromBase to understand what it does whereas you can safely assume that readers will know what boost::shared_from(this) does since it is a standard library component.

boost::enable_shared_from is especially better than std::enable_shared_from_this because it is an ordinary class instead of a template. The consequence is that your program will compile faster, and the resulting executable will be smaller because there will be fewer template instantiations. And you do not need the static pointer cast. See: https://www.boost.org/doc/libs/1_71_0/libs/smart_ptr/doc/html/smart_ptr.html#enable_shared_from

benstadin commented 5 years ago

I think I understand the queueing, but I probably still misunderstand the internal execution model. What I do not understand is why there is another call to do_write in on_write. Are the calls to async_write not queued internally?

in on_write:

        if(! mq_.empty())
            do_write();

So async_write does not do anything when already writing, and we have to schedule another async_write ourself? Also meaning we finish one message after another, and there can't be two async_writes happen concurrently? Then all makes sense. I interpreted it as a retry handler.

If not... When there are two messages on the queue and a third is „inflight“, we’d have a queue like this:

The pairs in the second row should read: “Message 1 was added to the queue at index 0“, just what do_write does. And this index 0 will later be seen in on_write when the message was sent.

Now the third message is added to the back of the queue, before message 1 and 2 finished sending (maybe a not uncommon case because of different sized messages):

Everything looks fine. But now message 2 finished writing and fires on_write. Message msg2 is removed (whether we swap or not makes no difference in this case), leaving msg3 at index 1 (but as I understand it will trigger idx 2 when on_write is called):

Now we get another message msg 4:

Now msg3 is done, but it's not removed from the queue. Instead msg4 is removed, because on_write for msg3 claimed to be at idx 2.

vinniefalco commented 5 years ago

Are the calls to async_write not queued internally?

They are not queued, otherwise Beast would be taking the choice of how to implement the queue away from the user. Thus queuing is the responsibility of the calling program.

Instead msg4 is removed, because on_write for msg3 claimed to be at idx 2.

No that's not correct. Since there is only ever one write happening at a time, there is only completion handler that ever has a bound index. And since the swap can only ever happen after the current message is finished sending, it is not possible for the index to be wrong.

vinniefalco / BeastLounge

Message writing order #19