zeromq / netmq

A 100% native C# implementation of ZeroMQ for .NET
Other
2.97k stars 746 forks source link

Lazy receive sequence is not re-entrant #272

Closed drewnoakes closed 9 years ago

drewnoakes commented 9 years ago

The lazy receive functions described in #64 and implemented in #65 have a potentially surprising pitfall.

Because they yield a sequence of results directly "from the wire", the sequence can only be enumerated once. This is, generally, not true of IEnumerable<T> types throughout the BCL.

Here is our ReceiveMessages method:

public static IEnumerable<byte[]> ReceiveMessages(this IReceivingSocket socket)
{
    bool hasMore = true;

    while (hasMore)
        yield return socket.Receive(false, out hasMore);
}

However user code like this doesn't actually work:

var frames = sock.ReceiveMessages();
if (frames.Any())
{
    var header = frames.First(); // WRONG
}

In this case, Any burns the first item in the sequence, and First actually gives the second frame (or throws if there was only one frame in the message).

What benefit is there of dealing with a sequence? As ZeroMQ only deals in full messages, aren't frames already buffered until a whole message is available on a given socket?

Generally I'm a fan of making things lazy, however when the consequence of multiple enumeration is not simply a performance hit but in fact completely incorrect results, I think laziness is not a good fit.

I propose modifying ReceivingSocketExtensions to always return materialised collections, and do so using IList<byte[]>.

Thoughts?

JamesWHurst commented 9 years ago

I totally agree, and frankly am grateful that you spotted this and raised this issue. It is serious. I submit that we do not want to have a fundamentally-incorrect API like that. I like your solution.

somdoron commented 9 years ago

I think we can mark the method as obsolete, not sure we even need an alternative. There is better way to do that, like ReceiveMessage that returns NetMQMessage On Mar 1, 2015 3:43 AM, "Drew Noakes" notifications@github.com wrote:

The lazy receive functions described in #64 https://github.com/zeromq/netmq/issues/64 and implemented in #65 https://github.com/zeromq/netmq/pull/65 have a potentially surprising pitfall.

Because they yield a sequence of results directly "from the wire", the sequence can only be enumerated once. This is, generally, not true of IEnumerable types throughout the BCL.

Here is our ReceiveMessages method:

public static IEnumerable<byte[]> ReceiveMessages(this IReceivingSocket socket) { bool hasMore = true;

while (hasMore)
    yield return socket.Receive(false, out hasMore);

}

However user code like this doesn't actually work:

var frames = sock.ReceiveMessages(); if (frames.Any()) { var header = frames.First(); // WRONG }

In this case, Any burns the first item in the sequence, and First actually gives the second frame (or throws if there was only one frame in the message).

What benefit is there of dealing with a sequence? As ZeroMQ only deals in full messages, aren't frames already buffered until a whole message is available on a given socket?

Generally I'm a fan of making things lazy, however when the consequence of multiple enumeration is not simply a performance hit but in fact completely incorrect results, I think laziness is not a good fit.

I propose modifying ReceivingSocketExtensions to always return materialised collections, and do so using IList<byte[]>.

Thoughts?

— Reply to this email directly or view it on GitHub https://github.com/zeromq/netmq/issues/272.

drewnoakes commented 9 years ago

@somdoron actually yes, NetMQMessage looks like a better fit in many cases. It allocates more objects (one for the message, and one per frame).

In the PR there's a non-breaking fix that doesn't obsolete anything.

I think we can improve the usability of APIs for receiving from sockets in general. I'll open a new issue with some ideas for discussion.