ponylang / ponyc

Pony is an open-source, actor-model, capabilities-secure, high performance programming language
http://www.ponylang.io
BSD 2-Clause "Simplified" License
5.7k stars 415 forks source link

UDPSocket.writev sends multiple packets #991

Open pdtwonotes opened 8 years ago

pdtwonotes commented 8 years ago

When using a WriteBuffer to construct a message with multiple fields, internally this is represented as an Array[ByteSeq]. That is, WriteBuffer is not actually concatenating the fields into a single byte stream. This Array[ByteSeq] is what gets returned by done().

If I then pass that to UDPSocket.writev, internally writev just iterates over the Array and calls the internal _write function for each chunk. This results in multiple UDP packets being sent. Not at all what I want.

Either there has to be a way to retreive the flattened Array[U8] from WriteBuffer so I can call UDPSocket.write, or UDPSocket.writev has to perform this function itself, so that one call to UDPSocket.write* results in the transmission of one UDP packet.

I am guessing the code was copied from the TCP modules, but UDP semantics are different from TCP semantics, and packet boundaries are significant.

SeanTAllen commented 8 years ago

Can you post your write buffer code?

SeanTAllen commented 8 years ago

As to writev on UDP, I would expect it to send multiple messages.

jemc commented 8 years ago

It looks like @writev (the system call) when called with a UDP socket actually does combine to one datagram:

The readv and writev functions can be used with any descriptor, not just sockets. Also, writev is an atomic operation. For a record-based protocol such as UDP, one call to writev generates a single UDP datagram.

Source

So it seems to me that we should try to match that behaviour if possible, since we are using the same name as the system call.

sylvanc commented 8 years ago

This is indeed a bug. The right thing to do is to actually use @writev in both UDPSocket and TCPConnection. On Windows, the IOCP interfaces we use support scatter-gather, so it's possible there as well.

pdtwonotes commented 8 years ago

Here is the code I am using.

type Proplist is Array[(String,String)]
  be send( data: (Proplist val | String val) ) =>
    """
    Send any message to the other modules.  All messages are encoded in JSON.
    We accept an already formatted string, or a Proplist that we convert.
    """
    // Get the body of the message as a String
    let body = match data
      | let pairs: Proplist val => _plist2JSON( pairs )
      | let s: String val => s
      else ""
      end

    if _tracing then
      Debug.out("Send: " + body)
    end

    // Build a message packet with a length and the string.
    let msglen = body.size().u16()
    Debug.out("Body size "+msglen.string() )
    _wbuffer.u16_be( msglen )
    _wbuffer.write( body )

    match _socket
      | let sock: UDPSocket =>
         match _sendIP
         | let addr: IPAddress =>
               sock.writev( _wbuffer.done(), addr )
         end
       end

I discovered that two packets were being sent, the first being two bytes, matching the u16_be, when the receiving program received them. I then verified this using the wireshark program.

SeanTAllen commented 8 years ago

I'll take care of this

SeanTAllen commented 8 years ago

So what I see as order of importance:

1: have UDP writev use @writev and update documentation so its clear that this will result in a single message (which might be surprising for someone using write buffer to buffer many messages) 2: switch TCP writev to use @writev 3: address windows- I'm not a windows person. Looking for guidance here. 4: possibly address the new UDP writev/write buffer issue where I might want to buffer many messages and send them. is this buffer and then call write yourself? something new? Looking for thoughts here.

pdtwonotes commented 8 years ago

I would attack it from the other end, adding a second fun like done() that returns the flattened value. This does not disrupt people who expect the old behavior from UDPSocket, and increases the general usefulness of WriteBuffer.

jemc commented 8 years ago

I would attack it from the other end, adding a second fun like done() that returns the flattened value. This does not disrupt people who expect the old behavior from UDPSocket, and increases the general usefulness of WriteBuffer.

I'd rather not encourage an "easy path" to bad performance (extra allocations and copying) when we can instead encourage use of writev and fix the underlying issues by making it use @writev directly (which was already on the roadmap before this issue popped up).

This does not disrupt people who expect the old behavior from UDPSocket

We're treating the fact that writev doesn't act like @writev as a principle-of-least-surprise bug. Anyone who is relying on the broken behaviour will have to update their code, but I kind of doubt anyone is, since I would have expected them to be surprised by this behaviour and file this issue just like you did - it's likely that you're the first to make real use of this feature. At any rate, we can't afford to keep this behaviour "surprising", so even if someone is depending on it, I'd still want it to be fixed.

possibly address the new UDP writev/write buffer issue where I might want to buffer many messages and send them. is this buffer and then call write yourself? something new? Looking for thoughts here.

Why not just iterate over the messages you want to send and call write (if each is a ByteSeq) or writev (if each is a ByteSeqIter) for each one?

for m in messages.values() do
  socket.write(m)
end

It seems like this is easy enough to do with the existing APIs provided, so I wouldn't bother adding anything new until someone comes forward with a compelling use case that isn't covered by this.

SeanTAllen commented 4 years ago

This is currently blocked.

anacrolix commented 4 years ago

What's this blocked on?

aturley commented 4 years ago

@SeanTAllen can you elaborate on what the blocker is here?

SeanTAllen commented 4 years ago

I don't know. I removed the blocked label that we removed and added the note.