devhawala / dodo

Xerox Network Services (XNS) implemented in Java
Other
13 stars 1 forks source link

XIP packets merged in TCP segment #13

Open nbriggs opened 1 year ago

nbriggs commented 1 year ago

From the receiving side, it looks as though the NetHub send on the TCP connection may occasionally do a single write to the TCP socket with a buffer containing multiple XIP packets.

The Dodo code has NOT set the TCP_NOPUSH bit (since everything is coming through with the push bit set), so the expected behavior would be a separate TCP packet for each write() or send().

It's not a bug, but it does mean that the receiving code needs to be a little more complicated to cope with it (as you implemented in maiko's ether_nethub.c)

Packet 3 below contains two XIP packets in response to packets 1 and 2, packets 4 and 5 got separate responses in 6 and 7.

    1   0.000000    127.0.0.1 → 127.0.0.1    TCP 138 59935 → 3994 [PSH, ACK] Seq=1 Ack=1 Win=33792 Len=62 TSval=1901363668 TSecr=1900464294
    2   0.013109    127.0.0.1 → 127.0.0.1    TCP 138 59935 → 3994 [PSH, ACK] Seq=63 Ack=1 Win=33792 Len=62 TSval=1901363669 TSecr=1900464294
    3   0.183270    127.0.0.1 → 127.0.0.1    TCP 200 3994 → 59935 [PSH, ACK] Seq=127565 Ack=4294889683 Win=65440 Len=124 TSval=1901363686 TSecr=1900464294
    4  15.184678    127.0.0.1 → 127.0.0.1    TCP 138 59935 → 3994 [PSH, ACK] Seq=125 Ack=1 Win=33792 Len=62 TSval=1901365186 TSecr=1900464294
    5  15.184771    127.0.0.1 → 127.0.0.1    TCP 138 59935 → 3994 [PSH, ACK] Seq=187 Ack=1 Win=33792 Len=62 TSval=1901365186 TSecr=1900464294
    6  15.311808    127.0.0.1 → 127.0.0.1    TCP 138 3994 → 59935 [PSH, ACK] Seq=127689 Ack=4294889683 Win=65440 Len=62 TSval=1901365199 TSecr=1900464294
    7  15.321862    127.0.0.1 → 127.0.0.1    TCP 138 3994 → 59935 [PSH, ACK] Seq=127751 Ack=4294889683 Win=65440 Len=62 TSval=1901365200 TSecr=1900464294
devhawala commented 1 year ago

Thanks for the analysis.

But i did not find a (documented or not) way to set the TCP_NOPUSH usage in Java for sockets. This was not a problem for emulators in the Java or .Net worlds, as these use a more stream oriented approach for reading data, first reading 2 bytes to get the length and then reading an exact number of bytes, so "packaging" the data into packets at TCP level is hidden.

Maybe enabling TCP_NODELAY on the Java sockets in the NetHub will help to keep logical packets (XIP + 2 length bytes) separated at TCP level (this already solved a similar problem in another emulation project). The PSH marker would probably still appear, but the packets may stay separated and not be merged, i will try this one of these days.

But even then, the work-around in Maiko's ether_netehub.c should better stay, just to be sure...

nbriggs commented 1 year ago

TCP_NOPUSH is already set the way it should be (which is NOT set, the default mode, so that you DO get PUSH for every write) and given that, it appeared to be the case that something was coalescing the writes, since it should send a segment for each write() when TCP_NOPUSH is not set. Yeah, the number of negations is terrible!

But... since it's a TCP stream rather than UDP, the kernel should be happy to retain segments and let the code read the length and then read the data as you would do in Java/.NET.

nbriggs commented 1 year ago

There is no issue with doing it by

first reading 2 bytes to get the length and then reading an exact number of bytes, so "packaging" the data into packets at TCP level is hidden.

It works fine, I changed the code and tested it. I haven't committed it back to the maiko repository yet, there's still a bunch of cleanup to do.