Regression on flow termination event from 5.02.c to 5.1.1

jordanauge commented 4 years ago

Hi,

It seems the flow termination event is not properly raised at the client with v5.1.1 (from master), while it works with v5.02.c (from tag).

With v5.02.c

Server

$ mgen event "listen tcp 5200"
[...]
mgen: version 5.02c
[...]
09:16:59.967811 LISTEN proto>TCP port>5200
09:17:04.045735 ACCEPT src>127.0.0.1/5100 dstPort>5200
09:17:04.046713 RECV proto>TCP flow>1 seq>0 src>127.0.0.1/5100 dst>127.0.0.1/5200 sent>09:17:04.045711 size>65535 gps>INVALID,999.000000,999.000000,4294966297 flags>0x01
[...]
09:17:05.046231 OFF src>127.0.0.1/5100 dstPort>5200

Client

$ mgen event "ON 1 TCP DST 127.0.0.1/5200 SRC 5100 PERIODIC [1 1000000.0] COUNT 1"
[...]
mgen: version 5.02c
[...]
08:58:09.981302 START Mgen Version 5.02c
enter ProtoSocket::Connect() ...
08:58:09.981461 ON flow>1 srcPort>5100 dst>127.0.0.1/5200 
08:58:09.981478 CONNECT flow>1 srcPort>5100 dst>127.0.0.1/5200 
08:58:10.981676 SHUTDOWN flow>1 srcPort>5100 dst>127.0.0.1/5200 
08:58:10.981922 OFF flow>1 srcPort>5100 dst>127.0.0.1/5200 
08:58:10.982008 STOP

With v5.1.1 (master)

Server

$ mgen event "listen tcp 5200"
[...]
mgen: version 5.1.1
[...]
09:14:47.577689 START Mgen Version 5.1.1
09:14:47.577975 LISTEN proto>TCP port>5200
09:14:51.282399 ACCEPT src>127.0.0.1/5100 dstPort>5200
09:14:51.283418 RECV proto>TCP flow>1 seq>0 src>127.0.0.1/5100 dst>127.0.0.1/5200 sent>09:14:51.282379 size>65535 gps>INVALID,999.000000,999.000000,4294966297 flags>0x01 
[...]

(last message corresponds to the final chunk)

Client

$ mgen event "ON 1 TCP DST 127.0.0.1/5200 SRC 5100 PERIODIC [1 1000000.0] COUNT 1"
[...]
mgen: version 5.1.1
[...]
09:42:57.449319 START Mgen Version 5.1.1
enter ProtoSocket::Connect() ...
09:42:57.449463 ON flow>1 srcPort>5100 dst>127.0.0.1/5200 
09:42:57.449483 CONNECT flow>1 srcPort>5100 dst>127.0.0.1/5200

(hangs here)

jordanauge commented 4 years ago

It seems StopFlow() has been removed from OnTxTimeout(). Is that intentional? Does it also mean that the timestamp of the end of the flow is affected by the message interarrival rate?

ljt-git commented 4 years ago

On 10/26/2017 the count behavior was changed in "r765: new SEED global command and COUNT rework - now flows will ONLY be turned off when an OFF event is received. Tcp sockets will NOT disconnect after count packets are sent" .

IIRC this was because we are doing more "remote control" of mgen, where "mgen actors" tell an mgen instance when to "act".

For example, a client managed by an mgen instance may establish a connection to an mgen server and send 3 packets. Later on, a "remote command" may tell this instance to send 5 more packets:

./mgen instance mgen "ON 1 TCP SRC 35219 DST 192.168.1.5/5000 periodic [1 1024] count 3"

./mgen instance mgen "MOD 1 COUNT 5"

Similarly, multiple "flows" can share the same TCP connection, so just because we shut down one flow after count packets are sent, we don't necessarily want to tear down the whole TCP connection.

You may try explicitly using an OFF event. Alternatively, if you need some indication of when COUNT packets have been sent by a specific flow, we could consider adding a log event for this.

jordanauge commented 4 years ago

We are using mgen to generate dynamic arrivals of finite-size flows. mgen seems to be one of the few tools able to handle such a traffic pattern. Although it does not support it directly, it is possible to generate an mgen script after randomly drawing the flow sizes and interarrival times (e.g. from an exponential distribution). Each flow then has a fixed amount of data to transfer, and is only regulated by TCP.
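As a rough illustration of the script generation described above, here is a minimal Python sketch. The function name `make_mgen_script`, the exponential flow-size distribution, and the default destination are assumptions made for the example; the `ON ... PERIODIC [1 <size>] COUNT 1` form mirrors the commands used earlier in this thread, prefixed with an event time as in an mgen script. The sketch does not check mgen's message size limits.

```python
import random


def make_mgen_script(n_flows, arrival_rate, mean_size,
                     dst="127.0.0.1/5200", seed=0):
    """Return mgen script lines for n_flows finite-size TCP flows.

    Interarrival times are exponential with rate arrival_rate (flows/s).
    Flow sizes are exponential with mean mean_size (bytes); this choice
    is an assumption for illustration only.
    """
    rng = random.Random(seed)  # seeded so a run can be reproduced
    t = 0.0
    lines = []
    for flow_id in range(1, n_flows + 1):
        t += rng.expovariate(arrival_rate)  # next arrival time
        size = max(1, int(rng.expovariate(1.0 / mean_size)))
        # One message of `size` bytes per flow, as in the commands above.
        lines.append(
            f"{t:.6f} ON {flow_id} TCP DST {dst} "
            f"PERIODIC [1 {size}] COUNT 1"
        )
    return lines


if __name__ == "__main__":
    print("\n".join(make_mgen_script(5, 2.0, 1_000_000)))
```

The output could be written to a file and passed to mgen as a script. Whether each flow then terminates on its own depends on the COUNT behavior discussed in this issue.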

In this context, an important metric is the average flow completion time (FCT) (see for instance [1] and its reference [9]), which depends on the traffic mix (how many flows are in progress at each instant, how they share bandwidth, etc.). It is thus not possible to deterministically specify an OFF event and we need to know when the flow has effectively finished sending the specified amount of traffic.

A good way to compute the FCT from the sender is to leverage the connection shutdown process (based on FIN packet exchange) which is correctly done in mgen. Without this, it is hard to track when the transfer has effectively completed (due to socket buffers etc.). I am not sure how such an event would be generated otherwise but that could be part of the solution.
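For reference, this is roughly how we derive the FCT from the client log once the OFF event is logged. It is a minimal Python sketch: the function name `flow_completion_times` is hypothetical, the line format is the one shown in the v5.02c client log above, and treating ON-to-OFF as the completion time (rather than, say, CONNECT-to-SHUTDOWN) is an assumption. It also assumes the run does not cross midnight, since the log only carries time of day.

```python
import re
from datetime import datetime

# Matches client log lines such as:
#   08:58:09.981461 ON flow>1 srcPort>5100 dst>127.0.0.1/5200
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{6}) (ON|OFF) flow>(\d+)\b")


def flow_completion_times(log_lines):
    """Map flow id -> seconds between its ON and OFF log events."""
    start, fct = {}, {}
    for line in log_lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip START, CONNECT, SHUTDOWN, STOP, etc.
        ts = datetime.strptime(m.group(1), "%H:%M:%S.%f")
        flow = int(m.group(3))
        if m.group(2) == "ON":
            start[flow] = ts
        elif flow in start:
            fct[flow] = (ts - start[flow]).total_seconds()
    return fct


# Client log excerpt copied from the v5.02c run above.
log = """08:58:09.981461 ON flow>1 srcPort>5100 dst>127.0.0.1/5200
08:58:09.981478 CONNECT flow>1 srcPort>5100 dst>127.0.0.1/5200
08:58:10.981676 SHUTDOWN flow>1 srcPort>5100 dst>127.0.0.1/5200
08:58:10.981922 OFF flow>1 srcPort>5100 dst>127.0.0.1/5200""".splitlines()

print(flow_completion_times(log))
```

With the v5.1.1 client log, flow 1 never gets an OFF line, so it would simply be missing from the result.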

A remaining issue would then be when to issue the OFF event in order to free up resources after the flow has completed. We tend to generate a large number of flows (because of high link loads, or because we need long experiments to get significant results). Keeping them all open would cause trouble, and I see no easy way to determine in advance when a flow can be reused.

A simple solution would be to explicitly tell mgen whether we want to keep the connection open for further control, or shut it down after it has sent all specified messages (the former COUNT semantics).

[1] http://yuba.stanford.edu/techreports/TR05-HPNG-112102.pdf

ljt-git commented 4 years ago

Hello Jordan - FYI, I have a fix for the problem but am now running into a segfault introduced in a recent commit, which I am tracking down.

ljt-git commented 4 years ago

I checked in a fix for this into master.

Use the command

COUNT N,OFF

to disable flow "keep alive".
