cisco-system-traffic-generator / trex-core


stateful control over rtt+ipg degrades at <=10ms (results in pkts within a flow sent too quickly with same timestamp) #143

Open mcallaghan-sandvine opened 6 years ago

mcallaghan-sandvine commented 6 years ago

== Context ==

Have been evaluating the accuracy of TRex stateful mode (for primary use of traffic profiles), and this report focuses on rtt/ipg (continuation of #142 and #137). EDIT/CLARITY: this was not for Advanced Stateful (ASTF) -- just normal stateful.

The goal was to assess how accurately TRex controls rtt/ipg. For these results we kept rtt==ipg for simplicity (despite #137/#142) and executed a series of tests with controlled inputs, varying rtt/ipg.

== Method ==

Using TRex stateful mode, a YAML config, and a .pcap, send a single TCP flow at 1 cps for 30 s, and capture all packets at a machine-in-the-middle between the TRex client/server ports. Using tshark, extract tcp.time_delta for every packet (excluding the first SYN packet) and observe the accuracy of time_delta compared to the requested ipg/rtt configuration. Calculate the standard deviation and coefficient of variation to compare across all sample sets.

== Summary ==

*Note: "accuracy tolerance" is undefined as far as I can tell from the TRex documentation, so this report is predicated on my own expectations of tolerance, evaluated intuitively for what seems acceptable within margins. (Nonetheless, raw data is supplied so that the core team can make their own assessments and conclusions.)

High level takeaway for accuracy:

  1. 100ms is within tolerance (<1% CV)
  2. 10ms starts to show issues, though possibly "acceptable at aggregate scale" (~5-7% CV)
  3. 1ms unacceptable (~55% CV)
  4. <1ms is unusable (CV blows up into the millions of percent)

The "issue" observed here is that as rtt/ipg is lowered, the probability that TRex sends >1 pkt within a flow at the same time increases. So much so that below 1ms rtt/ipg, MOST packets within a flow are sent at the same time, rendering the tool's output useless.
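
To quantify "sent at the same time", here is a minimal sketch (assuming the tshark-extracted tcp.time_delta values are saved one per line to a text file; the filename below is hypothetical) that reports what fraction of the deltas are exactly zero:

```python
#!/usr/bin/env python3
# Count packets emitted with a zero tcp.time_delta, i.e. sent in the same
# timestamp "burst". Input: one delta per line, as produced by the tshark
# command shown later in this report. The filename is a placeholder.
import sys

def zero_delta_fraction(path):
    deltas = [float(x) for x in open(path) if x.strip()]
    zeros = sum(1 for d in deltas if d == 0.0)
    return zeros, len(deltas)

if __name__ == "__main__":
    path = sys.argv[1] if len(sys.argv) > 1 else "deltas_10us.txt"
    zeros, total = zero_delta_fraction(path)
    if total:
        print(f"{zeros}/{total} packets ({100.0 * zeros / total:.1f}%) had a 0.000000 time_delta")
```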

== Raw Data ==

TRex configuration:

~/trex/v2.43$ cat ./one_flow.yaml
- duration : 9999
  generator :
          distribution : "seq"
          clients_start : "4.0.0.1"
          clients_end   : "4.0.0.1"
          servers_start : "5.0.0.1"
          servers_end   : "5.0.20.255"
  cap_ipg    : false
  cap_info :
     - name: trex-temp/v4_TCP_http_get_foo.cap
       w   : 1
       cps : 1
       ipg : X
       rtt : X

(duration is overridden at the CLI via -d; same client always, client_src_port changes, server IP changes)

TRex invocation:

sudo ./t-rex-64 -f ./one_flow.yaml -c 1 -m 1 -d 30

(single flow, single core, 1x multiplier, send for 30 seconds so ~30x instances of the flow)

Source Flow (pcap): The flow used for this test is a standard (read: typical) IPv4 TCP HTTP_GET flow, specifically for youtube (attached the flow used). Note that the timestamps in that flow are all 0.000000; this is due to how we generated (forged) the flow. This detail should be irrelevant, however, since we're having TRex control rtt/ipg per the template configuration (which is proven to work fine in the 100ms scenario), and it ignores the packet timestamps in the file anyway.

(had trouble attaching it, simplistic view:)

tshark Data Gather:

From the capture taken per test iteration, the following tshark invocation was used to parse it; the output was dumped into a LibreOffice spreadsheet for mathematical processing.

tshark -r ./trex_tcp_sample_10us_rtt_C.pcap -Y 'ip.addr == 4.0.0.1' -T fields -e tcp.time_delta -Y "!(tcp.flags == 0x002)"

Filter on the client IP (though that's all we sent from anyway), display tcp.time_delta (our primary target for analysis), and exclude the first SYN packet of each flow, because its time delta is always zero and is not of interest to our analysis.

The Math

See the attached mathematical_comparisons_of_accuracy_tcp.time_delta.ods. For each series of tests, I conducted THREE iterations (A, B, and C). C was ultimately the most clean and consistent, but A and B show similar results even with smaller datasets.
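
For reference, a minimal sketch of how the rows below can be reproduced from a one-delta-per-line text file. This is my reconstruction of the spreadsheet math, not the .ods itself; the CV printed here divides the standard deviation by the measured average, which matches the >=1ms columns (the attached spreadsheet appears to divide by a different central value in the sub-millisecond columns). Filenames are placeholders.

```python
#!/usr/bin/env python3
# Rebuild the summary rows (MIN/MAX/AVG/STD_DV/CV) for one sample set from a
# file of tcp.time_delta values, given the requested ipg/rtt in seconds.
import statistics
import sys

def summarize(path, requested_ipg):
    deltas = [float(x) for x in open(path) if x.strip()]
    avg = statistics.mean(deltas)
    std = statistics.stdev(deltas)   # sample standard deviation
    print(f"ABSOLUTE {requested_ipg:.9f}")
    print(f"MIN      {min(deltas):.9f}")
    print(f"MAX      {max(deltas):.9f}")
    print(f"AVG      {avg:.9f}")
    print(f"STD_DV   {std:.9f}")
    print(f"CV       {100.0 * std / avg:.2f}%")

if __name__ == "__main__":
    # e.g. python3 summarize.py deltas_10ms.txt 0.010
    summarize(sys.argv[1], float(sys.argv[2]))
```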

SAMPLESET A (FIRST CAPTURES, RANDOM DURATIONS)

|          | 100ms       | 10ms        | 1ms         | 100us        | 10us         |
| -------- | ----------- | ----------- | ----------- | ------------ | ------------ |
| ABSOLUTE | 0.100000000 | 0.010000000 | 0.001000000 | 0.000100000  | 0.000010000  |
| MIN      | 0.097110000 | 0.007603000 | 0.000095000 | 0.000000000  | 0.000000000  |
| MAX      | 0.102918000 | 0.011907000 | 0.003504000 | 0.001002000  | 0.001001000  |
| AVG      | 0.099930085 | 0.010001136 | 0.001016898 | 0.000135169  | 0.000084000  |
| MEAN     | 0.100013000 | 0.010001000 | 0.001000000 | 0.00000000   | 0.00000000   |
| STD_DV   | 0.000800865 | 0.000691828 | 0.000543267 | 0.000298314  | 0.000203923  |
| RSD      | 0.80%       | 6.92%       | 54.33%      | 29831354.44% | 20392332.91% |

SAMPLESET B (SECOND SET, MOSTLY 10s durs, one 20s)

|          | 100ms      | 10ms       | 1ms        | 100us        | 10us         |
| -------- | ---------- | ---------- | ---------- | ------------ | ------------ |
| ABSOLUTE | 0.10000000 | 0.01000000 | 0.00100000 | 0.00010000   | 0.00001000   |
| MIN      | 0.09841500 | 0.00789800 | 0.00009600 | 0.00000000   | 0.00000000   |
| MAX      | 0.10141100 | 0.01290300 | 0.00360200 | 0.00248700   | 0.00100000   |
| AVG      | 0.09989641 | 0.00995364 | 0.00101673 | 0.00017895   | 0.00007022   |
| MEAN     | 0.10001300 | 0.01000100 | 0.00100000 | 0.00000000   | 0.00000000   |
| STD_DV   | 0.00050270 | 0.00052320 | 0.00055397 | 0.00042885   | 0.00017322   |
| RSD      | 0.50%      | 5.23%      | 55.40%     | 42884843.49% | 17322230.76% |

SAMPLESET C (THIRD SET, MOST CLEAN AND CONSISTENT)

|          | 100ms      | 10ms       | 1ms        | 100us        | 10us         |
| -------- | ---------- | ---------- | ---------- | ------------ | ------------ |
| ABSOLUTE | 0.10000000 | 0.01000000 | 0.00100000 | 0.00010000   | 0.00001000   |
| MIN      | 0.09641300 | 0.00730300 | 0.00009700 | 0.00000000   | 0.00000000   |
| MAX      | 0.10271500 | 0.01189900 | 0.00389800 | 0.00100000   | 0.00100100   |
| AVG      | 0.09989308 | 0.00993159 | 0.00103373 | 0.00010114   | 0.00005339   |
| MEAN     | 0.10001300 | 0.01000100 | 0.00100000 | 0.00000000   | 0.00000000   |
| STD_DV   | 0.00085254 | 0.00052742 | 0.00065473 | 0.00025019   | 0.00013419   |
| RSD      | 0.85%      | 5.27%      | 65.47%     | 25018892.57% | 13419196.45% |

== Appendix ==

Initially I had conducted the analysis on tcp.analysis.ack_rtt; however, this was a big failure since MANY of the TCP control packets are sent at the same time, so the analysis capabilities are ruined :( -- nonetheless, I included the data anyway (attached as mathematical_comparisons_of_accuracy_tcp.analysis.ack_rtt.ods). It was after this that I refocused on time delta, which was more direct to my goal anyway.

mcallaghan-sandvine commented 6 years ago

(unable to UPLOAD into github issue :( -- so uploaded into a temp branch in my fork) https://github.com/mcallaghan-sandvine/trex-core/tree/issue_143_temp_files/issue_143_temp_files

hhaim commented 6 years ago

This is not a real issue. With ASTF there is a flag called accurate-schedule that "solves" this. The issue is that we gather 32 packets in a table and flush them together; the flush is scheduled every 1 msec.

At a reasonable traffic rate (more than 1Gb/s) the accuracy will be about 10 usec. The problem is that you won't be able to capture using tshark at a high rate.

Try to enable DPI on your DUT and verify it.

Thanks, Hanoh


mcallaghan-sandvine commented 6 years ago

This is not ASTF, just normal stateful. (as per https://trex-tgn.cisco.com/trex/doc/trex_manual.html)

PS: indeed captures were taken using machine-in-the-middle (where "machine" is our DPI hardware platform)

hhaim commented 6 years ago

It is not exposed in STF. We are working on ASTF right now.

Thanks, Hanoh


mcallaghan-sandvine commented 6 years ago

this is an accuracy issue with stateful based on my results

what do you mean by "we are working on ASTF right now"? (is stateful deprecated or something?) -> doubts expanded in https://groups.google.com/forum/#!topic/trex-tgn/mkSgXUBVtiA

does your comment about "at reasonable rates, accuracy is ~10us" apply to normal stateful?

mcallaghan-sandvine commented 6 years ago

Of subsequent note: I just tested using cap_ipg=true on a different TCP HTTP_GET flow, which had an RTT of ~250-800us and inter-packet-delay=0, with otherwise the same TRex config/conditions, and the accuracy of the packets output was similar (all over the map) -- each time TRex sent that flow, the pkt delays were inconsistent.


|        | SRC        | A          | B          |
| ------ | ---------- | ---------- | ---------- |
| MIN    | 0.00000000 | 0.00000000 | 0.00000000 |
| MAX    | 0.00079900 | 0.00390000 | 0.00239900 |
| AVG    | 0.00032300 | 0.00038742 | 0.00036529 |
| MEAN   | 0.00020150 | 0.00009600 | 0.00010000 |
| STD_DV | 0.00029848 | 0.00076655 | 0.00059123 |
| RSD    | 148.13%    | 798.49%    | 591.23%    |

mcallaghan-sandvine commented 6 years ago

FYI: I have now disproven the "TRex is accurate at higher bitrates" claim.

New tests:

invocation:

sudo ./t-rex-64 -f ./LAB-5701_limitation_pkts_sizes_etc/one_flow.yaml -c 4 -m 10 -d 999

config:

one_flow.yaml
- duration : 9999
  generator :
          distribution : "seq"
          clients_start : "4.0.0.1"
          clients_end   : "4.0.0.99"
          servers_start : "5.0.0.1"
          servers_end   : "5.0.20.255"
  cap_ipg  : true
  cap_info :
     - name: http_get_1-1mbytes.pcap
       w   : 1
       cps : 10
       ipg : 1000
       rtt : 1000

This 1M-byte file is a SINGLE flow, simple HTTP_GET + a bunch of HTTP data continuation pkts

$ tshark -Q -z io,stat,60 -r LAB-5701_limitation_pkts_sizes_etc/http_get_1-1mbytes.pcap

=====================================
| IO Statistics                     |
|                                   |
| Duration: 8.386 secs              |
| Interval: 8.386 secs              |
|                                   |
| Col 1: Frames and bytes           |
|-----------------------------------|
|                |1                 |
| Interval       | Frames |  Bytes  |
|-----------------------------------|
| 0.000 <> 8.386 |   1161 | 1077722 |
=====================================

SOURCE PCAP: https://github.com/mcallaghan-sandvine/trex-core/blob/issue_143_temp_files/issue_143_temp_files/http_get_1-1mbytes.zip

It's quite simple, take the first 10 pkts or so:

$ tshark -r http_get_1-1mbytes.pcap  | head
    1   0.000000      4.0.0.0 → 5.0.0.0      TCP 74 31193 → 9907 [SYN] Seq=0 Win=8192 Len=0 MSS=1460 WS=16 SACK_PERM=1 TSval=92887655 TSecr=0
    2   0.000734      5.0.0.0 → 4.0.0.0      TCP 74 9907 → 31193 [SYN, ACK] Seq=0 Ack=1 Win=8192 Len=0 MSS=1460 WS=16 SACK_PERM=1 TSval=3338110980 TSecr=92887655
    3   0.000997      4.0.0.0 → 5.0.0.0      TCP 66 31193 → 9907 [ACK] Seq=1 Ack=1 Win=8688 Len=0 TSval=92887659 TSecr=3338110980
    4   0.004997      4.0.0.0 → 5.0.0.0      HTTP 608 GET /firefox?client=firefox-a&rls=org.mozilla:en-US:official HTTP/1.1 
    5   0.009730      5.0.0.0 → 4.0.0.0      HTTP 604 HTTP/1.1 302 Found  (text/html)
    6   0.016944      5.0.0.0 → 4.0.0.0      HTTP 1514 Continuation
    7   0.017000      4.0.0.0 → 5.0.0.0      TCP 66 31193 → 9907 [ACK] Seq=543 Ack=1987 Win=7232 Len=0 TSval=92887740 TSecr=3338111005
    8   0.021750      5.0.0.0 → 4.0.0.0      HTTP 1514 Continuation
    9   0.022001      4.0.0.0 → 5.0.0.0      TCP 66 31193 → 9907 [ACK] Seq=543 Ack=3435 Win=8688 Len=0 TSval=92887764 TSecr=3338111005
   10   0.033753      5.0.0.0 → 4.0.0.0      HTTP 1514 Continuation

, and the last bit of the flow:

$ tshark -r http_get_1-1mbytes.pcap  | tail
 1152   8.253096      4.0.0.0 → 5.0.0.0      TCP 66 31193 → 9907 [ACK] Seq=543 Ack=996763 Win=7232 Len=0 TSval=92928902 TSecr=3338151936
 1153   8.264448      5.0.0.0 → 4.0.0.0      HTTP 1514 Continuation
 1154   8.265100      4.0.0.0 → 5.0.0.0      TCP 66 31193 → 9907 [ACK] Seq=543 Ack=998211 Win=8688 Len=0 TSval=92928962 TSecr=3338151937
 1155   8.276451      5.0.0.0 → 4.0.0.0      HTTP 1514 Continuation
 1156   8.377325      4.0.0.0 → 5.0.0.0      TCP 66 31193 → 9907 [ACK] Seq=543 Ack=999659 Win=8688 Len=0 TSval=92929522 TSecr=3338152116
 1157   8.384470      5.0.0.0 → 4.0.0.0      HTTP 946 Continuation
 1158   8.385325      4.0.0.0 → 5.0.0.0      TCP 66 31193 → 9907 [FIN, ACK] Seq=543 Ack=1000539 Win=7808 Len=0 TSval=92929561 TSecr=3338152858
 1159   8.385461      5.0.0.0 → 4.0.0.0      TCP 66 9907 → 31193 [ACK] Seq=1000539 Ack=544 Win=8192 Len=0 TSval=3338152897 TSecr=92929561
 1160   8.385634      5.0.0.0 → 4.0.0.0      TCP 66 9907 → 31193 [FIN, ACK] Seq=1000539 Ack=544 Win=8192 Len=0 TSval=3338152898 TSecr=92929561
 1161   8.386324      4.0.0.0 → 5.0.0.0      TCP 66 31193 → 9907 [ACK] Seq=544 Ack=1000540 Win=7792 Len=0 TSval=92929567 TSecr=3338152898

The sad outcome from TRex is that all sorts of packets are sent mixed up and without proper pcap inter-packet-delay (or RTT).

It does at least send all the packets we expect in the flow:

$ tshark -Q -z io,stat,60 -r trex_sampling_single_host_1Gbps_FIRST_FLOW.pcap

=====================================
| IO Statistics                     |
|                                   |
| Duration: 8.385 secs              |
| Interval: 8.385 secs              |
|                                   |
| Col 1: Frames and bytes           |
|-----------------------------------|
|                |1                 |
| Interval       | Frames |  Bytes  |
|-----------------------------------|
| 0.000 <> 8.385 |   1161 | 1077722 |
=====================================

But the timings are all off

$ tshark -r trex_sampling_single_host_1Gbps_FIRST_FLOW.pcap | head -20
    1   0.000000      4.0.0.1 → 5.0.0.1      0 Sandvine_19:f8:6b → Sandvine_19:f8:eb TCP 74 41668 → 9907 [SYN] Seq=0 Win=8192 Len=0 MSS=1460 WS=16 SACK_PERM=1 TSval=92887655 TSecr=0
    2   0.000000      4.0.0.1 → 5.0.0.1      1 Sandvine_19:f8:6b → Sandvine_19:f8:eb TCP 66 41668 → 9907 [ACK] Seq=1 Ack=1 Win=8688 Len=0 TSval=92887659 TSecr=3338110980
    3   0.000691      5.0.0.1 → 4.0.0.1      0 Sandvine_19:f8:6b → Sandvine_19:f8:ea TCP 74 9907 → 41668 [SYN, ACK] Seq=0 Ack=1 Win=8192 Len=0 MSS=1460 WS=16 SACK_PERM=1 TSval=3338110980 TSecr=92887655
    4   0.004688      4.0.0.1 → 5.0.0.1      1 Sandvine_19:f8:6b → Sandvine_19:f8:eb HTTP 608 GET /firefox?client=firefox-a&rls=org.mozilla:en-US:official HTTP/1.1 
    5   0.008998      5.0.0.1 → 4.0.0.1      1 Sandvine_19:f8:6b → Sandvine_19:f8:ea HTTP 604 HTTP/1.1 302 Found  (text/html)
    6   0.015992      4.0.0.1 → 5.0.0.1      543 Sandvine_19:f8:6b → Sandvine_19:f8:eb TCP 66 [TCP ACKed unseen segment] 41668 → 9907 [ACK] Seq=543 Ack=1987 Win=7232 Len=0 TSval=92887740 TSecr=3338111005
    7   0.015992      5.0.0.1 → 4.0.0.1      539 Sandvine_19:f8:6b → Sandvine_19:f8:ea HTTP 1514 [TCP Spurious Retransmission] Continuation
    8   0.020998      5.0.0.1 → 4.0.0.1      1987 Sandvine_19:f8:6b → Sandvine_19:f8:ea HTTP 1514 Continuation
    9   0.021991      4.0.0.1 → 5.0.0.1      543 Sandvine_19:f8:6b → Sandvine_19:f8:eb TCP 66 41668 → 9907 [ACK] Seq=543 Ack=3435 Win=8688 Len=0 TSval=92887764 TSecr=3338111005
   10   0.032998      5.0.0.1 → 4.0.0.1      3435 Sandvine_19:f8:6b → Sandvine_19:f8:ea HTTP 1514 Continuation
   11   0.044996      5.0.0.1 → 4.0.0.1      4883 Sandvine_19:f8:6b → Sandvine_19:f8:ea HTTP 1514 Continuation
   12   0.045995      4.0.0.1 → 5.0.0.1      543 Sandvine_19:f8:6b → Sandvine_19:f8:eb TCP 66 41668 → 9907 [ACK] Seq=543 Ack=6331 Win=7232 Len=0 TSval=92887884 TSecr=3338111064
   13   0.056997      5.0.0.1 → 4.0.0.1      6331 Sandvine_19:f8:6b → Sandvine_19:f8:ea HTTP 1514 Continuation
   14   0.057996      4.0.0.1 → 5.0.0.1      543 Sandvine_19:f8:6b → Sandvine_19:f8:eb TCP 66 41668 → 9907 [ACK] Seq=543 Ack=7779 Win=8688 Len=0 TSval=92887944 TSecr=3338111064
   15   0.068999      5.0.0.1 → 4.0.0.1      7779 Sandvine_19:f8:6b → Sandvine_19:f8:ea HTTP 1514 Continuation
   16   0.081001      5.0.0.1 → 4.0.0.1      9227 Sandvine_19:f8:6b → Sandvine_19:f8:ea HTTP 1514 Continuation
   17   0.081999      4.0.0.1 → 5.0.0.1      543 Sandvine_19:f8:6b → Sandvine_19:f8:eb TCP 66 41668 → 9907 [ACK] Seq=543 Ack=10675 Win=7232 Len=0 TSval=92888064 TSecr=3338111089
   18   0.093002      5.0.0.1 → 4.0.0.1      10675 Sandvine_19:f8:6b → Sandvine_19:f8:ea HTTP 1514 Continuation
   19   0.094000      4.0.0.1 → 5.0.0.1      543 Sandvine_19:f8:6b → Sandvine_19:f8:eb TCP 66 41668 → 9907 [ACK] Seq=543 Ack=12123 Win=8688 Len=0 TSval=92888124 TSecr=3338111209
   20   0.105004      5.0.0.1 → 4.0.0.1      12123 Sandvine_19:f8:6b → Sandvine_19:f8:ea HTTP 1514 Continuation

mid-flow issues

$ tshark -r trex_sampling_single_host_1Gbps_FIRST_FLOW.pcap | head -180 | tail -20
  161   1.125038      5.0.0.1 → 4.0.0.1      135203 Sandvine_19:f8:6b → Sandvine_19:f8:ea HTTP 1514 Continuation
  162   1.125939      4.0.0.1 → 5.0.0.1      543 Sandvine_19:f8:6b → Sandvine_19:f8:eb TCP 66 41668 → 9907 [ACK] Seq=543 Ack=136651 Win=7232 Len=0 TSval=92893284 TSecr=3338116309
  163   1.137040      5.0.0.1 → 4.0.0.1      136651 Sandvine_19:f8:6b → Sandvine_19:f8:ea HTTP 1514 Continuation
  164   1.137939      4.0.0.1 → 5.0.0.1      543 Sandvine_19:f8:6b → Sandvine_19:f8:eb TCP 66 41668 → 9907 [ACK] Seq=543 Ack=138099 Win=8688 Len=0 TSval=92893344 TSecr=3338116429
  165   1.149943      5.0.0.1 → 4.0.0.1      138099 Sandvine_19:f8:6b → Sandvine_19:f8:ea HTTP 1514 Continuation
  166   1.161945      4.0.0.1 → 5.0.0.1      543 Sandvine_19:f8:6b → Sandvine_19:f8:eb TCP 66 [TCP ACKed unseen segment] 41668 → 9907 [ACK] Seq=543 Ack=140995 Win=7232 Len=0 TSval=92893464 TSecr=3338116489
  167   1.161945      5.0.0.1 → 4.0.0.1      139547 Sandvine_19:f8:6b → Sandvine_19:f8:ea HTTP 1514 [TCP Spurious Retransmission] Continuation
  168   1.173945      4.0.0.1 → 5.0.0.1      543 Sandvine_19:f8:6b → Sandvine_19:f8:eb TCP 66 [TCP ACKed unseen segment] 41668 → 9907 [ACK] Seq=543 Ack=142443 Win=8688 Len=0 TSval=92893524 TSecr=3338116490
  169   1.173945      5.0.0.1 → 4.0.0.1      140995 Sandvine_19:f8:6b → Sandvine_19:f8:ea HTTP 1514 [TCP Spurious Retransmission] Continuation
  170   1.185947      5.0.0.1 → 4.0.0.1      142443 Sandvine_19:f8:6b → Sandvine_19:f8:ea HTTP 1514 Continuation
  171   1.197949      4.0.0.1 → 5.0.0.1      543 Sandvine_19:f8:6b → Sandvine_19:f8:eb TCP 66 [TCP ACKed unseen segment] 41668 → 9907 [ACK] Seq=543 Ack=145339 Win=7232 Len=0 TSval=92893644 TSecr=3338116669
  172   1.197949      5.0.0.1 → 4.0.0.1      143891 Sandvine_19:f8:6b → Sandvine_19:f8:ea HTTP 1514 [TCP Spurious Retransmission] Continuation
  173   1.209950      4.0.0.1 → 5.0.0.1      543 Sandvine_19:f8:6b → Sandvine_19:f8:eb TCP 66 [TCP ACKed unseen segment] 41668 → 9907 [ACK] Seq=543 Ack=146787 Win=8688 Len=0 TSval=92893704 TSecr=3338116669
  174   1.209950      5.0.0.1 → 4.0.0.1      145339 Sandvine_19:f8:6b → Sandvine_19:f8:ea HTTP 1514 [TCP Spurious Retransmission] Continuation
  175   1.221952      5.0.0.1 → 4.0.0.1      146787 Sandvine_19:f8:6b → Sandvine_19:f8:ea HTTP 1514 Continuation
  176   1.233954      4.0.0.1 → 5.0.0.1      543 Sandvine_19:f8:6b → Sandvine_19:f8:eb TCP 66 [TCP ACKed unseen segment] 41668 → 9907 [ACK] Seq=543 Ack=149683 Win=7232 Len=0 TSval=92893824 TSecr=3338116849
  177   1.233954      5.0.0.1 → 4.0.0.1      148235 Sandvine_19:f8:6b → Sandvine_19:f8:ea HTTP 1514 [TCP Spurious Retransmission] Continuation
  178   1.245955      4.0.0.1 → 5.0.0.1      543 Sandvine_19:f8:6b → Sandvine_19:f8:eb TCP 66 [TCP ACKed unseen segment] 41668 → 9907 [ACK] Seq=543 Ack=151131 Win=8688 Len=0 TSval=92893885 TSecr=3338116850
  179   1.245955      5.0.0.1 → 4.0.0.1      149683 Sandvine_19:f8:6b → Sandvine_19:f8:ea HTTP 1514 [TCP Spurious Retransmission] Continuation
  180   1.257956      5.0.0.1 → 4.0.0.1      151131 Sandvine_19:f8:6b → Sandvine_19:f8:ea HTTP 1514 Continuation

The pattern is reproducible. https://github.com/mcallaghan-sandvine/trex-core/blob/issue_143_temp_files/issue_143_temp_files/trex_sampling_single_host_1Gbps_FIRST_FLOW.pcap

hhaim commented 6 years ago

This is expected in STF mode. Read carefully the notes in section 4.1 of the STF manual. Either disable pcap ipg, use an offline tool to create RTT between c->s, or use ASTF.

Hanoh


mcallaghan-sandvine commented 6 years ago

@hhaim

This is expected in STF mode. Read carefully the notes in 4.1 STF manual.

, which notes? (link me) - I was unable to find any note about such caveats/limitations

disable pcap ipg

, are you suggesting that user-controlled inter-packet-delay in the YAML per flow template would be accurate to 10us?

or use offline tool to create RTT between c->s

, what does this mean? (I already did this, and saved to pcap accordingly)

or use ASTF

, ASTF won't scale the same as STF does


Our goal here is to ensure that TRex is sending pkts within a reasonable error margin of either:

  1. the pcap's inter-packet-delay per flow
  2. the YAML's defined inter-packet-delay (which currently TRex calls ipg/rtt due to #142)
hhaim commented 6 years ago

In basic usage, TRex does not wait for an initiator packet to be received; the response packet is triggered based only on a timeout (the IPG in this example). In advanced scenarios (for example, NAT), the first packet of the flow can be processed by TRex software, and the response packet is initiated only when a packet is received. Consequently, it is necessary to process the template pcap file offline and ensure that there is enough round-trip delay (RTT) between client and server packets. One approach is to record the flow with a Pagent that creates RTT (10 msec RTT in the example), recording the traffic at some distance from both the client and server (not close to either side). This ensures sufficient delay so that packets from each side arrive at the DUT without delay. TRex-dev will work on an offline tool that will make it even simpler. Another approach is to change the yaml ipg field to a high enough value (bigger than 10 msec).
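
To illustrate the "process the template pcap file offline" suggestion, here is a rough sketch (assuming scapy is available; the input name is the pcap from the tests above, while the output name, the helper itself, and the 10 msec gap are assumptions taken from this thread -- this is not an official TRex tool) that pads every change of direction so the capture looks like it was recorded mid-path with roughly 10 msec of RTT:

```python
#!/usr/bin/env python3
# Hypothetical offline pre-processing step: whenever the sending side changes,
# make sure at least MIN_GAP seconds separate the two packets, shifting all
# later packets forward so intra-direction timing is preserved.
from scapy.all import rdpcap, wrpcap, IP

MIN_GAP = 0.010  # seconds; the ~10 msec figure discussed in this thread

def pad_rtt(in_pcap, out_pcap, min_gap=MIN_GAP):
    pkts = rdpcap(in_pcap)
    shift = 0.0        # cumulative time added so far
    prev_src = None
    prev_time = None
    for p in pkts:
        t = float(p.time) + shift
        src = p[IP].src if IP in p else prev_src
        if prev_time is not None and src != prev_src and (t - prev_time) < min_gap:
            extra = min_gap - (t - prev_time)   # widen the gap on direction change
            shift += extra
            t += extra
        p.time = t
        prev_src, prev_time = src, t
    wrpcap(out_pcap, pkts)

if __name__ == "__main__":
    pad_rtt("http_get_1-1mbytes.pcap", "http_get_1-1mbytes_rtt10ms.pcap")
```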


mcallaghan-sandvine commented 6 years ago

FYI: I just disproved the idea RE the YAML too --

This came from the same setup as two posts ago, but with cap_ipg=false and requesting 100us ipg/rtt:

$ cat one_flow.yaml
- duration : 9999
  generator :
          distribution : "seq"
          clients_start : "4.0.0.1"
          clients_end   : "4.0.0.99"
          servers_start : "5.0.0.1"
          servers_end   : "5.0.20.255"
  cap_ipg  : false
  cap_info :
     - name: http_get_1-1mbytes.pcap
       w   : 1
       cps : 10
       ipg : 100
       rtt : 100

see https://github.com/mcallaghan-sandvine/trex-core/blob/issue_143_temp_files/issue_143_temp_files/trex_sampling_single_host_1Gbps_100us_ipg_FIRST_FLOW.pcap

mcallaghan-sandvine commented 6 years ago

btw it's not realistic to have ipg/rtt >1ms in many scenarios (certainly some real-world scenarios have >10ms latency -- but for flows/streams that are heavily asymmetric, with many pkts blasting down data payloads, there is nearly no inter-packet-delay for those)

mcallaghan-sandvine commented 6 years ago

PS: the quote Hanoh pasted is from: https://trex-tgn.cisco.com/trex/doc/trex_manual.html#_dns_basic_example

This is a step towards understanding; however, it does not explicitly state that TRex has a limitation here.

  1. If a captured flow's inherent inter-packet-delays are <10ms, since TRex does not wait for initiator/response in STF mode, it may in error send packets out of order (!) WARNING
  2. The flow template (YAML) configuration has granularity support for down to 1us (yet TRex is susceptible to pkt order limitations as per above <10ms)

if this is simply the "reality of the STF world"

  1. can it be fixed ever? (I'm definitely not familiar enough with TRex under-the-hood here to understand why it cannot just replay a flow CLIENT + SERVER side, based exactly upon the pcap's source down to 10us granularity ...)
  2. if we won't ever fix it, then we need to be explicit and clear about these issues/limitations in the documentation, AND, if TRex reads a pcap with inter-packet-delays <10ms, OR the user sets ipg/rtt as such, it should spew a huge warning because it's going to mangle the pkt flow output (a sketch of such a check is below)
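
A minimal sketch of the pre-flight warning proposed in item 2 above (a hypothetical standalone check, not an existing TRex feature; assumes scapy, and the default filename is just the pcap from the earlier test):

```python
#!/usr/bin/env python3
# Hypothetical pre-flight check: warn if a template pcap contains inter-packet
# delays below the 10 msec threshold discussed in this issue.
from scapy.all import rdpcap
import sys

THRESHOLD = 0.010  # seconds

def check_ipg(path, threshold=THRESHOLD):
    times = [float(p.time) for p in rdpcap(path)]
    deltas = [b - a for a, b in zip(times, times[1:])]
    too_small = [d for d in deltas if d < threshold]
    if too_small:
        print(f"WARNING: {len(too_small)}/{len(deltas)} inter-packet delays in {path} "
              f"are below {threshold * 1000:.0f} ms (min {min(deltas) * 1000:.3f} ms); "
              f"STF replay may reorder packets or collapse timestamps.")
    else:
        print(f"{path}: all inter-packet delays >= {threshold * 1000:.0f} ms")

if __name__ == "__main__":
    check_ipg(sys.argv[1] if len(sys.argv) > 1 else "http_get_1-1mbytes.pcap")
```
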
hhaim commented 6 years ago

TRex plays the pcap with exact timing (~10 usec); however, because there are queues (the TRex output queue and the DUT input queue), and the queues can have different lengths, 10 usec could create out-of-order.

What you should do to solve this is use an offline tool that replays the pcap with an RTT of more than 10 msec.

In this way it will still be realistic: bursts from one side will be back to back, but you will not have the out-of-order issue between the two sides.

An RTT of more than 10 msec is normal in realistic use-cases.

The ASTF simulator can be used as an offline tool, but any other means can do the trick. The only important thing is that the pcap is captured in the middle (between client and server).

Have a look at the pcaps of the SFR profile for a reference of pcaps that were offline-processed with RTT=10 msec.

Thanks, Hanoh


mcallaghan-sandvine commented 6 years ago

@hhaim This is a bit frustrating and contradictory. When I initially reported this issue, you responded that TRex was accurate down to 10us at higher bitrates (at the time I had no way to verify, so I tentatively accepted that reasoning). But now, after I have shown the same results even at 1Gbps, the tune has changed and this is expected? (Am I wasting my time?)

We aren't looking to necessarily workaround the issue here (though we would if we have to). Ultimately I wanted to validate TRex accuracy at a range of the ipg/rtt values and using pcap=true. Setting all flows with hard ipg/rtt=10ms is not reasonable. It may be realistic to accept that rtt=10ms, but not ipg, especially when flows are heavily asymmetric ... delaying pkts in a single direction that do not have any response delays is not realistic.

We also do not want all of our pcap sources to have to have a static 10ms latency -- when we leverage real-world captures from the field, the inter-packet-delays may well be lower than 1ms (they will be whatever they were at time of capture; having to post-process them all would be undesirable). RTT MAY be ~10ms, but it would be preferable to keep whatever variety of latency the real-world capture has, and not statically force all TCP RTT delays to 10ms.

We need to be clear that (if so) TRex STF mode does not support less than 10ms granularity for inter-packet-delay (the ipg/rtt settings in YAML), nor does it support reading pcap files that have less than 10ms granularity. (I say "not support" because the intuitive expectation is that TRex will send the flow's packets in order, with the pcap delays or the specified ipg/rtt, without packet timestamp issues or packet reorder issues.)

Unless you're saying TRex core itself "does the right thing" and the issue is rather with queues -- perhaps we need to isolate and narrow down further. (Still, the end result is bad and needs to be root-caused, regardless of whether it is in TRex core or in TRex egress queue servicing.)

I do not think that different length queues (TRex vs DUT) would explain or cause these observed issues. (Queue length differences should only cause dropped packets due to burst) - but certainly there are two other conditions that might cause this (read on)

Next Considerations

  1. Does TRex ensure ALL packets from a specific flow are only ever put on the same egress queue? (this is very important to avoid out of order!) - i.e. consider -c X flag ... using multiple cores may inherently utilize multiple queues?

  2. Can we read/monitor/tune the queues setup by TRex (dpdk) to determine their service frequency? (indeed if an egress queue on the TRex system is only serviced every 100us or 1ms, then certainly this would explain why there are micro bursts of packets with the exact same timestamp instead of the requested 10us or 100us inter-packet-delay)

  3. Could also be DUT-side (which I am using to test these scenarios) -- however, when we use our own internal tools these pkt reordering issues are not present, so I am doubtful that the issue lies here. Nonetheless, thoughts are welcome on how I can validate this without a DUT and test only TRex loopback ports. (How to capture on those? I may need to invest time learning how to do this on DPDK-attached interfaces.)

hhaim commented 6 years ago

The TRex scheduler (in all modes) is accurate to the 10 usec range. At very low speed (1kpps) it flushes every 1 msec. When using 2 interfaces with real-world queues (DUT and TRex) there is no way to guarantee order between the 2 interfaces' queues. TRex can guarantee order only in one direction (for example a unidirectional UDP RTP flow). You must have an RTT higher than your queue size/latency; this is how STF works. A 10 msec RTT is recommended.

ASTF can work with a lower RTT and simulate a higher RTT in real time (without an offline tool) because it has a TCP stack and feedback.

The above is explained in the manual.

See inline for your other questions

Thanks, Hanoh

On Wed, 26 Sep 2018 at 15:01 Matt Callaghan notifications@github.com wrote:

@hhaim https://github.com/hhaim This is a bit frustrating and contradictory. When I initially reported this issue, you responded that TRex was accurate down to 10us at higher bitrates, (at the time I had no way to verify, so I tentatively accepted that reasoning). But now, after I showed results the same even at 1Gbps, now the tune has changed and we're saying that this is expected? (Am I wasting my time?)

We aren't looking to necessarily workaround the issue here (though we would if we have to). Ultimately I wanted to validate TRex accuracy at a range of the ipg/rtt values and using pcap=true. Setting all flows with hard ipg/rtt=10ms is not reasonable. It may be realistic to accept that rtt=10ms, but not ipg, especially when flows are heavily asymmetric ... delaying pkts in a single direction that do not have any response delays is not realistic.

We also do not want all of our pcap sources to have to have a static 10ms latency -- when we leverage real-world captures from the field, the inter-packet-delays may well be lower than 1ms. (they will be whatever they will be at time of capture - having to post-process them all would be undesirable)

[hh] not possible with STF

We need to be clear, that (if) TRex STF mode does not support less than 10ms granularity for inter-packet-delay (ipg/rtt settings in YAML) nor does it support reading pcap files that have less than 10ms granularity. (I'm saying "not support" because the intuitive expectation is that TRex will send the flow's packets in order, with pcap delays or specified ipg/rtt without packet timestamp issues or packet reorder issues).

[hh] TRex sends the packets in order; the DUT queue will mess things up at higher speed because of the latency. STF does not have feedback while ASTF has.

Unless you're saying TRex core itself "does the right thing" and it is an issue rather with queues, perhaps we need to isolate and narrow down further. (still the end result is bad, needs to be root caused regardless of whether or not its in TRex core or TRex egress queue servicing issues)

[hh] correct. You can verify this using -p. All the flow packets will be sent from one queue. No out of order.

I do not think that different length queues (TRex vs DUT) would explain or cause these observed issues. (Queue length differences should only cause dropped packets due to burst) - but certainly there are two other conditions that might cause this (read on)

Next Considerations

  1. Does TRex ensure ALL packets from a specific flow are only ever put on the same egress queue? (this is very important to avoid out of order!) - i.e. consider the -c X flag ... using multiple cores may inherently utilize multiple queues?

[hh] not relevant. Every flow direction will be put in order. The problem is the order between the two sides.

  2. Can we read/monitor/tune the queues set up by TRex (DPDK) to determine their service frequency? (indeed if an egress queue on the TRex system is only serviced every 100us or 1ms, then certainly this would explain why there are micro bursts of packets with the exact same timestamp instead of the requested 10us or 100us inter-packet-delay)

[hh] the latency stream can verify this

  3. Could also be DUT-side (which I am using to test these scenarios) - however, when we use our own internal tools these pkt reordering issues are not present, so I am doubtful that the issue lies here. Nonetheless, thoughts are welcome on how I can validate this without a DUT and test only TRex loopback ports. (how to capture on those? I may need to invest time learning how to do this on DPDK-attached interfaces)

[hh] this is because the tool has feedback, like ASTF. With STF there is no feedback. I assume ASTF is much faster than your local internal tools.


mcallaghan-sandvine commented 6 years ago

Thanks Hanoh! Looks like I have a bit more work ahead of me to try to isolate further and verify our assumptions and current hypotheses.

mcallaghan-sandvine commented 6 years ago

PS: just so that I am clear, to elaborate on my concern of "real world inaccuracies":

  1. having an RTT >1ms, >10ms, etc. is completely normal in the real world
  2. having an inter-packet-delay (aka in TRex ipg) >10ms for ALL packets is not realistic

consider, from a real-world netflix capture:

  • 18k packets
  • ~45 seconds duration of flow (stream)
  • ~21MB capture size
  • only 500 packets exceed 1ms time delta
  • 3.8k packets (20%) are between 100us and 1ms
  • 13k packets (75%) are <100us

when we replay w/ a 10ms forced delay (to avoid TRex fubbing up the timestamps and sending out-of-order packets), we introduce a significantly longer streaming flow (which is probably completely invalid ... in the real world, packets coming in that slowly would buffer the video all to hell)

when I replay that 45sec netflix flow w/ ipg=10ms, TRex does:

  • 18k packets (yay!)
  • 220 seconds duration of flow *** ❗️ (boo!)
  • ~21MB (yay!)
  • 100% of pkts inter-packet-delay=10ms (as-per-workaround/limitation)

hhaim commented 6 years ago

This is correct. TRex STF, when replaying a pcap with two sides, requires:

  1. Pcap RTT will be bigger than 10msec (latency of the DUT).
  2. The capture location should be in the middle (Time between SYN to SYN-ACK at least 10msec, time between SYN-ACK to ACK 10msec)

It is not required to have IPG bigger than 10msec.

For #1,#2 it is required to offline process the pcap file.

Hanoh


hhaim commented 6 years ago

please close this if there are no other questions regarding this

mcallaghan-sandvine commented 6 years ago

We will be working internally to try to understand the queue min-latency of the system, and if there is a way to reduce it (service it more frequently).

Once we understand, then perhaps we can decide how to move forward. This is certainly a major gap in the documentation which (at minimum) needs to be fixed. Users should not set ipg/rtt to <10ms, OR, if using the ipg from the pcap, they need to ensure post-processing guarantees the capture has >10ms ipg/rtt -- else they suffer out-of-order packets and +/-0ms timestamps between directions.

(it's really a huge issue here to understand)

hhaim commented 6 years ago

This is fundamental to how STF works; if the manual is not clear about that, you can come up with a pull request.

thanks, Hanoh
