edgecomllc / eupf

5G User Plane Function (UPF) based on eBPF
Apache License 2.0

tests: run multiple simultaneous tcpreplay tests #450

Closed kade-ddnkv closed 11 months ago

kade-ddnkv commented 11 months ago

For now, I am using TCPREPLAY_LIMIT=70000. If I increase it (e.g. to 700000), the test sometimes (not always) fails with the following error message:

Perform load test                                                     ...Second signal will force exit.
Second signal will force exit.
.[ ERROR ] tcpreplay: execution failed (type)                                 
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/scapy/sendrecv.py", line 549, in sendpfast
    stdout, stderr = cmd.communicate()
                     ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/subprocess.py", line 1209, in communicate
    stdout, stderr = self._communicate(input, endtime, timeout)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/subprocess.py", line 2108, in _communicate
    ready = selector.select(timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/selectors.py", line 415, in select
    fd_event_list = self._selector.poll(timeout)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/robot/running/signalhandler.py", line 40, in __call__
    self._stop_execution_gracefully()
  File "/usr/local/lib/python3.12/site-packages/robot/running/signalhandler.py", line 43, in _stop_execution_gracefully
    raise ExecutionFailed('Execution terminated by signal', exit=True)
robot.errors.ExecutionFailed: Execution terminated by signal
[ ERROR ] tcpreplay: execution failed (type)
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/scapy/sendrecv.py", line 549, in sendpfast
    stdout, stderr = cmd.communicate()
                     ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/subprocess.py", line 1209, in communicate
    stdout, stderr = self._communicate(input, endtime, timeout)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/subprocess.py", line 2108, in _communicate
    ready = selector.select(timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/selectors.py", line 415, in select
    fd_event_list = self._selector.poll(timeout)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/robot/running/signalhandler.py", line 40, in __call__
    self._stop_execution_gracefully()
  File "/usr/local/lib/python3.12/site-packages/robot/running/signalhandler.py", line 43, in _stop_execution_gracefully
    raise ExecutionFailed('Execution terminated by signal', exit=True)
robot.errors.ExecutionFailed: Execution terminated by signal
Perform load test                                                     | FAIL | load test                                                     .
Process 2 Failed
------------------------------------------------------------------------------
Loadtest                                                              | FAIL |
1 test, 0 passed, 1 failed
==============================================================================
kade-ddnkv commented 11 months ago

Example test logs. Without threads:

Perform load test                                                     
.Resulting pps: 166255.77
.Resulting mbps: 1465.71
.Resulting packets: 70000

With 2 threads:

Perform load test                                                     
.Resulting pps: 288043.65
.Resulting mbps: 2539.38
.Resulting packets: 140000

With 6 threads:

Perform load test                                                     
.Resulting pps: 375539.99
.Resulting mbps: 3310.7400000000002
.Resulting packets: 420000
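
To put the three runs side by side, here is a small sketch that computes the throughput speedup relative to the run without threads. It assumes the "without threads" run counts as one worker; the `scaling` helper and the `results` layout are hypothetical and exist only for this illustration. The numbers are copied from the logs above.

```python
# Figures copied from the logs above: worker count -> (pps, total packets).
# Assumption: the "without threads" run is treated as a single worker.
results = {
    1: (166255.77, 70000),
    2: (288043.65, 140000),
    6: (375539.99, 420000),
}

def scaling(results, baseline=1):
    """Return (workers, speedup, per-worker efficiency) rows, sorted by workers."""
    base_pps = results[baseline][0]
    rows = []
    for workers, (pps, _packets) in sorted(results.items()):
        speedup = pps / base_pps
        rows.append((workers, speedup, speedup / workers))
    return rows

for workers, speedup, efficiency in scaling(results):
    print(f"{workers} worker(s): speedup x{speedup:.2f}, efficiency {efficiency:.0%}")
```

On these figures the total pps grows by roughly 1.7x with 2 threads and roughly 2.3x with 6, so throughput keeps rising but each additional thread contributes less.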
kade-ddnkv commented 11 months ago

> For now, I am using TCPREPLAY_LIMIT=70000. If I increase it (e.g. to 700000), the test sometimes (not always) fails with the following error message:

That error is caused by the <timeout> parameter of the Wait Async All method. It sets an upper bound on how many seconds each thread may run. If a thread fails to finish within its allotted time, the Robot test fails with the error "Process N Failed". So <timeout> should be chosen with the machine's computational power in mind.
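
The behaviour can be illustrated with a minimal sketch in plain Python. It is not the Robot Framework implementation: `fake_replay` and `run_with_timeout` are hypothetical names, the workers only sleep instead of running tcpreplay, and a single shared deadline is used as a simplification of the per-thread bound described above.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait

def fake_replay(duration_s):
    """Hypothetical stand-in for one tcpreplay run; it only sleeps."""
    time.sleep(duration_s)
    return duration_s

def run_with_timeout(durations, timeout_s):
    """Start one worker per duration and wait at most timeout_s for all of them.

    Returns the 1-based indices of the workers that did not finish in time.
    """
    pool = ThreadPoolExecutor(max_workers=len(durations))
    futures = [pool.submit(fake_replay, d) for d in durations]
    _done, not_done = wait(futures, timeout=timeout_s)
    pool.shutdown(wait=False)  # do not block here on the stragglers
    return [i + 1 for i, f in enumerate(futures) if f in not_done]

# The second worker needs 0.5 s but only 0.2 s is allowed.
failed = run_with_timeout([0.05, 0.5], timeout_s=0.2)
for n in failed:
    print(f"Process {n} Failed")
```

Here only the second worker exceeds the deadline, so the script prints "Process 2 Failed". Raising `timeout_s` above the slowest worker's duration makes the list of failures empty, which mirrors why a larger TCPREPLAY_LIMIT needs a larger timeout on a slower machine.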

kade-ddnkv commented 11 months ago

Evidence that the processor cores are being used: (image)