nasa / fprime

F´ - A flight software and embedded systems framework
https://fprime.jpl.nasa.gov
Apache License 2.0
9.95k stars 1.28k forks source link

Failures of TCP/UDP unit tests #2706

Open JohanBertrand opened 2 months ago

JohanBertrand commented 2 months ago
F´ Version 6bb780d
Affected Component

Problem Description

Similar to https://github.com/nasa/fprime/issues/1290, https://github.com/nasa/fprime/issues/881

Maybe also related to https://github.com/nasa/fprime/issues/2244

I am not sure if it is having an impact outside of the UT.

The tests of Drv/TcpServer and Drv/TcpClient are often failing, about 25% of the time (excluding silent failures).

It comes from several different errors:

[ RUN ] Nominal.BasicReceiveThread [WARNING] Failed to open port with status -9 and errno 98 [WARNING] Failed to open port with status -9 and errno 98 /home/fprime/Drv/TcpServer/test/ut/TcpServerTester.cpp:41: Failure Value of: Drv::Test::wait_on_started(this->component.getSocketHandler(), true, SOCKET_RETRY_INTERVAL_MS/10 + 1) Actual: false Expected: true /home/fprime/Drv/TcpServer/test/ut/TcpServerTester.cpp:51: Failure Value of: component.getSocketHandler().isStarted() Actual: false Expected: true [WARNING] Failed to open port with status -9 and errno 98 /home/fprime/Drv/TcpServer/test/ut/TcpServerTester.cpp:65: Failure Value of: Drv::Test::wait_on_change(this->component.getSocketHandler(), true, SOCKET_RETRY_INTERVAL_MS/10 + 1) Actual: false Expected: true /home/fprime/Drv/TcpServer/test/ut/TcpServerTester.cpp:67: Failure Value of: this->component.getSocketHandler().isOpened() Actual: false Expected: true /home/fprime/Drv/TcpServer/test/ut/TcpServerTester.cpp:70: Failure Expected equality of these values: status2 Which is: -4 Drv::SOCK_SUCCESS Which is: 0 /home/fprime/build-fprime-automatic-native-ut/F-Prime/Drv/TcpServer/TcpServerGTestBase.cpp:87: Failure Expected equality of these values: size Which is: 1 this->fromPortHistorySize_ready Which is: 0

/home/fprime/Drv/TcpServer/test/ut/TcpServerTester.cpp:105 Value: Size of history for ready Expected: 1 Actual: 0

[ FAILED ] Nominal.BasicReceiveThread (3001 ms) [----------] 2 tests from Nominal (3002 ms total)

- Sometimes, in TCP and UDP UT, the UT fail silently

TcpClient: [ RUN ] Reconnect.ReceiveThreadReconnect Listening for single client at 127.0.0.1:57291 Connected to 127.0.0.1:57291 as a tcp client Accepted client at 127.0.0.1:57291 Connected to 127.0.0.1:57291 as a tcp client Accepted client at 127.0.0.1:57291 Connected to 127.0.0.1:57291 as a tcp client Accepted client at 127.0.0.1:57291 Connected to 127.0.0.1:57291 as a tcp client Accepted client at 127.0.0.1:57291 [WARNING] Failed to recv from port with status -8 and errno 9 Connected to 127.0.0.1:57291 as a tcp client Accepted client at 127.0.0.1:57291 Connected to 127.0.0.1:57291 as a tcp client Accepted client at 127.0.0.1:57291 Connected to 127.0.0.1:57291 as a tcp client Accepted client at 127.0.0.1:57291 Connected to 127.0.0.1:57291 as a tcp client Accepted client at 127.0.0.1:57291 Connected to 127.0.0.1:57291 as a tcp client Accepted client at 127.0.0.1:57291 Connected to 127.0.0.1:57291 as a tcp client Accepted client at 127.0.0.1:57291 [ OK ] Reconnect.ReceiveThreadReconnect (103 ms)

TcpServer: [ RUN ] Nominal.BasicReceiveThread Listening for single client at 127.0.0.1:44010 Connected to 127.0.0.1:44010 as a tcp client Accepted client at 127.0.0.1:44010 [WARNING] Failed to recv from port with status -8 and errno 0 [ OK ] Nominal.BasicReceiveThread (20 ms)

TcpServer: [ RUN ] Nominal.BasicReceiveThread Listening for single client at 127.0.0.1:40343 Connected to 127.0.0.1:40343 as a tcp client Accepted client at 127.0.0.1:40343 [WARNING] Failed to open port with status -14 and errno 0 [ OK ] Nominal.BasicReceiveThread (1021 ms)

TcpServer: [ RUN ] Reconnect.ReceiveThreadReconnect Listening for single client at 127.0.0.1:63959 Connected to 127.0.0.1:63959 as a tcp client Accepted client at 127.0.0.1:63959 [WARNING] Failed to recv from port with status -8 and errno 0 Connected to 127.0.0.1:63959 as a tcp client Accepted client at 127.0.0.1:63959 [WARNING] Failed to recv from port with status -8 and errno 0 Connected to 127.0.0.1:63959 as a tcp client Accepted client at 127.0.0.1:63959 [WARNING] Failed to recv from port with status -8 and errno 0 Connected to 127.0.0.1:63959 as a tcp client Accepted client at 127.0.0.1:63959 [WARNING] Failed to recv from port with status -8 and errno 0 Connected to 127.0.0.1:63959 as a tcp client Accepted client at 127.0.0.1:63959 [WARNING] Failed to recv from port with status -8 and errno 0 Connected to 127.0.0.1:63959 as a tcp client Accepted client at 127.0.0.1:63959 [WARNING] Failed to recv from port with status -8 and errno 0 Connected to 127.0.0.1:63959 as a tcp client Accepted client at 127.0.0.1:63959 [WARNING] Failed to recv from port with status -8 and errno 0 Connected to 127.0.0.1:63959 as a tcp client Accepted client at 127.0.0.1:63959 [WARNING] Failed to recv from port with status -8 and errno 0 Connected to 127.0.0.1:63959 as a tcp client Accepted client at 127.0.0.1:63959 [WARNING] Failed to recv from port with status -8 and errno 0 Connected to 127.0.0.1:63959 as a tcp client Accepted client at 127.0.0.1:63959 [WARNING] Failed to recv from port with status -8 and errno 0 [ OK ] Reconnect.ReceiveThreadReconnect (113 ms)

Udp: [ RUN ] Reconnect.ReceiveThreadReconnect Setup to send and receive udp to 127.0.0.1:60059 Setup to send and receive udp to 127.0.0.1:37290 Setup to send and receive udp to 127.0.0.1:60059 Setup to send and receive udp to 127.0.0.1:37290 Setup to send and receive udp to 127.0.0.1:60059 Setup to send and receive udp to 127.0.0.1:37290 Setup to send and receive udp to 127.0.0.1:60059 Setup to send and receive udp to 127.0.0.1:37290 [WARNING] Failed to recv from port with status -8 and errno 0 Setup to send and receive udp to 127.0.0.1:60059 Setup to send and receive udp to 127.0.0.1:37290 Setup to send and receive udp to 127.0.0.1:60059 Setup to send and receive udp to 127.0.0.1:37290 Setup to send and receive udp to 127.0.0.1:60059 Setup to send and receive udp to 127.0.0.1:37290 Setup to send and receive udp to 127.0.0.1:60059 Setup to send and receive udp to 127.0.0.1:37290 Setup to send and receive udp to 127.0.0.1:60059 Setup to send and receive udp to 127.0.0.1:37290 Setup to send and receive udp to 127.0.0.1:60059 Setup to send and receive udp to 127.0.0.1:37290 [ OK ] Reconnect.ReceiveThreadReconnect (103 ms)

Udp: [ RUN ] Reconnect.ReceiveThreadReconnect Setup to send and receive udp to 127.0.0.1:24765 Setup to send and receive udp to 127.0.0.1:53978 Setup to send and receive udp to 127.0.0.1:24765 Setup to send and receive udp to 127.0.0.1:53978 Setup to send and receive udp to 127.0.0.1:24765 Setup to send and receive udp to 127.0.0.1:53978 Setup to send and receive udp to 127.0.0.1:24765 Setup to send and receive udp to 127.0.0.1:53978 [WARNING] Failed to recv from port with status -8 and errno 9 Setup to send and receive udp to 127.0.0.1:24765 Setup to send and receive udp to 127.0.0.1:53978 Setup to send and receive udp to 127.0.0.1:24765 Setup to send and receive udp to 127.0.0.1:53978 Setup to send and receive udp to 127.0.0.1:24765 Setup to send and receive udp to 127.0.0.1:53978 Setup to send and receive udp to 127.0.0.1:24765 Setup to send and receive udp to 127.0.0.1:53978 Setup to send and receive udp to 127.0.0.1:24765 Setup to send and receive udp to 127.0.0.1:53978 Setup to send and receive udp to 127.0.0.1:24765 Setup to send and receive udp to 127.0.0.1:53978 [ OK ] Reconnect.ReceiveThreadReconnect (102 ms)


## Context / Environment

Execute `fprime-util version-check` and share the output.

Operating System: Linux CPU Architecture: x86_64 Platform: Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.29 Python version: 3.8.10 CMake version: 3.28.1 Pip version: 20.0.2 Pip packages: fprime-tools==3.4.4 fprime-gds==3.4.3 fprime-fpp-*==2.1.0a9


Running fprime on `ubuntu:20.04` docker image.

Same behavior identified on a `ubuntu:22.04` docker image.

## How to Reproduce

1. Go to fprime root
2. Execute `fprime generate --ut`
3. Execute `fprime build --ut -j16`
4. Save the following python script:
```python
import subprocess
import shutil

FPRIME_PATH="/home/fprime"
LOG_PATH="/home/log_tcp_udp_error/"
ITERATIONS=100

calls = [
    FPRIME_PATH + "/build-fprime-automatic-native-ut/bin/Linux/Drv_TcpClient_ut_exe",
    FPRIME_PATH + "/build-fprime-automatic-native-ut/bin/Linux/Drv_TcpServer_ut_exe",
    FPRIME_PATH + "/build-fprime-automatic-native-ut/bin/Linux/Drv_Udp_ut_exe",
]

for call in calls:
    executable = call.split('/')[-1]
    print(executable)
    for ii in range(0, ITERATIONS):
        print(f"Test number {ii:04}")
        try:
            output = subprocess.check_output([call], cwd=FPRIME_PATH, stderr=subprocess.PIPE)
            string_output = output.decode()
            if "[WARNING] Failed" in string_output:
                print("Silent failure")
                f = open(LOG_PATH + f"silent_{executable}_{ii:04}.log", "x")
                f.write(string_output)
                f.close()
        except subprocess.CalledProcessError as grepexc:
            print("Error code:", grepexc.returncode)
            f = open(LOG_PATH + f"{executable}_{ii:04}.log", "x")
            f.write(grepexc.output.decode())
            f.close()
  1. Tune the values FPRIME_PATH to the current fprime folder and LOG_PATH to an empty folder to save log of unit tests. ITERATIONS can also be changed to increase the amount of run of the unit tests
  2. Execute the python script
  3. Check the log in the LOG_PATH folder

Expected Behavior

No error should be generated. The unit tests should also fail on warning

JohanBertrand commented 1 month ago

https://github.com/nasa/fprime/pull/2739 fixed the non-silent errors. The silent warnings are still here