zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0

tests/bluetooth/bsim/host/att/eatt/tests_scripts/reconfigure.sh is too sensitive to random seed #55912

Closed aescolar closed 1 year ago

aescolar commented 1 year ago

Describe the bug: tests/bluetooth/bsim/host/att/eatt/tests_scripts/reconfigure.sh is too sensitive to the simulated device's random seed. It passes or fails depending on either the random seeds or how many random draws the simulated device makes.

Even though the test currently passes in CI, changes in completely unrelated parts of the code can and will cause it to fail, leaving developers very confused.
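A minimal sketch of why the number of random draws matters. It uses a toy linear congruential generator written in POSIX shell as a stand-in for a seeded RNG; this is an assumption made for illustration and is not the generator used by the simulated device. The names rng_seed and rng_draw are hypothetical.

```sh
#!/bin/sh
# Toy seeded generator (illustrative only, not the simulator's RNG).
rng_state=0
rng_seed() { rng_state=$1; }
rng_draw() {
  # Classic LCG step; the result is left in the global "draw".
  rng_state=$(( (rng_state * 1103515245 + 12345) % 2147483648 ))
  draw=$rng_state
}

# Value the test observes when nothing else consumes random numbers.
rng_seed 18
rng_draw
baseline=$draw

# Same seed again: the stream is reproducible.
rng_seed 18
rng_draw
replay=$draw

# Same seed, but an unrelated component draws once before the test does.
rng_seed 18
rng_draw    # unrelated draw elsewhere in the code
rng_draw    # the draw the test actually observes
shifted=$draw

echo "baseline=$baseline replay=$replay shifted=$shifted"
```

With a fixed seed the stream is reproducible, but a single extra draw anywhere earlier shifts every later value, which is the same effect as changing the seed from the test's point of view.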

To Reproduce:

  1. Apply this patch (these random seeds cause a failure):

    diff --git a/tests/bluetooth/bsim/host/att/eatt/tests_scripts/reconfigure.sh b/tests/bluetooth/bsim/host/att/eatt/tests_scripts/reconfigure.sh
    index c927000c04..e8be043b4b 100755
    --- a/tests/bluetooth/bsim/host/att/eatt/tests_scripts/reconfigure.sh
    +++ b/tests/bluetooth/bsim/host/att/eatt/tests_scripts/reconfigure.sh
    @@ -23,10 +23,10 @@ BOARD="${BOARD:-nrf52_bsim}"
     cd ${BSIM_OUT_PATH}/bin
     
     Execute ./bs_${BOARD}_tests_bluetooth_bsim_host_att_eatt_prj_autoconnect_conf \
    -  -v=${verbosity_level} -s=${simulation_id} -d=0 -testid=central_reconfigure
    +  -v=${verbosity_level} -s=${simulation_id} -d=0 -testid=central_reconfigure -rs=18
     
     Execute ./bs_${BOARD}_tests_bluetooth_bsim_host_att_eatt_prj_autoconnect_conf \
    -  -v=${verbosity_level} -s=${simulation_id} -d=1 -testid=peripheral_reconfigure
    +  -v=${verbosity_level} -s=${simulation_id} -d=1 -testid=peripheral_reconfigure -rs=13
     
     Execute ./bs_2G4_phy_v1 -v=${verbosity_level} -s=${simulation_id} \
       -D=2 -sim_length=60e6 $@
  2. WORK_DIR=${ZEPHYR_BASE}/bsim_out tests/bluetooth/bsim/host/compile.sh
  3. tests/bluetooth/bsim/host/att/eatt/tests_scripts/reconfigure.sh
  4. See test failure
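To find out how many seed pairs are affected, the steps above could be wrapped in a sweep. The following is a rough sketch only: run_case is a hypothetical hook that would have to launch the test with the given central and peripheral seeds (for example by passing them through to -rs=) and return non-zero on failure. Here it is replaced by a stub that merely mimics the one failing pair from this report, so the output is not a measurement.

```sh
#!/bin/sh
# Hypothetical hook: $1 = central seed, $2 = peripheral seed.
# Stub for illustration; a real version would run the simulation.
run_case() {
  [ "$1" -eq 18 ] && [ "$2" -eq 13 ] && return 1
  return 0
}

# Try every seed pair in [first, last] and report the failing ones.
sweep_seeds() {
  first=$1
  last=$2
  failures=""
  c=$first
  while [ "$c" -le "$last" ]; do
    p=$first
    while [ "$p" -le "$last" ]; do
      if ! run_case "$c" "$p"; then
        failures="$failures $c/$p"
      fi
      p=$((p + 1))
    done
    c=$((c + 1))
  done
  echo "failing seed pairs:${failures:- none}"
}

sweep_seeds 12 19
```

A sweep like this would also be a way to check a fix: the test should pass for every pair in the range, not just for the default seeds.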

Expected behavior: The test case should not pass or fail depending on unrelated changes to random values.

Impact: CI failures when changing unrelated code that uses random number generation. I hit this issue while making completely unrelated changes in another area of the code.

Logs and console output

d_01: @00:00:00.000000  *** Booting Zephyr OS build v3.3.0-rc3-1321-g64143fc88153 ***
d_00: @00:00:00.000000  *** Booting Zephyr OS build v3.3.0-rc3-1321-g64143fc88153 ***
d_00: @00:00:00.000000  [00:00:00.000,000] <inf> bt_hci_core: hci_vs_init: HW Platform: Nordic Semiconductor (0x0002)
d_00: @00:00:00.000000  [00:00:00.000,000] <inf> bt_hci_core: hci_vs_init: HW Variant: nRF52x (0x0002)
d_00: @00:00:00.000000  [00:00:00.000,000] <inf> bt_hci_core: hci_vs_init: Firmware: Standard Bluetooth controller (0x00) Version 3.3 Build 99
d_00: @00:00:00.000000  [00:00:00.000,000] <wrn> bt_id: bt_read_static_addr: No static addresses stored in controller
d_01: @00:00:00.000000  [00:00:00.000,000] <inf> bt_hci_core: hci_vs_init: HW Platform: Nordic Semiconductor (0x0002)
d_01: @00:00:00.000000  [00:00:00.000,000] <inf> bt_hci_core: hci_vs_init: HW Variant: nRF52x (0x0002)
d_01: @00:00:00.000000  [00:00:00.000,000] <inf> bt_hci_core: hci_vs_init: Firmware: Standard Bluetooth controller (0x00) Version 3.3 Build 99
d_01: @00:00:00.000000  [00:00:00.000,000] <wrn> bt_id: bt_read_static_addr: No static addresses stored in controller
d_01: @00:00:00.002648  [00:00:00.002,624] <inf> bt_hci_core: bt_dev_show_info: Identity: ED:32:B0:00:F5:5B (random)
d_01: @00:00:00.002648  [00:00:00.002,624] <inf> bt_hci_core: bt_dev_show_info: HCI: version 5.4 (0x0d) revision 0x0000, manufacturer 0x05f1
d_01: @00:00:00.002648  [00:00:00.002,624] <inf> bt_hci_core: bt_dev_show_info: LMP: version 5.4 (0x0d) subver 0xffff
d_00: @00:00:00.002648  [00:00:00.002,624] <inf> bt_hci_core: bt_dev_show_info: Identity: C4:93:6D:0A:7A:AF (random)
d_00: @00:00:00.002648  [00:00:00.002,624] <inf> bt_hci_core: bt_dev_show_info: HCI: version 5.4 (0x0d) revision 0x0000, manufacturer 0x05f1
d_00: @00:00:00.002648  [00:00:00.002,624] <inf> bt_hci_core: bt_dev_show_info: LMP: version 5.4 (0x0d) subver 0xffff
d_00: @00:00:00.213495  Device connected
d_01: @00:00:00.324100  Connected: C4:93:6D:0A:7A:AF (random)
d_00: @00:00:00.324097  Connected: ED:32:B0:00:F5:5B (random)
d_00: @00:01:00.000000 ERROR: (ZEPHYR_BASE/tests/bluetooth/bsim/host/att/eatt/src/common.c:102): Test eatt finished.
d_00: @00:01:00.000000  main: The TESTCASE FAILED with return code 2
d_01: @00:00:59.975577  main: The TESTCASE FAILED with return code 1


aescolar commented 1 year ago

CC @jori-nordic

jori-nordic commented 1 year ago

@hermabe could you check it out? From a quick first pass, the problem seems to be bt_eatt_reconfigure being called at the same time as the collision mitigation fires on the peripheral.

edit: the sequence of events that I can see is:

aescolar commented 1 year ago

@jori-nordic has taken it