Closed Gene-Lo closed 4 years ago
This looks like a user configuration error, based on the error messages below:
2020-09-20 12:29:12,685 process L0604 INFO | Running 'timeout 2m rping -c -C300 -a fe80::526b:4b03:f:daa8%enP1p9s0f0 -v'
2020-09-20 12:29:14,732 process L0416 DEBUG| [stderr] cma event RDMA_CM_EVENT_ADDR_ERROR, error -110
An IPv6 address needs to be configured on both test machines; without it, this test will fail. Running ping6 between the machines is one way to confirm that IPv6 is working before using rping.
Please make sure IPv6 is configured. If the test still fails, please share the manual runs so that it is easier to tell whether this is really an automation script issue, a user configuration error, or a real issue.
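The ping6 check suggested above could be scripted as a pre-flight step before rping is started. The following is only a minimal sketch, not part of rping.py: the helper names `split_scoped_address`, `build_ping6_cmd`, and `peer_reachable` are hypothetical, and it assumes `ping6` is on the PATH and accepts the `address%interface` form for link-local addresses.

```python
import ipaddress
import subprocess


def split_scoped_address(addr):
    """Split 'fe80::1%eth0' into ('fe80::1', 'eth0'); zone is None if absent."""
    host, _, zone = addr.partition("%")
    ipaddress.IPv6Address(host)  # raises ValueError if host is not valid IPv6
    return host, (zone or None)


def build_ping6_cmd(addr, count=3):
    """Build the ping6 argument list (hypothetical helper, not from rping.py)."""
    host, zone = split_scoped_address(addr)
    # A link-local address is only meaningful together with an interface (zone).
    if ipaddress.IPv6Address(host).is_link_local and zone is None:
        raise ValueError("link-local address needs a %interface suffix")
    return ["ping6", "-c", str(count), addr]


def peer_reachable(addr, count=3):
    """Return True if ping6 exits with status 0 for the given address."""
    result = subprocess.run(
        build_ping6_cmd(addr, count),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

Calling `peer_reachable("fe80::526b:4b03:f:daa8%enP1p9s0f0")` on the client before the rping run would show whether the address from the log is reachable at all. A `False` result points at IPv6 configuration rather than at the test script.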
After running rping.py on RHEL 8.2 with Avocado (version 82.0), the test reports "FAIL: Client command failed".
【Test Step】
Step 1. Prepare 2 terminals, each equipped with 1 Network-Card, then connect the 2 Network-Cards to each other.
※Network-Card: Mellanox_2-PORT 10Gb NIC&ROCE ConnectX-4Lx SR/Cu PCIe 3.0 LP CAPABLE ADAPTER
Step 2. Edit yaml file: /root/tests/tests/avocado-misc-tests/io/net/infiniband/rping.py.data/rping_infiniband.yaml
Step 3. Run rping.py via cmd: avocado run rping.py -m rping.py.data/rping_infiniband.yaml
【Test log】: job log & Manual-Test-log Test log.zip
【Section of job.log】
《INIT 5-rping.py:Rping.test》
2020-09-20 12:29:12,685 process L0604 INFO | Running 'timeout 2m rping -c -C300 -a fe80::526b:4b03:f:daa8%enP1p9s0f0 -v'
2020-09-20 12:29:14,732 process L0416 DEBUG| [stderr] cma event RDMA_CM_EVENT_ADDR_ERROR, error -110
2020-09-20 12:29:14,735 process L0686 INFO | Command 'timeout 2m rping -c -C300 -a fe80::526b:4b03:f:daa8%enP1p9s0f0 -v' finished with 255 after 2.0463709831237793s
2020-09-20 12:29:14,735 stacktrace L0039 ERROR|
2020-09-20 12:29:14,735 stacktrace L0042 ERROR| Reproduced traceback from: /usr/local/lib/python3.6/site-packages/avocado_framework-82.0-py3.6.egg/avocado/core/test.py:767
2020-09-20 12:29:14,737 stacktrace L0045 ERROR| Traceback (most recent call last):
2020-09-20 12:29:14,737 stacktrace L0045 ERROR| File "/root/tests/tests/avocado-misc-tests/io/net/infiniband/rping.py", line 155, in test
2020-09-20 12:29:14,737 stacktrace L0045 ERROR| self.fail("Client command failed")
2020-09-20 12:29:14,737 stacktrace L0045 ERROR| File "/usr/local/lib/python3.6/site-packages/avocado_framework-82.0-py3.6.egg/avocado/core/test.py", line 953, in fail
2020-09-20 12:29:14,737 stacktrace L0045 ERROR| raise exceptions.TestFail(message)
2020-09-20 12:29:14,737 stacktrace L0045 ERROR| avocado.core.exceptions.TestFail: Client command failed
2020-09-20 12:29:14,737 stacktrace L0046 ERROR|
2020-09-20 12:29:14,737 test L0772 DEBUG| Local variables:
2020-09-20 12:29:14,772 test L0775 DEBUG| -> output <class 'avocado.utils.process.CmdResult'>: command: "/usr/bin/ssh -o 'StrictHostKeyChecking=no' -o 'UpdateHostKeys=no' -o 'ControlPath=~/.ssh/avocado-master-%r@%h:%p' -l root -q 192.168.10.2 'timeout 2m rping -s -a ::0 > /tmp/ib_log 2>&1 &'" exit_status: 0 duration: 0.03451800346374512 interrupted: False pid: 1267535 encoding: 'UTF-8' stdout: b'' stderr: b''
2020-09-20 12:29:14,773 test L0775 DEBUG| -> cmd <class 'str'>: timeout 2m rping -c -C300 -a fe80::526b:4b03:f:daa8%enP1p9s0f0 -v
2020-09-20 12:29:14,773 test L0775 DEBUG| -> logs <class 'str'>: > /tmp/ib_log 2>&1 &
2020-09-20 12:29:14,773 test L0775 DEBUG| -> self <class 'rping.Rping'>: 5-rping.py:Rping.test;run-Parameters-Test-options-third-mtu-1500-791e
《INIT 6-rping.py:Rping.test》
2020-09-20 12:32:38,104 process L0604 INFO | Running 'timeout 2m rping -c -C300 -a fe80::526b:4b03:f:daa8%enP1p9s0f0 -v'
2020-09-20 12:32:40,171 process L0416 DEBUG| [stderr] cma event RDMA_CM_EVENT_ADDR_ERROR, error -110
2020-09-20 12:32:40,173 process L0686 INFO | Command 'timeout 2m rping -c -C300 -a fe80::526b:4b03:f:daa8%enP1p9s0f0 -v' finished with 255 after 2.065110921859741s
2020-09-20 12:32:40,176 stacktrace L0039 ERROR|
2020-09-20 12:32:40,176 stacktrace L0042 ERROR| Reproduced traceback from: /usr/local/lib/python3.6/site-packages/avocado_framework-82.0-py3.6.egg/avocado/core/test.py:767
2020-09-20 12:32:40,177 stacktrace L0045 ERROR| Traceback (most recent call last):
2020-09-20 12:32:40,177 stacktrace L0045 ERROR| File "/root/tests/tests/avocado-misc-tests/io/net/infiniband/rping.py", line 155, in test
2020-09-20 12:32:40,177 stacktrace L0045 ERROR| self.fail("Client command failed")
2020-09-20 12:32:40,177 stacktrace L0045 ERROR| File "/usr/local/lib/python3.6/site-packages/avocado_framework-82.0-py3.6.egg/avocado/core/test.py", line 953, in fail
2020-09-20 12:32:40,177 stacktrace L0045 ERROR| raise exceptions.TestFail(message)
2020-09-20 12:32:40,177 stacktrace L0045 ERROR| avocado.core.exceptions.TestFail: Client command failed
2020-09-20 12:32:40,177 stacktrace L0046 ERROR|
2020-09-20 12:32:40,177 test L0772 DEBUG| Local variables:
2020-09-20 12:32:40,212 test L0775 DEBUG| -> output <class 'avocado.utils.process.CmdResult'>: command: "/usr/bin/ssh -o 'StrictHostKeyChecking=no' -o 'UpdateHostKeys=no' -o 'ControlPath=~/.ssh/avocado-master-%r@%h:%p' -l root -q 192.168.10.2 'timeout 2m rping -s -a ::0 > /tmp/ib_log 2>&1 &'" exit_status: 0 duration: 16.007258415222168 interrupted: False pid: 1276281 encoding: 'UTF-8' stdout: b'' stderr: b''
2020-09-20 12:32:40,213 test L0775 DEBUG| -> cmd <class 'str'>: timeout 2m rping -c -C300 -a fe80::526b:4b03:f:daa8%enP1p9s0f0 -v
2020-09-20 12:32:40,213 test L0775 DEBUG| -> logs <class 'str'>: > /tmp/ib_log 2>&1 &
2020-09-20 12:32:40,213 test L0775 DEBUG| -> self <class 'rping.Rping'>: 6-rping.py:Rping.test;run-Parameters-Test-options-third-mtu-2000-1659
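Both job.log excerpts fail at the same point: rping reports `RDMA_CM_EVENT_ADDR_ERROR` with `error -110`. When comparing several runs, it can help to pull these events out of the log mechanically. The sketch below is an illustration only. The regular expression is inferred from the two log lines above, `find_cma_errors` is a hypothetical helper name, and it assumes the reported value is a negated Linux errno (on Linux, errno 110 is ETIMEDOUT, which would fit address resolution timing out).

```python
import errno
import re

# Pattern inferred from the job.log lines above; other rping messages may differ.
CMA_EVENT_RE = re.compile(
    r"cma event (?P<event>RDMA_CM_EVENT_\w+), error (?P<code>-?\d+)"
)


def find_cma_errors(log_text):
    """Return a list of (event name, error code, errno symbol) tuples."""
    hits = []
    for match in CMA_EVENT_RE.finditer(log_text):
        code = int(match.group("code"))
        # Assumption: the code is a negated errno; the symbol lookup is
        # platform dependent, so fall back to "unknown" if it is not defined.
        symbol = errno.errorcode.get(abs(code), "unknown")
        hits.append((match.group("event"), code, symbol))
    return hits
```

Feeding the whole job.log text to `find_cma_errors` gives one tuple per failing rping invocation, which makes it easy to confirm that every failed variant (for example the mtu-1500 and mtu-2000 runs above) hit the same address-resolution error.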
【Configuration】
《SUT6》
[Rhel8.2 Kernel] 4.18.0-193.14.3.el8_2.ppc64le
[FW config]
BMC: op940.00.mih-5-0-g86f9791c2
PNOR: OP9-v2.4-4.37-prod
[HW config]
CPU DD2.3 20core 2
Micron (MTA18ASF2G72PZ-2G9E1) 16G 16
Samsung PM985 960GB 1
PSU ACBEL 2000w 2
Slot1: Network2 - Mellanox 2-PORT EDR 100Gb IB CONNECTX-5 GEN4 PCIe x16 CAPI CAPABLE LP ADAPTER
Slot2: Network7 - Marvell 2-PORT E'NET (2X10 10Gb), PCIe Gen 2 X8/SHORT LP CAPABLE (SHINER 10GBase-T)
Slot3: Network5 - Mellanox 2-PORT 10Gb NIC&ROCE ConnectX-4Lx SR/Cu PCIe 3.0 LP CAPABLE ADAPTER
Slot4: Network6 - Mellanox 2-PORT 25/10Gb NIC&ROCE SR/Cu PCIe 3.0 (25/10Gb EVERGLADES EN)
Slot5: Network10 - Broadcom 5719 QP 1G (1G/100M/10M) Network Interface Card PCIe x4 LP
Slot6: Network3 - Mellanox 2-PORT 100Gb ROCE EN CONNECTX-5 GEN4 PCIe x16 LP CAPABLE ADAPTER
Slot7: Network9 - Marvell QUAD E'NET (2X1 + 2X10 10Gb), PCIe Gen 2 X8/SHORT LP CAPABLE (SHINER SFP+ SR COPPER)
《SUT8》
[Rhel8.2 Kernel] 4.18.0-193.19.1.el8_2.ppc64le
[FW config]
BMC: op940.00.mih-5-0-g86f9791c2
PNOR: OP9-v2.4-4.37-prod
[HW config]
CPU DD2.3 12core 2
Micron (MTA18ASF2G72PZ-2G9E1) 16G 16
Samsung PM985 960GB 1
PSU ACBEL 2000w 2
Slot1: Network2 - Mellanox 2-PORT EDR 100Gb IB CONNECTX-5 GEN4 PCIe x16 CAPI CAPABLE LP ADAPTER
Slot2: Network7 - Marvell 2-PORT E'NET (2X10 10Gb), PCIe Gen 2 X8/SHORT LP CAPABLE (SHINER 10GBase-T)
Slot3: Network5 - Mellanox 2-PORT 10Gb NIC&ROCE ConnectX-4Lx SR/Cu PCIe 3.0 LP CAPABLE ADAPTER
Slot4: Network6 - Mellanox 2-PORT 25/10Gb NIC&ROCE SR/Cu PCIe 3.0 (25/10Gb EVERGLADES EN)
Slot5: Network10 - Broadcom 5719 QP 1G (1G/100M/10M) Network Interface Card PCIe x4 LP
Slot6: Network3 - Mellanox 2-PORT 100Gb ROCE EN CONNECTX-5 GEN4 PCIe x16 LP CAPABLE ADAPTER
Slot7: Network9 - Marvell QUAD E'NET (2X1 + 2X10 10Gb), PCIe Gen 2 X8/SHORT LP CAPABLE (SHINER SFP+ SR COPPER)