Open disgoel opened 6 months ago
What is the kernel version that triggered this failure (uname -a)?
uname -r 5.14.0-402.el9.ppc64le
That's not the version number of an upstream kernel. Please report issues encountered with Red Hat kernels to Red Hat.
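As a rough illustration of the distinction drawn above, the sketch below guesses from a release string whether a kernel is a distribution build. The function name is_distro_kernel and the suffix patterns (.elN for RHEL-style, .fcN for Fedora-style) are assumptions made for this example; other distributions use other suffixes, so treat it as a heuristic only.

```shell
#!/bin/bash
# Heuristic sketch: guess whether a kernel release string comes from a
# distribution build. The suffix patterns (.elN, .fcN) are assumptions that
# cover only RHEL-style and Fedora-style release names.
is_distro_kernel() {
	local release=$1
	case "$release" in
	*.el[0-9]*|*.fc[0-9]*) return 0 ;;
	*) return 1 ;;
	esac
}

# Usage: classify the running kernel from the output of `uname -r`.
if is_distro_kernel "$(uname -r)"; then
	echo "distribution kernel: report to the distribution vendor"
else
	echo "no known distribution suffix found"
fi
```

Both release strings quoted in this thread (5.14.0-402.el9.ppc64le and 6.8.0-0.rc3.26.fc40.ppc64le) match these patterns.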
It's interesting to see the blktests results on the PowerPC architecture :)
Unfortunately, the kernel version 5.14.0 that Red Hat chose for RHEL9 is not an LTS kernel, so debugging effort on that kernel will not be productive. I'm interested in whether the failure is still observed with the latest v6.7 kernel (or LTS kernels) on the PowerPC system. @disgoel , is it possible to build the v6.7 kernel and install it on the system? If the srp tests still fail, then there are unknown issues. If the srp tests pass, then the old kernel is the issue.
I tried kernel v5.14.21 on Fedora 39 on my test system and observed that all test cases in the srp group passed with the SIW driver. So the reported failures could be unique to PowerPC, but this is just a guess at the moment.
There is no delete sysfs attribute under /sys/class/srp_remote_ports/port-0\:1/ on ppc64le, and the module scsi_transport_srp is in use by ibmvscsi, so it cannot be removed (see [1]). We can use the workaround [2] to fix it.
[1]
# uname -r
6.8.0-0.rc3.26.fc40.ppc64le
# ls /sys/class/srp_remote_ports/port-0\:1/
device port_id power roles subsystem uevent
# lsmod | grep scsi_transport_srp
scsi_transport_srp 262144 1 ibmvscsi
# modinfo ibmvscsi
filename: /lib/modules/6.8.0-0.rc3.26.fc40.ppc64le/kernel/drivers/scsi/ibmvscsi/ibmvscsi.ko.xz
version: 1.5.9
license: GPL
author: Dave Boutcher
description: IBM Virtual SCSI
rhelversion: 9.99
srcversion: C68F72BE86AEC8C2E06395A
alias: vio:TvscsiSIBM,v-scsi*
depends: scsi_transport_srp
intree: Y
name: ibmvscsi
vermagic: 6.8.0-0.rc3.26.fc40.ppc64le SMP mod_unload patchable-function-entry relocatable
sig_id: PKCS#7
signer: Fedora kernel signing key
sig_key: 62:1F:EA:38:E0:AD:BC:8A:52:4A:27:EC:8E:35:C1:2F:55:43:96:2A
sig_hashalgo: sha256
signature: 31:87:D9:A2:E8:C6:70:FD:AD:57:E9:97:BE:E9:F5:11:19:B6:D5:D1:
7A:60:04:46:48:9B:15:C1:A1:11:6F:AE:F9:4E:F9:51:6B:3A:F4:47:
DD:26:A8:46:22:84:25:73:62:FA:1C:2E:4D:5D:04:10:9E:81:E9:F5:
5E:0A:15:A8:D5:37:0F:8A:0E:0C:00:AC:61:FF:33:61:A5:9A:86:59:
C3:01:48:97:13:51:B2:14:6E:0B:87:8F:B1:FC:AF:8F:A4:FA:1B:B0:
8F:33:05:A4:BD:B1:1D:95:5A:07:1A:8D:53:D0:6D:30:35:99:77:44:
73:58:CD:38:43:20:1F:2B:B2:42:4F:67:50:25:2C:FA:0E:FC:98:64:
DF:46:67:DB:98:F2:7D:8D:F3:F1:A9:F4:AC:BB:4E:DB:D1:EB:A4:0E:
6F:66:6E:7A:8D:66:02:99:26:9E:07:84:09:AB:D7:0F:05:FE:75:A5:
4D:D1:1D:F1:0E:C5:8B:C7:48:FF:BE:B0:C3:02:82:00:50:DD:6C:AC:
83:F5:44:97:29:7E:28:23:AE:A0:45:7B:B8:0F:AB:90:95:60:F9:01:
2F:2B:CB:BB:65:AD:45:55:8E:9B:AD:39:50:73:5F:79:E3:9D:0B:2D:
96:FE:E3:F4:5E:B1:C1:5B:DA:3E:AF:40:94:4E:14:51:AA:8F:BF:6D:
30:23:23:DD:70:CB:7C:3B:A0:26:66:DF:51:EB:3D:C0:FF:BD:D8:B8:
4C:2A:EC:E7:82:01:BD:22:5C:1E:57:5D:1C:F7:FD:8B:BD:01:0E:7D:
8A:1F:74:9A:C5:FA:78:79:FA:80:38:5E:5D:6F:0A:75:E7:47:BD:C3:
3C:9C:9C:D0:72:AC:5C:C1:29:D8:98:0F:F0:8A:7A:FB:76:3F:C1:72:
C1:0D:C4:ED:97:B1:83:88:AE:BA:3E:9E:D8:C5:0C:3D:12:FE:21:3E:
93:6C:83:13:59:D9:E9:25:72:6D:F7:0C:59:73:7D:B7:4E:3B:9F:73:
94:22:2B:D5:6C:B7:32:08:54:AB:C9:57:2A:C6:8D:6A:88:71:94:9B:
A3:9B:A6:E7:6D:27:B0:BD:D9:6B:60:F3:AE:3A:CF:BE:EF:CF:39:64:
87:06:9D:85:95:24:A3:0E:66:59:36:42:1D:2E:17:11:A4:5E:E9:0F:
17:BF:2D:62:E5:F5:EA:7A:15:3B:A2:16:FF:37:DA:B1:DF:FB:47:8E:
6A:07:5F:46:9A:AD:60:C3:07:0D:0C:5D:76:65:E2:BC:CA:61:24:20:
B9:7B:68:2F:14:FF:B0:EA:79:4C:09:80:EE:69:04:45:84:3C:88:53:
8E:15:B9:E8:29:7D:FC:95:60:4C:68:31
parm: max_id:Largest ID value for each channel [Default=64] (int)
parm: max_channel:Largest channel value [Default=3] (int)
parm: init_timeout:Initialization timeout in seconds (int)
parm: max_requests:Maximum requests for this adapter (int)
parm: fast_fail:Enable fast fail. [Default=1] (int)
parm: client_reserve:Attempt client managed reserve/release (int)
[2] https://github.com/yizhanglinux/blktests/commit/651a9d9174630ac87492c97e89c1d57d5474cedd
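The two observations in the comment above (no delete attribute, and scsi_transport_srp held by ibmvscsi) can be checked with a short script. This is a minimal sketch and not the content of the linked workaround, which is not reproduced here. The helper names port_has_delete and module_holders are hypothetical. The sysfs root is a parameter so the helpers can be pointed at a test directory.

```shell
#!/bin/bash
# Sketch only: report whether an SRP remote port exposes a delete attribute
# and which modules still hold scsi_transport_srp. port_has_delete and
# module_holders are hypothetical helpers, not blktests functions.
port_has_delete() {
	local port_dir=$1
	[ -e "$port_dir/delete" ]
}

module_holders() {
	# /sys/module/<name>/holders lists the modules that depend on <name>.
	local name=$1 sysfs_root=${2:-/sys}
	ls "$sysfs_root/module/$name/holders" 2>/dev/null
}

for port in /sys/class/srp_remote_ports/port-*; do
	# With no matching ports the glob stays literal, so skip it.
	[ -e "$port" ] || continue
	if port_has_delete "$port"; then
		echo "$port: delete attribute present"
	else
		echo "$port: no delete attribute, skipping"
	fi
done
echo "holders of scsi_transport_srp: $(module_holders scsi_transport_srp | tr '\n' ' ')"
```

On the system shown in [1], the loop would be expected to report port-0:1 as having no delete attribute, and the holders line would be expected to name ibmvscsi.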
Thanks for the fix, Yi Zhang. I ran the srp tests after applying your patch, but the tests still failed with the error below.
# ./check srp/001
srp/001 (Create and remove LUNs) [failed]
runtime 4.785s ... 4.818s
--- tests/srp/001.out 2024-03-07 16:49:16.170133366 +0530
+++ /home/blktests/results/nodev/srp/001.out.bad 2024-03-08 16:53:55.160852461 +0530
@@ -1,3 +1,3 @@
+common/multipath-over-rdma: line 411: bonding_masters/addr_len: Not a directory
Configured SRP target driver
-count_luns(): 3 <> 3
-Passed
+SRP login failed
# cat /home/blktests/results/nodev/srp/001.out.bad
common/multipath-over-rdma: line 411: bonding_masters/addr_len: Not a directory
Configured SRP target driver
SRP login failed
Please also add this patch: https://github.com/yizhanglinux/blktests/commit/55b0193300d9e5777514d84fb908bca5e43066ba
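For context on the "bonding_masters/addr_len: Not a directory" message: when the bonding driver is loaded, /sys/class/net/bonding_masters is a regular file rather than an interface directory, so a loop over /sys/class/net/* that reads addr_len from every entry fails on it. The sketch below shows one way to skip such entries. It assumes the error comes from a loop of that kind, and it is not taken from the linked patch. list_net_ifaces is a hypothetical helper.

```shell
#!/bin/bash
# Sketch only: list network interfaces under a sysfs net directory while
# skipping entries that are not directories, such as the bonding_masters
# file created by the bonding driver. list_net_ifaces is a hypothetical
# helper, not a blktests function.
list_net_ifaces() {
	local net_dir=${1:-/sys/class/net} entry
	for entry in "$net_dir"/*; do
		# Interface entries are directories (or symlinks to directories).
		[ -d "$entry" ] || continue
		# Only report entries that actually expose addr_len.
		[ -e "$entry/addr_len" ] || continue
		echo "${entry##*/}"
	done
}

list_net_ifaces
```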
I get this with both the patches applied.
# ./check srp/001
srp/001 (Create and remove LUNs) [failed]
runtime 4.678s ... 4.777s
--- tests/srp/001.out 2024-03-07 16:49:16.170133366 +0530
+++ /home/blktests/results/nodev/srp/001.out.bad 2024-03-08 17:09:22.183667901 +0530
@@ -1,3 +1,2 @@
Configured SRP target driver
-count_luns(): 3 <> 3
-Passed
+SRP login failed
# cat /home/blktests/results/nodev/srp/001.out.bad
Configured SRP target driver
SRP login failed
The srp/* tests fail on Power with the error below.
./check tests/srp/001
srp/001 (Create and remove LUNs) [failed]
runtime 0.125s ... 0.132s
--- tests/srp/001.out 2023-12-25 09:40:30.000000000 +0530
+++ /home/blktests-master/results/nodev/srp/001.out.bad 2024-01-01 15:26:32.804759400 +0530
@@ -1,3 +1,4 @@
-Configured SRP target driver
-count_luns(): 3 <> 3
-Passed
+tests/srp/rc: line 263: /sys/class/srp_remote_ports/port-0:1/delete: Permission denied
+tests/srp/rc: line 263: /sys/class/srp_remote_ports/port-0:1/delete: Permission denied
+modprobe: FATAL: Module scsi_transport_srp is in use.
+failed to shutdown client
tests/srp/rc: line 263: /sys/class/srp_remote_ports/port-0:1/delete: Permission denied
tests/srp/rc: line 263: /sys/class/srp_remote_ports/port-0:1/delete: Permission denied
modprobe: FATAL: Module scsi_transport_srp is in use.
cat /home/blktests/results/nodev/srp/001.out.bad
tests/srp/rc: line 263: /sys/class/srp_remote_ports/port-0:1/delete: Permission denied
tests/srp/rc: line 263: /sys/class/srp_remote_ports/port-0:1/delete: Permission denied
modprobe: FATAL: Module scsi_transport_srp is in use.
failed to shutdown client