BTW, I tried another server and could not reproduce it there, but on the original server it reproduces within 100 runs.
@yizhanglinux Thanks for the reports. I will take a look at the nbd/001 and nbd/002 failures later. Right now I'm tied up with other failures and reviews...
I took a look at this issue and #141. Both failures show the error message "Socket failed: Connection refused", which I think means ECONNREFUSED. I grepped for it in the kernel code: it is not used in the nbd driver, only in the network subsystems. So I guess the error means the nbd-server socket is, for some reason, not yet ready when nbd-client connects.
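To illustrate the guess above, here is a minimal Python sketch (not part of blktests) showing that a TCP connect to a port nobody is listening on yet fails with ECONNREFUSED. The helper name `connect_errno` is hypothetical, and the demo assumes Linux-like behavior for a bound socket that has not called listen().

```python
import errno
import socket


def connect_errno(host: str, port: int) -> int:
    """Return 0 if a TCP connect succeeds, otherwise the errno of the failure."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        # connect_ex() returns an error code instead of raising an exception.
        return s.connect_ex((host, port))


if __name__ == "__main__":
    # Stand-in for a server that has bound its port but not called listen() yet.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    port = server.getsockname()[1]

    early = connect_errno("127.0.0.1", port)
    print("before listen():", errno.errorcode.get(early, early))

    server.listen(1)
    late = connect_errno("127.0.0.1", port)
    print("after listen():", errno.errorcode.get(late, late))
    server.close()
```

The same race would apply to any client that starts before the server is listening, which is consistent with the failure being intermittent and machine-dependent.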
I guess the fix should be in the test case: wait until the nbd-server socket is ready before nbd-client connects. I have created a trial fix patch on the nbd branch of my blktests repo.
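The idea of waiting for the server socket can be sketched as a simple polling loop. This is only an illustration in Python under stated assumptions, not the actual patch: blktests itself is written in shell and the real fix may detect readiness differently. The function name `wait_for_port` and its parameters are hypothetical.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 5.0,
                  interval: float = 0.1) -> bool:
    """Poll until a TCP connect to (host, port) succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means the server is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # ConnectionRefusedError (ECONNREFUSED) is a subclass of OSError.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A test would call something like `wait_for_port("127.0.0.1", port)` after starting the server and only start the client once it returns True. Note that a probe connection like this is itself seen by the server, so a check that does not open a connection may be preferable in practice.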
@yizhanglinux Can you still recreate the failures on your test machines? If so, could you apply my trial fix patch and see whether it avoids them?
Yes, I tried the patch on the server where the failure reproduces, and could no longer reproduce the nbd/001 and nbd/002 failures in more than 2000 runs.
======================================2322
nbd/001 (resize a connected nbd device) [passed]
runtime 1.300s ... 1.330s
nbd/002 (tests on partition handling for an nbd device) [passed]
runtime 1.666s ... 1.967s
@yizhanglinux Thank you for the confirmation! I posted the fix to the relevant lists for review.
The modified fix 0c3dfdd is now in the master branch. Let me close this case.
CKI recently reported an nbd/002 failure on 6.10.0-rc4; it can also be reproduced on 6.9.
=================================15
nbd/002 (tests on partition handling for an nbd device) [failed]
runtime 1.507s ... 4.495s
--- tests/nbd/002.out 2024-06-18 21:37:53.351495157 -0400
+++ /root/blktests/results/nodev/nbd/002.out.bad 2024-06-18 22:08:20.656824752 -0400
@@ -1,4 +1,3 @@
 Running nbd/002
 Testing IOCTL path
-Testing the netlink path
-Test complete
+Didn't have partition on ioctl path
[root@dell-r640-053 blktests]# cat results/nodev/nbd/002.full
Error: Socket failed: Connection refused
stat: cannot statx '/dev/nbd0p1': No such file or directory
Negotiation: ..size = 10240MB bs=512, sz=10737418240 bytes
stat: cannot statx '/dev/nbd0p1': No such file or directory
stat: cannot statx '/dev/nbd0p1': No such file or directory
stat: cannot statx '/dev/nbd0p1': No such file or directory
disconnect, sock, done
[root@dell-r640-053 blktests]# cat results/nodev/nbd/002.out.bad
Running nbd/002
Testing IOCTL path
Didn't have partition on ioctl path
dmesg:
[ 741.052034] run blktests nbd/002 at 2024-06-18 22:08:16
[ 741.585612] nbd0: detected capacity change from 0 to 20971520
[ 745.603328] block nbd0: NBD_DISCONNECT
[ 745.603363] block nbd0: Disconnected due to user request.
[ 745.603366] block nbd0: shutting down sockets