Closed: farnoy closed this 10 months ago
I captured traffic on the network; 192.168.100.1 is the host running ksmbd. It is initiating the disconnect, which causes a couple-second stall visible as the File Explorer copy stalling. After that, the client sends a Tree Disconnect followed by more Read Requests. Given enough time, the file does get transferred, and as far as I can tell it never falls back to traditional IP traffic; it keeps attempting RDMA until it breaks again.
Something is off with the timestamps in this capture, though; I don't think it's possible for the Tree Connect Response to arrive in the same microsecond the request was sent.
What kernel version did you use? When I test uploading/downloading large files, there is no problem. And how big is a large file? I have tested with a 10GB file.
6.4.12. I've tested uploads with small and large files (5MB-20GB) and they work consistently well. For downloads, small files (<10MB) seem to complete instantly, within the initial burst and before the connection breaks off. Anything larger, like a 200MB file I tested with, will disconnect in this way and the download will stall until it gets reconnected and continued from there.
smbd max io size (G)
Maximum read/write size of SMB-Direct. Number suffixes are allowed.
Default: smbd max io size = 8MB
Can you check after decreasing smbd max io size to 1M or 512K?
[global]
smbd max io size = 512K
or
smbd max io size = 1M
I tried 512K, but it did not change anything. Uploads still work fine; downloads still get interrupted and stall.
But I did try something else. My ConnectX-3 adapter on the Windows client side is bottlenecked by being plugged into a PCIe x4 slot: the PCIe link tops out at ~26Gbps, below what is needed to saturate the 40GbE link. To test whether that has an impact, I downgraded the link to 10GbE, and both downloads and uploads then worked reliably.
Could the PCI bottleneck be affecting the behavior in some way?
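For reference, the ~26Gbps figure is roughly consistent with a PCIe 3.0 x4 link (an assumption about the slot's generation; ConnectX-3 is a PCIe 3.0 device). A quick back-of-the-envelope check:

```shell
# Theoretical line rate of a PCIe 3.0 x4 link:
# 4 lanes x 8 GT/s x 128b/130b encoding ~= 31.5 Gbps.
# TLP/DLLP protocol overhead and DMA inefficiencies plausibly
# bring the usable rate down to the observed ~26 Gbps.
awk 'BEGIN { printf "%.1f\n", 4 * 8 * 128 / 130 }'   # prints 31.5
```

Either way, the slot cannot sustain the full 40GbE rate, so the server can push data faster than the client NIC can drain it over PCIe.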
In the meantime, I'll try to add a second link between the nodes, run both at 10GbE, and see if I can get multichannel and RDMA working together to reach at least 2GB/s transfers.
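For the multichannel experiment, my understanding is that ksmbd also needs it enabled explicitly on the server side. A sketch of the relevant ksmbd.conf setting, assuming a reasonably recent ksmbd-tools (the kernel must also be built with multichannel support):

```
[global]
	server multi channel support = yes
```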
> Could the PCI bottleneck be affecting the behavior in some way?
I don't know how to handle this in the SMB protocol (i.e. ksmbd). It seems the Linux RDMA driver should handle it. Basically, I think the hardware needs to be configured according to the specifications...
I'll look into it; I'm probably missing some form of congestion control on this RoCEv1 network. It makes sense that the issue shows up when a faster producer sends more than the consumer can handle, while the other direction works fine.
Thanks for the help so far!
OK, I believe I fixed it by configuring global flow control on both sides. That is, `ethtool -A $dev rx on tx on` on Linux, and this adapter property on Windows, which is probably Mellanox-specific: `Get-NetAdapterAdvancedProperty -Name "Ethernet 7" -RegistryKeyword "*FlowControl"`.
I have no idea how this works, because I don't see any pause frames or other congestion notifications; maybe this happens deeper in the IB stack.
Thanks again, ksmbd seems to work flawlessly and all the issues I've had so far were caused by something else.
I'm using ConnectX-3s between a Linux ksmbd host and a Windows 10 client. Uploading large files to a share works well, with typical dmesg output:
But when I try to download something from the share, it shows up as this kind of output in dmesg:
And windows reports this as:
I believe I've set the MTUs properly between the hosts, and RDMA does seem to work in at least one direction: all the symptoms, like the perfmon counters, confirm it's working. I'm new to RDMA, so I could easily have made a mistake in the setup. One thing that stood out to me is the disparity between the send and receive sizes, but I don't know if it's relevant:
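On the Linux side, one way to confirm RDMA traffic is actually flowing in both directions is to read the InfiniBand port counters from sysfs. The device name `mlx4_0` is an assumption (list yours under `/sys/class/infiniband`), and note that `port_xmit_data`/`port_rcv_data` are reported in 4-octet words, not bytes:

```shell
dev=mlx4_0   # assumed device name; check with: ls /sys/class/infiniband
port=1

# port_{xmit,rcv}_data count 4-octet words, so scale by 4 to get bytes.
words_to_bytes() {
    echo $(( $1 * 4 ))
}

for c in port_xmit_data port_rcv_data; do
    f="/sys/class/infiniband/$dev/ports/$port/counters/$c"
    if [ -r "$f" ]; then
        echo "$c: $(words_to_bytes "$(cat "$f")") bytes"
    fi
done
```

Sampling these before and after a transfer shows whether the bytes actually moved over RDMA or fell back to the regular TCP path.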