namjaejeon / ksmbd

ksmbd kernel server(SMB/CIFS server)
https://github.com/cifsd-team/ksmbd

Positive RDMA / smb-direct „taskman test“ between Win-client and linux-Server? #488

Open besterino opened 1 week ago

besterino commented 1 week ago

Hi!

Since I am struggling and cannot get it to work: has anyone successfully tested RDMA/SMB Direct between a Linux ksmbd server and a Windows client, such that the Windows Task Manager on the client does not show a corresponding network load while copying large files (e.g. 30 GB) at much higher speeds of 1-2.5 GB/s? (Between Windows machines the displayed load is zero, or at least very low, in the KB/s range.)

If so, could you please give some guidance on hardware, distro, specific steps, settings etc. used?

I have also described my attempts in another thread, but I am willing to start from scratch if there is a setup or route that is more likely, or proven, to succeed:

https://github.com/namjaejeon/ksmbd/issues/466#issuecomment-2452961288

namjaejeon commented 1 week ago

Have you searched the issues on the cifsd-team GitHub for RDMA or smb-direct?

https://github.com/cifsd-team/ksmbd/issues/542 https://github.com/cifsd-team/ksmbd/issues/604 and more in the issues on the cifsd-team GitHub.

> If so, could you please give some guidance on hardware, distro, specific steps, settings etc. used?

You should add the "server multi channel support = yes" parameter to the [global] section of your ksmbd.conf and build ksmbd.ko with the CONFIG_SMB_SERVER_SMBDIRECT config option turned on. With smb-direct (RDMA), the server only responds to requests from the client, so please check your client settings or guide.
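As a minimal sketch of how the ksmbd.conf part of this advice could be double-checked, the Python snippet below looks up a parameter in the [global] section of a config text. The helper names `global_option` and `multichannel_enabled` are hypothetical, and the parsing rules (`#`/`;` comments, `key = value` lines, `yes` as the enabling value) are assumptions based on the line quoted above, not ksmbd's actual parser.

```python
from typing import Optional


def global_option(conf_text: str, name: str) -> Optional[str]:
    """Return the value of `name` in the [global] section, or None if absent.

    Assumption: the file uses "[section]" headers, "key = value" lines,
    and "#" or ";" for comments. This is a sketch, not ksmbd's own parser.
    """
    section = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip().lower()
            continue
        if section == "global" and "=" in line:
            key, value = line.split("=", 1)
            # Normalise inner whitespace so "server  multi channel" still matches.
            if " ".join(key.split()).lower() == name:
                return value.strip()
    return None


def multichannel_enabled(conf_text: str) -> bool:
    """True if "server multi channel support = yes" is set in [global]."""
    value = global_option(conf_text, "server multi channel support")
    return value is not None and value.lower() == "yes"
```

To use it, read your ksmbd.conf into a string and pass it to `multichannel_enabled`; the location of the file depends on how ksmbd-tools was installed.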

If it still doesn't work, please let me know.

besterino commented 1 week ago

Thank you for the input.

Yes, I had a look at various posts but still could not get it to work. Here is another one I found interesting, but it did not lead to success either: https://forum.level1techs.com/t/how-can-i-help-with-the-new-truenas-100g-testing/179052/8

As to my ksmbd.conf, it already has/had "server multi channel support = yes". My kernel was built with CONFIG_SMB_SERVER_SMBDIRECT enabled, at least according to /boot/config-6.11.0-9-generic.
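A check of the kernel config file like the one mentioned above can be sketched in Python as follows. The function name `kernel_option` is hypothetical; the only assumption about the file is the usual kernel config layout, where an enabled option appears as `CONFIG_X=y` or `CONFIG_X=m` and a disabled one as `# CONFIG_X is not set`.

```python
from typing import Optional


def kernel_option(config_text: str, option: str) -> Optional[str]:
    """Return the value of a kernel config option.

    Returns the text after "=" (typically "y" or "m"), "n" if the option is
    explicitly marked "is not set", or None if it is not mentioned at all.
    """
    for raw in config_text.splitlines():
        line = raw.strip()
        if line == f"# {option} is not set":
            return "n"
        # Matching on option + "=" keeps CONFIG_SMB_SERVER from matching
        # CONFIG_SMB_SERVER_SMBDIRECT by accident.
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return None
```

For the running kernel, the text would come from a file such as /boot/config-6.11.0-9-generic, and the option to query is `CONFIG_SMB_SERVER_SMBDIRECT`. Note that this only describes the distribution's in-tree build; a ksmbd.ko built separately from the out-of-tree sources has its own build configuration.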

On the Windows client I tracked RDMA activity with Perfmon. It apparently tries to establish RDMA connections, but they fail.

The only time I see any smb_direct messages in dmesg is immediately after start of the service:

[ +7.774154] ksmbd: selected SMB3_11 dialect idx = 3
[ +0.000009] ksmbd: selected SMB3_11 dialect idx = 3
[ +0.000179] ksmbd: smb_direct: ib device added: name rocep33s0f0
[ +0.000002] ksmbd: smb_direct: ib device added: name rocep33s0f1
[ +0.000354] ksmbd: smb_direct: init RDMA listener. cm_id=0000000084cd3fdd
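To make it easier to see at a glance which smb_direct messages appear, a small Python sketch like the following could summarise saved dmesg output. It assumes one kernel message per line and relies only on the message prefixes visible in the log above (`ksmbd: smb_direct:`, `ib device added: name`, `init RDMA listener`); the function name `summarize_smb_direct` is hypothetical.

```python
import re

# Patterns taken from the message texts quoted in this thread.
SMB_DIRECT = re.compile(r"ksmbd: smb_direct: (?P<msg>.*)")
DEVICE = re.compile(r"ib device added: name (?P<name>\S+)")


def summarize_smb_direct(dmesg_text: str) -> dict:
    """Collect smb_direct messages, registered IB devices, and listener state."""
    messages = []
    for line in dmesg_text.splitlines():
        match = SMB_DIRECT.search(line)
        if match:
            messages.append(match.group("msg").strip())

    devices = []
    for msg in messages:
        match = DEVICE.search(msg)
        if match:
            devices.append(match.group("name"))

    listener = any(msg.startswith("init RDMA listener") for msg in messages)
    return {"messages": messages, "devices": devices, "listener": listener}
```

If the summary shows devices and a listener but no further smb_direct messages after a client connects, that matches the situation described here: the server side is listening, yet no RDMA connection attempt is being logged.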

namjaejeon commented 1 week ago

> The only time I see any smb_direct messages in dmesg is immediately after start of the service:

Are there any error messages from "ksmbd: smb_direct:"? Is this message ("ksmbd: smb_direct: init RDMA listener. cm_id=0000000084cd3fdd") the last one?

besterino commented 2 days ago

Apologies for the late reply, I did not have enough time for testing recently. As to your question: yes, that is the last message with smb_direct.

I did another test of RDMA functionality between Windows and Linux by setting up NVMe over Fabrics following this guide: https://www.reddit.com/r/truenas/comments/1fh3rfl/an_idiots_walkthrough_to_setting_up_nvmeofroce/?rdt=60944

That works like a charm, including RDMA performance counters being triggered when accessing the NVMe-oF target (please excuse the German OS):

[Screenshot: NVMEoF_win_proxmox]

namjaejeon commented 1 day ago

@besterino Could you test ksmbd RDMA after applying the following change?

diff --git a/transport_rdma.c b/transport_rdma.c
index 29b2b43..d2ca328 100644
--- a/transport_rdma.c
+++ b/transport_rdma.c
@@ -2310,6 +2310,7 @@ out:
                }
        }

+       rdma_capable = true;
        return rdma_capable;
 }

besterino commented 21 hours ago

Apologies, not a coder here. How do I apply that change?
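A diff like the one above is normally applied from the top of the source tree with a standard tool such as `patch -p1` or `git apply`, after which the module has to be rebuilt and reloaded. For readers who would rather see what the change does, here is a minimal Python sketch that makes the same one-line edit to the text of transport_rdma.c. The function name `force_rdma_capable` is hypothetical, and the sketch assumes the first `return rdma_capable;` line in the file is the one shown in the diff. The change itself looks like a debugging aid: it makes the function always report the device as RDMA capable.

```python
def force_rdma_capable(source: str) -> str:
    """Insert "rdma_capable = true;" directly before "return rdma_capable;".

    Assumption: the first matching return statement is the one from the diff.
    Raises ValueError if the statement is not found, so nothing is changed
    silently in an unexpected file.
    """
    marker = "return rdma_capable;"
    lines = source.splitlines(keepends=True)
    for index, line in enumerate(lines):
        if line.strip() == marker:
            # Reuse the indentation of the return statement for the new line.
            indent = line[: len(line) - len(line.lstrip())]
            lines.insert(index, f"{indent}rdma_capable = true;\n")
            return "".join(lines)
    raise ValueError("'return rdma_capable;' not found in source")
```

In practice you would read transport_rdma.c into a string, pass it through this function, write the result back, and then rebuild ksmbd.ko. Keep a copy of the original file so the change can be reverted after testing.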

namjaejeon commented 18 hours ago

Sigh, okay. Can you capture packets using Wireshark? You need to capture while the Windows client connects to the ksmbd server. There is no need to capture while copying/reading files in the ksmbd share; that would make the dump file too large.