cifsd-team / ksmbd

ksmbd kernel server(SMB/CIFS server)

Wondering about the status of SMB Direct with Windows clients #543

Open dz-cies opened 2 years ago

dz-cies commented 2 years ago

Hi, I've been trying to test SMB Direct with Windows clients (with Mellanox ConnectX-6 InfiniBand adapters) recently but have had no luck. I'm able to build ksmbd and mount a share on Windows Server 2016 clients, but RDMA does not seem to be enabled. I'm wondering whether this feature is already implemented. If it is, what configuration is needed to enable it (`server multi channel support = yes` and anything else?)

The currently available information looks a bit confusing. In the mail "[PATCH v8 00/13] ksmbd: introduce new SMB3 kernel server", SMB Direct is described as "Partially Supported. SMB3 Multi-channel is required to connect to Windows client", and SMB3 Multi-channel is also "partially supported". In the same mail, it reads 'SMB Direct is only currently possible with ksmbd (among Linux servers)'. So I guess Windows clients are not possible yet. However, in Readme.md, SMB Direct (RDMA) and Multi-channel are listed under Features implemented. And there seem to be successful cases already reported in other issues (#538 #529).

I would appreciate it if you would clarify this and share any progress about this feature.

dz-cies commented 2 years ago

Hi @hcbwiz, I read your issues (#538 and #533) and see you have done some work on testing SMB Direct with Windows clients. Would it be convenient for you to share more information with me? I didn't find any contact information on your homepage. If you would like to talk, here's my mail address: zw.xie@high-flyer.cn. Thank you.

KristijanL commented 2 years ago

I am interested in cross-platform RDMA, so please keep it public as I would like to participate in the discussion. Thanks.

namjaejeon commented 2 years ago

Let me know if you would like to participate in developing ksmbd's SMB Direct support for Windows clients. If you just want to know the status of SMB Direct development, we can only say that you need to wait a bit longer.

hclee commented 2 years ago

I would appreciate it if you would clarify this and share any progress about this feature.

Windows clients are not supported yet. I have tested SMB Direct only with the Linux kernel's cifs filesystem. I have succeeded with interim patches, but I need time to complete the work because of other tasks.

KristijanL commented 2 years ago

I can help test ksmbd SMB Direct with Windows 10 / 11 clients on Mellanox ConnectX-4 Lx / ConnectX-5 network adapters if needed.

namjaejeon commented 2 years ago

@KristijanL Sounds good. We will ask you to test once it is complete.

hcbwiz commented 2 years ago

Hi, You can try this: ksmbd-next-rdma

It is based on the ksmbd-next branch with some patches from @hclee: dma-latest-v0.51 rdma-latest-v0.1,

and a dirty fix for #538: fix force-shutdown.

I used a Mellanox ConnectX-5 (RoCEv2) with the Linux built-in mlx5 driver. The mlx5 driver needs a modification:

--- drivers/infiniband/hw/mlx5/main.c.orig      2021-08-30 06:04:50.000000000 +0800
+++ drivers/infiniband/hw/mlx5/main.c   2021-11-11 01:59:52.607921507 +0800
@@ -171,8 +171,10 @@
                if (ibdev->is_rep)
                        break;
                write_lock(&roce->netdev_lock);
-               if (ndev->dev.parent == mdev->device)
+               if (ndev->dev.parent == mdev->device) {
                        roce->netdev = ndev;
+                       ib_device_set_netdev(&ibdev->ib_dev, ndev, 1);
+               }
                write_unlock(&roce->netdev_lock);
                break;

I tested the above with Windows Server 2016 as the SMB Direct client, and it worked well.

FYI, according to Implementing_SMBDirect_for_CIFS, Windows uses port 445 for RoCE and port 5445 for iWARP.

hcbwiz commented 2 years ago

@dz-cies

For multi-channel, I used RSS

For IB mode, it should work (I have tried it). If you use the Mellanox driver directly, you can forcibly make ksmbd_rdma_capable_netdev() return true without modifying the driver, as sketched below.
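
A minimal sketch of that forced-return workaround (an illustration only, based on the function name used later in this thread, not an actual ksmbd patch):

    /* Illustration: make ksmbd's RDMA capability probe report every
     * interface as RDMA capable, bypassing the ib_device_get_by_netdev()
     * lookup that fails with an unpatched mlx5 driver. */
    bool ksmbd_rdma_capable_netdev(struct net_device *netdev)
    {
            return true;
    }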

my smb.conf

[global]
        netbios name = SMBD
#       workgroup = SKITTLES
#       interfaces = ibs4
#       bind interfaces only = Yes
#       ipc timeout = 20
        #map to guest = Bad User
        server multi channel support = yes

[mnt]
        comment = content server share
        path = /mnt
        browseable = yes
        read only = no
        writeable = yes
        create mask = 0777
        directory mask = 0777
        force user= root
        force group = root

For SMB Direct, do Windows clients require the CIFS server to have "multi-channel" enabled? I haven't dug into it much.

dz-cies commented 2 years ago

@namjaejeon @hclee Thank you for the update. It's good news to hear about the interim patches. I'm willing to do any testing of this feature with ConnectX-6 adapters if needed.

@hcbwiz Thank you for sharing the information in detail. I will try this and get back to you. By the way, I still want to ask whether it is possible to get in touch with you personally. I have worked for a company in Hangzhou building its own HPC since last year. We are looking for experts on InfiniBand, storage, and related areas. Maybe we can have a talk on WeChat.

wqlxx commented 2 years ago

@hcbwiz I have tried what you said. I compiled kernel 5.16.0-rc1 and ksmbd-next-rdma with gcc 9.3.0. I tested it with Windows 10 as the SMB Direct client, running a disk I/O benchmark with CrystalDiskMark 8.0.4, and I got a kernel panic.


[ 1460.890421] Modules linked in: ksmbd(OE) libdes rfkill sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm ext4 irqbypass crct10dif_pclmul crc32_pclmul vfat ghash_clmulni_intel fat mbcache jbd2 iTCO_wdt aesni_intel mlx5_ib crypto_simd iTCO_vendor_support cryptd rapl ib_uverbs mei_me i2c_i801 lpc_ich intel_cstate mei joydev pcspkr sg input_leds i2c_smbus acpi_ipmi mfd_core ipmi_si wmi ipmi_devintf ipmi_msghandler acpi_pad acpi_power_meter ip_tables xfs libcrc32c sd_mod mlx5_core ast drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm_ttm_helper ttm drm igb ahci libahci libata nvme mlxfw nvme_core crc32c_intel pci_hyperv_intf ptp dca i2c_algo_bit pps_core t10_pi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ksmbd]
[ 1460.890533] CPU: 5 PID: 5244 Comm: kworker/5:26 Tainted: G S         OE     5.16.0-rc1 #8
[ 1460.890540] Hardware name: Ruijie Networks Co., Ltd. RG-RCD16000Pro-3D/Z10PG-D24 Series, BIOS 3305 08/09/2018
[ 1460.890543] Workqueue: ksmbd-io handle_ksmbd_work [ksmbd]
[ 1460.890610] RIP: 0010:smb2_session_logoff+0xd7/0xe0 [ksmbd]
[ 1460.890635] Code: 00 00 00 48 8b 7b 08 e8 07 63 ff ff 48 c7 43 08 00 00 00 00 49 8b 04 24 c7 40 40 04 00 00 00 31 c0 5b 41 5c 41 5d 41 5e 5d c3 <0f> 0b e9 70 ff ff ff 66 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56
[ 1460.890640] RSP: 0018:ffffab934b45fde0 EFLAGS: 00010206
[ 1460.890645] RAX: 0000000044000000 RBX: ffff971f8da9b200 RCX: ffff9714313d14e0
[ 1460.890649] RDX: ffff971f854fe800 RSI: 0000000000000002 RDI: ffff97143c0d3400
[ 1460.890653] RBP: ffffab934b45fe00 R08: 0000000000000001 R09: ffff971f854fe800
[ 1460.890656] R10: ffff97140d1c2d3c R11: 0000000000000018 R12: ffff97143c0d3400
[ 1460.890660] R13: ffff971404a96000 R14: ffff971f854fe800 R15: ffffffffc0664b70
[ 1460.890663] FS:  0000000000000000(0000) GS:ffff971f4f740000(0000) knlGS:0000000000000000
[ 1460.890666] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1460.890670] CR2: 00007f85b1ee02a0 CR3: 00000016b860a004 CR4: 00000000001706e0
[ 1460.890674] Call Trace:
[ 1460.890678]  <TASK>
[ 1460.890683]  __handle_ksmbd_work+0x12c/0x380 [ksmbd]
[ 1460.890706]  handle_ksmbd_work+0x2d/0x50 [ksmbd]
[ 1460.890727]  process_one_work+0x1ba/0x390
[ 1460.890739]  worker_thread+0x50/0x3a0
[ 1460.890746]  kthread+0x17a/0x1a0
[ 1460.890758]  ? process_one_work+0x390/0x390
[ 1460.890762]  ? set_kthread_struct+0x40/0x40
[ 1460.890770]  ret_from_fork+0x22/0x30
[ 1460.890783]  </TASK>
[ 1460.890785] ---[ end trace 9af9766e1447b385 ]---

Can you give me some suggestions?

hclee commented 2 years ago

@wqlxx Does this oops happen even with TCP transport?

wqlxx commented 2 years ago

@hclee I tested CrystalDiskMark without RDMA. There is no oops on the Linux server, but the Windows 10 SMB client needs to log in again. ksmbd_error.png

hcbwiz commented 2 years ago

@wqlxx According to ms-smbd, only these products support SMB Direct:

Windows 10 v1511 Enterprise operating system (client role only) 
Windows 10 v1607 Educational operating system (client role only)

Also, I added a module parameter; please make sure you load the module with 'rdma=1'.
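
As a reference, a minimal sketch of how such an opt-in parameter is usually declared in kernel module code (the variable name and permissions here are assumptions based on the 'rdma=1' usage above, not the actual patch):

    /* Sketch only: an opt-in "rdma" module parameter, off by default. */
    #include <linux/module.h>
    #include <linux/moduleparam.h>

    static bool rdma;
    module_param(rdma, bool, 0444);   /* readable via /sys/module/<module>/parameters/rdma */
    MODULE_PARM_DESC(rdma, "Enable SMB Direct (RDMA) transport");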

It seems that you tested it with "multi-channel". Anyway, you can check the connection status by using the command: Get-SmbMultichannelConnection

(image)

hcbwiz commented 2 years ago

@wqlxx The upstream has new updates. I have synced these updates as well; you can check them here: ksmbd-rdma. Note: just cherry-picked and compiled, so there may be errors.

wqlxx commented 2 years ago

@hcbwiz Thank you so much for your help. I will try removing the multichannel option and using the new ksmbd updates.

wqlxx commented 2 years ago

@hcbwiz I used the ksmbd-next-rdma branch (commit ID 6650f9df53e545bf977c6f755ce64c7a07aa209c) and found that this branch does SMB Direct (I can't see any Ethernet traffic in the Windows Task Manager). task_manager_without_data_flow.png But when I check out the ksmbd-rdma branch (commit ID ef10fc816f2fceb375ee09054717611cb127b916), which is the latest, I can see Ethernet traffic in the Windows Task Manager.

I am now very confused: why does the old version support SMB Direct while the new one does not?

hcbwiz commented 2 years ago

@wqlxx About using the ksmbd-next-rdma branch: do you mean RDMA can work if you disable multi-channel?

If you enable both multi-channel and RDMA, does your Windows client choose multi-channel?

wqlxx commented 2 years ago

@hcbwiz

@wqlxx About using the ksmbd-next-rdma branch: do you mean RDMA can work if you disable multi-channel?

Using the ksmbd-next-rdma branch, RDMA (SMB Direct) works whether multi-channel is enabled or disabled. With multi-channel enabled, I get the kernel oops, just like https://github.com/cifsd-team/ksmbd/issues/543#issuecomment-971316254

If you enable both multi-channel and RDMA, does your Windows client choose multi-channel?

I don't need multi-channel for now, so I did not enable this feature in Windows.

hcbwiz commented 2 years ago

Interesting... In my environment I use Windows Server 2016. Multi-channel (single port with the RSS feature) and RDMA cannot work together.

When you enable both features and connect to your ksmbd server, what status do you get from the command Get-SmbMultichannelConnection?

hcbwiz commented 2 years ago

@wqlxx Can you help check with only multi-channel enabled? Does it still trigger the kernel oops?

Note: just load module without "rdma=1"

consp commented 2 years ago

Just as a validation: forcing rdma_capable = true; on a CX-4 VPI card works just fine. The problem is detecting the ibdev, as mentioned by @hcbwiz, which can be fixed by modifying the mlx5 driver to include the ib_device_set_netdev call.

(image) Two quick tests on the same SSD array: iSCSI and SMB Direct with RDMA perform about the same; ignore the slower write in this image. It also shows up nicely in the performance counters. The client is Windows 10 Pro for Workstations build 19042. The server is Debian with a 5.14 kernel, @hcbwiz's ksmbd-rdma branch, and a patch to force rdma_capable = true; due to the mlx5 driver issues.

RSS should have been enabled, but so far no oops. I'll keep testing to see whether it keeps working or I find any errors; so far it looks good.

When you enable both functions and just connect to your ksmbd server, what is the status by using the command?

I only get RDMA on Windows as a client, never RSS, if RDMA is enabled and working. As a test I enabled it in smbd by forcing it to advertise both RSS and RDMA as capabilities on the server. Windows does show both with Get-SmbMultichannelConnection in that case; it then tries RDMA, fails because it is not available, and switches to RSS, showing only RSS as true. With ksmbd it only shows RDMA.

hcbwiz commented 2 years ago

@consp Does CrystalDiskMark support bypassing buffering (direct I/O)?

Currently, ksmbd doesn't support SMB2 FILE_NO_INTERMEDIATE_BUFFERING, so all I/O still goes through the Linux page cache.

It seems the Linux vfs_* APIs don't support direct I/O in a straightforward way.

KristijanL commented 2 years ago

When forcing ksmbd to use RDMA I get a kernel panic, using: commit 6650f9df53e545bf977c6f755ce64c7a07aa209c (HEAD -> ksmbd-next-rdma, origin/ksmbd-next-rdma)

Built-in Linux mlx5 driver. System info:

root@truenas:~# cat /etc/debian_version
11.0
root@truenas:~# uname -a
Linux truenas 5.10.70+truenas #1 SMP Wed Nov 3 18:30:34 UTC 2021 x86_64 GNU/Linux
root@truenas:~# lspci | grep Mell
04:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
04:00.1 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]

patched ksmbd_rdma_capable_netdev

        ibdev = ib_device_get_by_netdev(netdev, RDMA_DRIVER_UNKNOWN);
        if (ibdev) {
                // if (rdma_frwr_is_supported(&ibdev->attrs))

                rdma_capable = true;
                ib_device_put(ibdev);
        }

Without the patch I only got RSS to work.

Using a Windows 10 client:

Windows 10 Pro for Workstations
21H1

wqlxx commented 2 years ago

@hcbwiz

Interesting... In my environment I use Windows Server 2016. Multi-channel (single port with the RSS feature) and RDMA cannot work together.

When you enable both features and connect to your ksmbd server, what status do you get from the command Get-SmbMultichannelConnection?

multi_channel-rdma_1.png

@wqlxx Can you help check with only multi-channel enabled? Does it still trigger the kernel oops?

Note: just load module without "rdma=1"

Without "rdma=1" and with only multi-channel enabled, there is no oops. multi_channel-rdma_0.png

dz-cies commented 2 years ago

@hcbwiz I used the ksmbd-next-rdma branch (commit ID 6650f9d) and found that this branch does SMB Direct (I can't see any Ethernet traffic in the Windows Task Manager). But when I check out the ksmbd-rdma branch (commit ID ef10fc8), which is the latest, I can see Ethernet traffic in the Windows Task Manager.

Same issue: the latest branch doesn't work. The established connection shows RSS Capable True and RDMA Capable False.

hcbwiz commented 2 years ago

I will set up my testing equipment and update you later.

consp commented 2 years ago

@consp Does CrystalDiskMark support bypassing buffering (direct I/O)?

As far as I know, not with SMB shares, since you can only run it on them as a user, not as administrator. It was the easiest thing I had available for testing, but moving large files and many loose files definitely works and is in range of the CrystalDiskMark results.

Edit: with diskspd and caching disabled, the results are as follows, so pretty good. Write:

diskspd  -d60 -c16G -t16 -o32 -b32k -L -Sh -w100  y:\test.dat
RDMA Disabled, RSS Enabled
thread |       bytes     |     I/Os     |    MiB/s   |  I/O per s |  AvgLat  | LatStdDev |  file
total:       62435786752 |      1905389 |     992.07 |   31746.36 |   16.127 |     3.402
RDMA Enabled, RSS Disabled
-----------------------------------------------------------------------------------------------------
total:       94372298752 |      2880014 |    1499.79 |   47993.16 |   10.667 |     4.874

read (which maxes the io speed due to sufficiently large zfs arc):

diskspd  -d60 -c16G -t16 -o32 -b32k -L -Sh -w0  y:\test.dat
RSS Enabled, RDMA Disabled:
-----------------------------------------------------------------------------------------------------
total:      122677985280 |      3743835 |    1949.68 |   62389.82 |    8.205 |     5.607
RSS Disabled, RDMA Enabled:
total:      134674677760 |      4109945 |    2140.41 |   68492.98 |    7.475 |     2.907

So far no issues with crashes/oopses etc. Surprisingly stable. Sometimes Windows only opens one connection and the RDMA speed/IOPS are halved; I'm not sure why, though.

Thus all I/O still goes through the Linux page cache.

The backing filesystem is (Open)ZFS, which, if I'm not mistaken, completely bypasses the normal Linux page cache.

consp commented 2 years ago

Same issue: the latest branch doesn't work. The established connection shows RSS Capable True and RDMA Capable False.

Is that before or after starting a file transfer? Initially I get RDMA advertised even if it fails to connect, and it shows RSS enabled, RDMA disabled afterwards.

The Windows Event Viewer might show more information on whether RDMA was advertised by ksmbd, under Applications and Services Logs -> Microsoft -> Windows -> SMBClient -> Connectivity. Maybe there is a better way of getting this info, but I'm not that experienced with Windows.

hcbwiz commented 2 years ago

@consp Thanks for your testing.

There are two kinds of "buffering":

                Windows app --> FS operation --> CIFS  ---->  ksmbd --> VFS layer --> backend storage

If the Windows app uses FILE_FLAG_NO_BUFFERING, its I/O bypasses the Windows FS buffering, and SMB2 FILE_NO_INTERMEDIATE_BUFFERING is set on the request.

If ksmbd could handle FILE_NO_INTERMEDIATE_BUFFERING, it could also bypass the FS buffering and write directly to the backend storage.
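
As an illustration of what handling that option might look like on the server side, here is a hypothetical sketch that maps FILE_NO_INTERMEDIATE_BUFFERING to O_DIRECT when opening the backing file; this is not current ksmbd code, and the helper name is made up:

    /* Hypothetical: honor SMB2 FILE_NO_INTERMEDIATE_BUFFERING by opening the
     * backing file with O_DIRECT so I/O bypasses the Linux page cache
     * (subject to the usual O_DIRECT alignment restrictions). */
    #include <linux/fs.h>
    #include <linux/types.h>

    #define SMB2_FILE_NO_INTERMEDIATE_BUFFERING 0x00000008

    static struct file *open_backing_file(const char *path, u32 create_options)
    {
            int flags = O_RDWR | O_LARGEFILE;

            if (create_options & SMB2_FILE_NO_INTERMEDIATE_BUFFERING)
                    flags |= O_DIRECT;

            return filp_open(path, flags, 0);
    }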

Actually, I used an NVMe RAID with XFS; it has almost 20 GB/s throughput in local tests with "fio direct=1". However, over SMB Direct (ksmbd with RDMA) I only got 7.x GB/s throughput when testing large file writes. I investigated the issue and found the bottleneck: the VFS page cache layer. Note: my ksmbd server only has 16 GB of DRAM.

I just checked the (Open)ZFS code. Hmm... it has its own cache mechanism (ARC?). I'm interested in the efficiency of the ZFS cache mechanism. :)

dz-cies commented 2 years ago

Same issue: the latest branch doesn't work. The established connection shows RSS Capable True and RDMA Capable False.

Is that before or after starting a file transfer? Initially I get RDMA advertised even if it fails to connect, and it shows RSS enabled, RDMA disabled afterwards.

The Windows Event Viewer might show more information on whether RDMA was advertised by ksmbd, under Applications and Services Logs -> Microsoft -> Windows -> SMBClient -> Connectivity. Maybe there is a better way of getting this info, but I'm not that experienced with Windows.

The information from the Event Viewer looks clear. My environment is a ConnectX-6 adapter, Windows Server 2016, and MLNX_WinOF2 2.70.51000.

Microsoft-Windows-SMB Client/Connectivity
There is an RDMA interface available, but the client cannot connect to the server via RDMA transmission.
Both the client and the server have RDMA (SMB Direct) adapters, but there is a problem with the connection, so the client has to fall back to TCP/IP SMB (non-RDMA).

consp commented 2 years ago

Microsoft-Windows-SMB Client/Connectivity
There is an RDMA interface available, but the client cannot connect to the server via RDMA transmission.
Both the client and the server have RDMA (SMB Direct) adapters, but there is a problem with the connection, so the client has to fall back to TCP/IP SMB (non-RDMA).

Are all the required UDP ports open? Maybe something is blocked somewhere along the network path. I'm using a direct connection, so there are no added hops. Have you tried testing the RDMA connection with something like rping and nd_rping from Windows to Linux?

Then I investigated the issue, and found the bottleneck: vfs page cache layer. Note: my ksmbd server only has 16G dram.

Sounds probable. My system is about 5 times faster on reads and 3 times faster on writes with single-thread fio than what is possible through ksmbd, but I wasn't expecting any wonders. Still good performance.

The biggest difference I'm seeing with most people in this thread is that I'm running a 5.14 kernel and most are on 5.10 or 5.16. Could that be an issue for those who do not get it to work?

dz-cies commented 2 years ago

Are all the required UDP ports open? Maybe something is blocked somewhere along the network path. I'm using a direct connection, so there are no added hops.

I don't think this is a firewall issue because the old branch works. Anyway, I disabled the firewall on both sides and the result is the same.

Have you tried testing the RDMA connection with something like rping and nd_rping from Windows to Linux?

rping works in both directions.

The biggest difference I'm seeing with most people in this thread is that I'm running a 5.14 kernel and most are on 5.10 or 5.16. Could that be an issue for those who do not get it to work?

I'm on 5.8, and I actually made it (the old branch) work on 5.4 with a simple patch to the kernel source.

diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index c9a9b6dcbf1b..8ea12ee03f80 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -405,6 +405,7 @@ struct ib_device_attr {
        int                     max_send_sge;
        int                     max_recv_sge;
        int                     max_sge_rd;
+       int                     max_sgl_rd;
        int                     max_cq;
        int                     max_cqe;
        int                     max_mr;

hcbwiz commented 2 years ago

Hi,

I have picked the latest patch from here to ksmbd-rdma.

It works in my environment (Windows Server 2016).

BTW, I also tested SMB Direct without enabling multi-channel. In smb.conf:

    server multi channel support = no

Without multi-channel support, Windows Server 2016 cannot activate the RDMA transport.

wqlxx commented 2 years ago

@hcbwiz I used a Mellanox ConnectX-4 Lx card with the mlx5_core driver, but ibdev is always NULL in ksmbd_rdma_capable_netdev, because ib_device_get_by_netdev cannot find any IB device for the netdev. I am very confused about how to enable RDMA on CentOS 7. I use mlx5_core and did not install the official Mellanox MLNX_OFED driver.

hcbwiz commented 2 years ago

@wqlxx The kernel built-in and Mellanox OFED drivers don't register it.

You can modify the driver code as I mentioned, or just modify the ksmbd code: make "ksmbd_rdma_capable_netdev" always return true.

consp commented 2 years ago

I use this kind of temporary hack in ksmbd_rdma_capable_netdev since I have multiple devices:

--- a/transport_rdma.c
+++ b/transport_rdma.c
@@ -2099,9 +2099,18 @@ bool ksmbd_rdma_capable_netdev(struct net_device *netdev)

        ibdev = ib_device_get_by_netdev(netdev, RDMA_DRIVER_UNKNOWN);
        if (ibdev) {
-               if (rdma_frwr_is_supported(&ibdev->attrs))
+               ksmbd_debug(RDMA, "Is IB Device");
+               if (rdma_frwr_is_supported(&ibdev->attrs)) {
+                       ksmbd_debug(RDMA, "RDMA Capable");
                        rdma_capable = true;
+               }
                ib_device_put(ibdev);
+       } else {
+               ksmbd_debug(RDMA, "Not RDMA Capable: %s", netdev->name);
+#define RDMA_CAPABLE_DEVICE_HACK "ens15"
+               if (strcmp(netdev->name, RDMA_CAPABLE_DEVICE_HACK) == 0) {
+                       rdma_capable = true;
+               }
        }
        return rdma_capable;
 }

This is a quick and dirty hack and not portable, but unless the mlx5_core driver gets patched, it will have to do for my testing for now.

wqlxx commented 2 years ago

@consp @hcbwiz Thanks, I will try what you two said.

dz-cies commented 2 years ago

Hi,

I have picked the latest patch from here to ksmbd-rdma.

It works in my environment (Windows Server 2016).

BTW, I also tested SMB Direct without enabling multi-channel. In smb.conf:

    server multi channel support = no

Without multi-channel support, Windows Server 2016 cannot activate the RDMA transport.

I tried this branch but it still doesn't establish an RDMA connection in my environment. I hope to see feedback from others about this branch.

Now I've switched to using RSS without RDMA, as I find it gives equal (or even higher) throughput. But I encountered two problems; I'm not sure whether they're related.

  1. The default value of ConnectionCountPerRssNetworkInterface (Get-SmbClientConfiguration) is 4, and I'm able to achieve 4 x 1.5 GB/s with this configuration because 4 connections are established. But if I set it to a larger number, the throughput does not grow linearly because connections are not always created successfully. There is a high probability that a connection fails with 'authentication failed'. Debugging shows some connections fail the check if (memcmp(ntlmv2->ntlmv2_hash, ntlmv2_rsp, CIFS_HMAC_MD5_HASH_SIZE) != 0) in ksmbd_auth_ntlmv2 in auth.c, which I believe means a password mismatch, but of course the password is correct, as all connections use the same one. As a result, I can't reliably get the desired number of connections, and the throughput varies widely, from 1.5 GB/s (1 connection) to 8.5 GB/s (8 connections in the best case, though it seems not all connections are fully working or it would be 12 GB/s).
  2. The following kernel trace from the server side appears nearly every time when I set ConnectionCountPerRssNetworkInterface to a large number (say, 16) and try to run a test with fio. After this trace appears, ksmbd.control -s and rmmod ksmbd hang forever. There seems to be some memory leak.

kernel: [250922.771188] invalid opcode: 0000 [#2] SMP NOPTI
kernel: [250922.771192] CPU: 88 PID: 36881 Comm: kworker/88:2 Tainted: G D OE
kernel: [250923.733217] Workqueue: ksmbd-io handle_ksmbd_work [ksmbd]
kernel: [250923.733257] RIP: 0010:slab_free+0x189/0x330
kernel: [250923.733284] Code: 00 48 89 c7 fa 66 0f 1f 44 00 00 f0 49 0f ba 2c 24 00 72 65 4d 3b 6c 24 20 74 11 49 0f ba 34 24 00 57 9d 0f 1f 44 00 00 eb 9f <0f> 0b 49 3b 5c 24 28 75 e8 48 8b 44 24 28 49 89 4c 24 28 49 89 44
kernel: [250923.733296] RIP: 0010:__slab_free+0x189/0x330
kernel: [250923.733375] RSP: 0018:ffffb927c4adbcc0 EFLAGS: 00010246
kernel: [250923.735152] Code: 00 48 89 c7 fa 66 0f 1f 44 00 00 f0 49 0f ba 2c 24 00 72 65 4d 3b 6c 24 20 74 11 49 0f ba 34 24 00 57 9d 0f 1f 44 00 00 eb 9f <0f> 0b 49 3b 5c 24 28 75 e8 48 8b 44 24 28 49 89 4c 24 28 49 89 44
kernel: [250923.736915] RAX: ffff9fb68c751588 RBX: 00000000820001f0 RCX: ffff9fb68c751588
Nov 22 14:27:30 kernel: [250923.736916] RDX: ffff9fb68c751588 RSI: ffffe477b831d440 RDI: ffff9fb6c0807b80
Nov 22 14:27:30 kernel: [250923.736917] RBP: ffffb927c4adbd58 R08: 0000000000000001 R09: ffffffffc0ec2df4
Nov 22 14:27:30 kernel: [250923.736918] R10: ffff9fb68c751588 R11: 0000000000000001 R12: ffffe477b831d440
Nov 22 14:27:30 kernel: [250923.736919] R13: ffff9fb68c751588 R14: ffff9fb6c0807b80 R15: ffff9fb616974c00
Nov 22 14:27:30 kernel: [250923.736921] FS: 0000000000000000(0000) GS:ffff9fb6ce200000(0000) knlGS:0000000000000000
kernel: [250923.736923] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: [250923.736926] CR2: 00007fff939a8e78 CR3: 0000007ccf3e8000 CR4: 0000000000340ee0
Nov 22 14:27:30 kernel: [250923.740675] RSP: 0018:ffffb927c387bcc0 EFLAGS: 00010246
kernel: [250923.742585] Call Trace:
kernel: [250923.744516] RAX: ffff9fb67591a200 RBX: 00000000820001f5 RCX: ffff9fb67591a200
Nov 22 14:27:30 kernel: [250923.746468] ? netlink_sendskb+0x42/0x50
kernel: [250923.748382] RDX: ffff9fb67591a200 RSI: ffffe477b7d64680 RDI: ffff9fb6c0807b80
Nov 22 14:27:30 kernel: [250923.750291] ? ksmbd_free_user+0x24/0x40 [ksmbd]
kernel: [250923.752222] RBP: ffffb927c387bd58 R08: 0000000000000001 R09: ffffffffc0ec2df4
fload esp6 esp4_offload esp4 xfrm_algo mlx5_fpga_tools(OE) mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) xfs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua amd64_edac_mod(E) edac_mce_amd(E) kvm_amd kvm ipmi_ssif input_leds ccp k10temp(E) ipmi_si ipmi_devintf ipmi_msghandler mac_hid(E) sch_fq_codel nfsd(OE) knem(OE) auth_rpcgss nfs_acl lockd grace sunrpc ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper ast drm_vram_helper i2c_algo_bit ttm drm_kms_helper syscopyarea mlx5_core(OE) sysfillrect sysimgblt pci_hyperv_intf fb_sys_fops tls mpt3sas mlxfw(OE) drm vfio_mdev raid_class
kernel: [250922.770725] ahci mdev(OE) scsi_transport_sas nvme libahci i2c_piix4(E) mlx_compat(OE) nvme_core hid_generic usbhid hid
kernel: [250922.771185] ---[ end trace 4994c5df71786ecb ]---

namjaejeon commented 2 years ago

@wqlxx Can you show me the output of ibv_devices and ibv_devinfo on your CentOS machine, like below?

linkinjeon@linkinjeon-Z10PA-D8-Series:~$ ibv_devices 
    device                 node GUID
    ------              ----------------
    iwp1s0f4            00074341eb200000
linkinjeon@linkinjeon-Z10PA-D8-Series:~$ ibv_devinfo 
hca_id: iwp1s0f4
    transport:          iWARP (1)
    fw_ver:             1.25.6.0
    node_guid:          0007:4341:eb20:0000
    sys_image_guid:         0007:4341:eb20:0000
    vendor_id:          0x1425
    vendor_part_id:         25601
    hw_ver:             0x0
    board_id:           1425.6401
    phys_port_cnt:          2
        port:   1
            state:          PORT_INIT (2)
            max_mtu:        4096 (5)
            active_mtu:     1024 (3)
            sm_lid:         0
            port_lid:       0
            port_lmc:       0x00
            link_layer:     Ethernet

        port:   2
            state:          PORT_INIT (2)
            max_mtu:        4096 (5)
            active_mtu:     1024 (3)
            sm_lid:         0
            port_lid:       0
            port_lmc:       0x00
            link_layer:     Ethernet

namjaejeon commented 2 years ago

@dz-cies Why do you turn on RSS mode? When I checked it, there was no performance gain from it, and there was a race condition issue in session setup. BTW, does multichannel work fine without RSS mode?

namjaejeon commented 2 years ago

It works in my environment (Windows Server 2016). Without multi-channel support, Windows Server 2016 cannot activate the RDMA transport.

@hcbwiz Can you clarify that for me? Do you mean that RDMA does not work without multi-channel on Windows Server 2016, or that it does?

hcbwiz commented 2 years ago

@namjaejeon In my environment, I need to turn on the "multi channel" feature before Windows Server 2016 will activate SMB Direct. However, it only starts one RDMA session rather than multiple RDMA sessions.

wqlxx commented 2 years ago

@namjaejeon ibv.png

wqlxx commented 2 years ago

@hcbwiz You are right: Windows Server can use RDMA normally as an SMB client, but Windows 10 as an SMB client cannot use SMB Direct with ksmbd on the ksmbd-rdma branch.

namjaejeon commented 2 years ago

@wqlxx Could you explain in more detail why SMB Direct on the Windows 10 client does not connect to ksmbd?

hcbwiz commented 2 years ago

As far as I know, among desktop versions, only "Windows 10 Pro for Workstations" and Windows 11 support SMB Direct.

dz-cies commented 2 years ago

@dz-cies Why do you turn on RSS mode? When I checked it, there was no performance gain from it, and there was a race condition issue in session setup.

My environment is rather simple: both the server and the client have one ConnectX-6 200 Gb/s adapter. My goal is to saturate the adapters if possible. The bandwidth without RSS or RDMA is 1.5 GB/s in my test. I guess it is limited by two factors: only one connection is established without RSS, and the connection is TCP without RDMA.

As mentioned above, RDMA and RSS do not seem to work together. I tried RDMA and it improved to 6 GB/s. I also tried RSS: when multiple connections are established, the throughput reaches connection_number x 1.5 GB/s. The best performance I've reached is 8 GB/s. In both cases there are issues that prevent stable throughput. In the RSS case, I encountered the race condition in session setup that you mention. I would be glad to reach the same throughput without enabling RSS mode, but I don't know how to achieve it.

BTW, does multichannel work fine without RSS mode?

I didn't test with multiple interfaces, so I'm afraid I can't answer this. I did test in my environment with only one interface on each side, but only one connection is established. Given my understanding of RSS, this is not unexpected?

wqlxx commented 2 years ago

@hcbwiz I have tested SMB Direct on Windows 11 Pro Insider Preview; it doesn't work.

Interface Index RSS Capable RDMA Capable Speed   IpAddresses                                Friendly Name
--------------- ----------- ------------ -----   -----------                                -------------
9               False       False        0  bps  {fe80::5c60:38fe:3bf8:1a17}                Ethernet     
14              False       False        1 Gbps  {fe80::e873:1dd0:2ae5:8112, 172.28.91.120} Ethernet 2   
22              True        False        10 Gbps {fe80::71d3:bbb1:94fc:97c0, 10.0.1.2}      Ethernet 3   
26              True        False        10 Gbps {fe80::6896:b5de:18d2:dbef, 10.0.0.2}      Ethernet 4   

PS C:\Windows\system32> Get-NetAdapterRdma

Name                      InterfaceDescription                     Enabled     Operational     PFC        ETS       
----                      --------------------                     -------     -----------     ---        ---       
Ethernet 4                Mellanox ConnectX-4 Lx Ethernet Adapt... True        True            False      False     
Ethernet 3                Mellanox ConnectX-4 Lx Ethernet Adapter  True        True            False      False

namjaejeon commented 2 years ago

As far as I know, there is a Windows 11 Pro for Workstations edition; that is what will support SMB Direct, not Windows 11 Pro. :p

wqlxx commented 2 years ago

@namjaejeon You are right. I have tried Windows 10 Pro for Workstations as the SMB client with the Linux SMB server (ksmbd), and it supports SMB Direct.