cifsd-team / ksmbd

ksmbd kernel server(SMB/CIFS server)
152 stars 23 forks source link

SMB1 crash on OpenWrt (MIPS) #440

Closed xdarklight closed 3 years ago

xdarklight commented 3 years ago

I can reproduce the following crash 100% when having all ksmbd debug logging disabled. When I have ksmbd's debug logging ("all") active then it rarely happens.

[   51.346379] br-lan: port 2(wlan1) entered blocking state
[   51.351032] br-lan: port 2(wlan1) entered forwarding state
[  101.279669] Kernel bug detected[#1]:
[  101.281799] CPU: 1 PID: 528 Comm: kworker/1:3 Not tainted 5.4.66 #0
[  101.288159] Workqueue: ksmbd-io ksmbd_ipc_init [ksmbd]
[  101.293202] $ 0   : 00000000 00000001 00000000 808f3bfc
[  101.298412] $ 4   : 807667b4 00000000 ffffffff 00000000
[  101.303640] $ 8   : 00000000 00000000 0087a000 00000002
[  101.308868] $12   : fffffffd 00000402 8633e288 00000040
[  101.314089] $16   : 00000000 808f3bf8 8635c320 00000000
[  101.319302] $20   : 8633e1c8 8635c320 847c0000 00000000
[  101.324533] $24   : 00000000 80061be4
[  101.329752] $28   : 87050000 87051c18 847c0000 8015f714
[  101.334978] Hi    : 00000000
[  101.337832] Lo    : 00101000
[  101.340753] epc   : 8015e0cc BUG+0x0/0x4
[  101.344644] ra    : 8015f714 kmem_cache_free_bulk+0x0/0x5c0
[  101.350202] Status: 1100c303 KERNEL EXL IE
[  101.354384] Cause : 10800424 (ExcCode 09)
[  101.358391] PrId  : 00019556 (MIPS 34Kc)
[  101.362292] Modules linked in: ksmbd ath9k ath9k_common ath9k_hw ath10k_pci ath10k_core ath pppoe nf_conntrack_netlink mac80211 iptable_nat ipt_REJECT cfg80211 ath9k_pci_owl_loader xt_time xt_tcpudp xt_tcpmss xt_statistic xt_state xt_recent xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_helper xt_ecn xt_dscp xt_conntrack xt_connmark xt_connlimit xt_connbytes xt_comment xt_TCPMSS xt_REDIRECT xt_MASQUERADE xt_LOG xt_HL xt_FLOWOFFLOAD xt_DSCP xt_CT xt_CLASSIFY wireguard sch_cake pppox ppp_async nfnetlink nf_reject_ipv4 nf_nat nf_log_ipv4 nf_flow_table_hw nf_flow_table nf_conntrack_rtcache nf_conncount ltq_deu_vr9 iptable_raw iptable_mangle iptable_filter ipt_ECN ip_tables crc_ccitt compat sch_teql sch_sfq sch_red sch_prio sch_pie sch_multiq sch_gred sch_fq sch_dsmark sch_codel em_text em_nbyte em_meta em_cmp act_simple act_police act_pedit act_ipt act_gact act_csum libcrc32c act_connmark nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 sch_tbf sch_ingress sch_htb sch_hfsc em_u32
[  101.362584]  cls_u32 cls_tcindex cls_route cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred drivetemp drv_dsl_cpe_api drv_mei_cpe nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 pppoatm ppp_generic slhc ifb ip6_udp_tunnel udp_tunnel tun dm_mirror dm_region_hash dm_log dm_crypt dm_mod dax br2684 atm nls_utf8 drv_ifxos sha512_generic sha256_generic libsha256 sha1_generic seqiv jitterentropy_rng drbg md5 md4 hmac ghash_generic gf128mul gcm ecb des_generic libdes cts ctr cmac ccm cbc authenc arc4 usb_storage dwc2 sd_mod scsi_mod gpio_button_hotplug f2fs ext4 mbcache jbd2 crc32c_generic crc32_generic
[  101.508731] Process kworker/1:3 (pid: 528, threadinfo=4db3e383, task=b550e04b, tls=00000000)
[  101.517143] Stack : 00000058 00000000 8633e180 8635c320 838e0000 84788134 8633e1c8 00000000
[  101.525494]         838e0000 8635c320 838e0000 847a8794 0000011f 00000000 00000000 00000000
[  101.533850]         00000201 00002f4c 00000000 00000000 8633e580 00000000 85bf8a00 00000000
[  101.542206]         00000201 80766b80 807461ac 87051d64 00052dc0 00000001 0000010c 000000f4
[  101.550562]         00000020 ffffffe0 80750000 00000001 807667b4 00000000 00000000 8069410c
[  101.558917]         ...
[  101.561351] Call Trace:
[  101.563842] [<8015e0cc>] BUG+0x0/0x4
[  101.567385] [<8015f714>] kmem_cache_free_bulk+0x0/0x5c0
[  101.572605] Code: 00000000  1000ffda  24021000 <000c000d> 00a01025  8c8600d4  3c058069  24a50410  0816ab2e
[  101.582345]
[  101.584057] ---[ end trace c096d1bc803110cb ]---
[  101.588488] Kernel panic - not syncing: Fatal exception
[  101.593693] Rebooting in 3 seconds..

This looks similar (also happening inside the SLUB allocator) but not idential to #426

ksmbd is running on a BT Home Hub 5A (Big Endian MIPS, lantiq xrx200 target in OpenWrt). I'm using an USB hard disk with ext4 as filesystem.

The client is a PlayStation 2.

I made two tcpdump captures and the last captured packet in both is a TRANS2 request, one with FIND_NEXT and the other with QUERY_PATH_INFO.

@namjaejeon I'm happy for any ideas how to debug this? the crash dump doesn't contain references to the ksmbd code

namjaejeon commented 3 years ago

@xdarklight What kernel version do you use with ksmbd ?

xdarklight commented 3 years ago

5.4.66 plus OpenWrt's kernel patches

namjaejeon commented 3 years ago

@xdarklight Could you please reproduce the issue with the below change ?

diff --git a/smb1pdu.c b/smb1pdu.c
index 849bb17..226209d 100644
--- a/smb1pdu.c
+++ b/smb1pdu.c
@@ -6381,7 +6381,6 @@ err_out:
                rsp->hdr.Status.CifsError =
                        STATUS_UNEXPECTED_IO_ERROR;

-       ksmbd_fd_put(work, dir_fp);
        kfree(d_info.smb1_name);
        kfree(pathname);
        return 0;
xdarklight commented 3 years ago

it doesn't fix it but you are pointing me in the right direction: the problem is actually d_info right below. I'll open a pull request to fix this

namjaejeon commented 3 years ago

@xdarklight Applied. Thanks for your patch!

xdarklight commented 3 years ago

thank you for all the hard work your doing on ksmbd @namjaejeon as well as for being so quick with looking at my changes and giving feedback!

namjaejeon commented 3 years ago

@xdarklight thank you so much for your contribution!