derekmolloy / exploringBB

Source code for the book Exploring BeagleBone, by Derek Molloy (see www.exploringbeaglebone.com)
GNU General Public License v3.0
467 stars 444 forks source link

[fatal bug in driver]...exploringBB/extras/kernel/ebbchar/ebbchar.ko crashes kernel 5.0.5 #23

Open ZoranStojsavljevic opened 5 years ago

ZoranStojsavljevic commented 5 years ago

I did everything as advised. Did git clone https://github.com/derekmolloy/exploringBB.git And then, just before diving into the meat of the reference kernel driver, I wrote several versions of the script, the latest shown here: [vuser@fedora29-ssd ebbchar]$ ls -al total 784 drwxr-xr-x. 3 vuser vboxusers 4096 Apr 3 17:55 . drwxr-xr-x. 9 vuser vboxusers 4096 Apr 3 14:52 .. -rw-r--r--. 1 vuser vboxusers 91 Apr 3 14:52 99-ebbchar.rules -rw-r--r--. 1 vuser vboxusers 8629 Apr 3 14:52 ebbchar.c -rw-r--r--. 1 vuser vboxusers 319816 Apr 3 16:56 ebbchar.ko -rw-r--r--. 1 vuser vboxusers 431 Apr 3 16:56 .ebbchar.ko.cmd -rw-r--r--. 1 vuser vboxusers 646 Apr 3 16:56 ebbchar.mod.c -rw-r--r--. 1 vuser vboxusers 104680 Apr 3 16:56 ebbchar.mod.o -rw-r--r--. 1 vuser vboxusers 30709 Apr 3 16:56 .ebbchar.mod.o.cmd -rw-r--r--. 1 vuser vboxusers 216896 Apr 3 16:56 ebbchar.o -rw-r--r--. 1 vuser vboxusers 38017 Apr 3 16:56 .ebbchar.o.cmd -rw-r--r--. 1 vuser vboxusers 5 Apr 3 14:52 .gitignore -rwxr-xr-x. 1 vuser vboxusers 101 Apr 3 17:55 loop.sh -rwxr-xr-x. 1 vuser vboxusers 102 Apr 3 16:59 loop.sh~ -rw-r--r--. 1 vuser vboxusers 194 Apr 3 14:52 Makefile -rw-r--r--. 1 vuser vboxusers 87 Apr 3 16:56 modules.order -rw-r--r--. 1 vuser vboxusers 0 Apr 3 16:56 Module.symvers -rwxr-xr-x. 1 vuser vboxusers 18848 Apr 3 16:56 test -rw-r--r--. 1 vuser vboxusers 1787 Apr 3 14:52 testebbchar.c drwxr-xr-x. 2 vuser vboxusers 4096 Apr 3 16:56 .tmp_versions [vuser@fedora29-ssd ebbchar]$ pwd /home/vuser/projects2/C_coding/lkm/exploringBB/extras/kernel/ebbchar [vuser@fedora29-ssd ebbchar]$ cat loop.sh

!/bin/bash

for (( c=1; c<=5000; c++ )) do sudo insmod ebbchar.ko sudo rmmod -f ebbchar done [vuser@fedora29-ssd ebbchar]$

[vuser@fedora29-ssd ebbchar]$ ./loop.sh ./loop.sh: line 6: 5825 Segmentation fault sudo insmod ebbchar.ko rmmod: ERROR: could not remove 'ebbchar': Device or resource busy rmmod: ERROR: could not remove module ebbchar: Device or resource busy ^C [vuser@fedora29-ssd ebbchar]$


After about 100 cycles, the kernel ooooops. And the whole system began to be unresponsive. Browser does not work... I'll try to capture dmesg and post here:

[ 4131.382433] EBBChar: Initializing the EBBChar LKM [ 4131.382436] EBBChar: registered correctly with major number 511 [ 4131.382442] EBBChar: device class registered correctly [ 4131.382512] EBBChar: device class created correctly [ 4131.411171] EBBChar: Goodbye from the LKM! [ 4131.443761] EBBChar: Initializing the EBBChar LKM [ 4131.443765] EBBChar: registered correctly with major number 511 [ 4131.443775] EBBChar: device class registered correctly [ 4131.443841] EBBChar: device class created correctly [ 4131.471791] EBBChar: Goodbye from the LKM! [ 4131.502473] EBBChar: Initializing the EBBChar LKM [ 4131.502476] EBBChar: registered correctly with major number 511 [ 4131.502483] ------------[ cut here ]------------ [ 4131.502484] kernel BUG at mm/slub.c:305! [ 4131.502489] invalid opcode: 0000 [#1] SMP PTI [ 4131.502492] CPU: 3 PID: 5827 Comm: insmod Tainted: G OE 5.0.5-200.fc29.x86_64 #1 [ 4131.502493] Hardware name: Hewlett-Packard HP EliteBook 840 G1/198F, BIOS L71 Ver. 01.46 07/20/2018 [ 4131.502498] RIP: 0010:slab_free+0x1e2/0x3d0 [ 4131.502499] Code: 8b 54 24 30 48 89 4c 24 28 e8 da fb ff ff 4c 8b 54 24 28 85 c0 0f 85 67 fe ff ff 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 <0f> 0b 49 3b 5c 24 28 75 ab 48 8b 44 24 30 49 89 4c 24 28 49 89 44 [ 4131.502501] RSP: 0018:ffffbe5cc1d83b20 EFLAGS: 00010246 [ 4131.502503] RAX: ffffa0014079bf60 RBX: 00000000802a0029 RCX: ffffa0014079bf60 [ 4131.502504] RDX: ffffa0014079bf60 RSI: ffffef984d01e6c0 RDI: ffffa00142403680 [ 4131.502505] RBP: ffffbe5cc1d83bc8 R08: 0000000000000001 R09: ffffffff82957905 [ 4131.502506] R10: ffffa0014079bf60 R11: 0000000000000001 R12: ffffef984d01e6c0 [ 4131.502508] R13: ffffa0014079bf60 R14: ffffa00142403680 R15: ffffef984d01e6e0 [ 4131.502509] FS: 00007f558223b280(0000) GS:ffffa001428c0000(0000) knlGS:0000000000000000 [ 4131.502511] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 4131.502512] CR2: 000055ef4b528e98 CR3: 00000002187c6006 CR4: 00000000001606e0 [ 4131.502513] Call Trace: [ 4131.502519] ? log_store+0x200/0x280 [ 4131.502522] ? kmalloc_track_caller+0x170/0x210 [ 4131.502526] ? kobject_set_name_vargs+0x1e/0x90 [ 4131.502529] kobject_put+0x85/0x1a0 [ 4131.502532] kobject_add_internal+0xcc/0x270 [ 4131.502535] kset_register+0x41/0x60 [ 4131.502539] class_register+0xcc/0x190 [ 4131.502541] class_create+0x4c/0x80 [ 4131.502543] ? 0xffffffffc118b000 [ 4131.502546] ebbchar_init+0x72/0x1000 [ebbchar] [ 4131.502550] do_one_initcall+0x46/0x1c3 [ 4131.502554] ? free_unref_page_commit+0x9b/0x110 [ 4131.502558] ? _cond_resched+0x15/0x30 [ 4131.502560] ? kmem_cache_alloc_trace+0x154/0x1d0 [ 4131.502563] do_init_module+0x5a/0x210 [ 4131.502565] load_module+0x2096/0x22d0 [ 4131.502569] ? ima_post_read_file+0xf4/0x100 [ 4131.502572] ? do_sys_finit_module+0xa8/0x110 [ 4131.502574] do_sys_finit_module+0xa8/0x110 [ 4131.502577] do_syscall_64+0x5b/0x160 [ 4131.502580] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 4131.502582] RIP: 0033:0x7f5582355efd [ 4131.502583] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 5b 7f 0c 00 f7 d8 64 89 01 48 [ 4131.502584] RSP: 002b:00007ffde2369ae8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 4131.502586] RAX: ffffffffffffffda RBX: 000055ef4b5257b0 RCX: 00007f5582355efd [ 4131.502587] RDX: 0000000000000000 RSI: 000055ef49872346 RDI: 0000000000000003 [ 4131.502588] RBP: 000055ef49872346 R08: 0000000000000000 R09: 00007f5582421020 [ 4131.502590] R10: 0000000000000003 R11: 0000000000000246 R12: 0000000000000000 [ 4131.502591] R13: 000055ef4b527e90 R14: 0000000000000000 R15: 0000000000000000 [ 4131.502592] Modules linked in: ebbchar(OE+) fuse ccm xt_CHECKSUM ipt_MASQUERADE tun bridge stp llc devlink nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ip6table_nat nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat_ipv4 nf_nat iptable_mangle iptable_raw iptable_security nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ip_set nfnetlink ebtable_filter ebtables vboxpci(OE) vboxnetadp(OE) ip6table_filter vboxnetflt(OE) ip6_tables vboxdrv(OE) sunrpc vfat fat arc4 intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 iwlmvm videobuf2_common mei_wdt iTCO_wdt crct10dif_pclmul iTCO_vendor_support crc32_pclmul videodev mac80211 media ghash_clmulni_intel intel_cstate intel_uncore intel_rapl_perf iwlwifi snd_hda_codec_idt snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_hda_codec cfg80211 snd_hda_core [ 4131.502617] snd_hwdep snd_seq joydev snd_seq_device hp_wmi snd_pcm sparse_keymap wmi_bmof i2c_i801 rtsx_pci_ms lpc_ich memstick snd_timer rfkill mei_me snd mei soundcore hp_accel lis3lv02d input_polldev pcc_cpufreq hp_wireless tpm_infineon binfmt_misc amdgpu chash amd_iommu_v2 gpu_sched i915 radeon rtsx_pci_sdmmc mmc_core kvmgt mdev vfio kvm irqbypass i2c_algo_bit drm_kms_helper ttm crc32c_intel drm serio_raw rtsx_pci e1000e wmi video [last unloaded: ebbchar] [ 4131.502636] ---[ end trace ca5576a9f61c90ba ]--- [ 4131.502638] RIP: 0010:__slab_free+0x1e2/0x3d0 [ 4131.502640] Code: 8b 54 24 30 48 89 4c 24 28 e8 da fb ff ff 4c 8b 54 24 28 85 c0 0f 85 67 fe ff ff 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 <0f> 0b 49 3b 5c 24 28 75 ab 48 8b 44 24 30 49 89 4c 24 28 49 89 44 [ 4131.502641] RSP: 0018:ffffbe5cc1d83b20 EFLAGS: 00010246 [ 4131.502642] RAX: ffffa0014079bf60 RBX: 00000000802a0029 RCX: ffffa0014079bf60 [ 4131.502644] RDX: ffffa0014079bf60 RSI: ffffef984d01e6c0 RDI: ffffa00142403680 [ 4131.502645] RBP: ffffbe5cc1d83bc8 R08: 0000000000000001 R09: ffffffff82957905 [ 4131.502646] R10: ffffa0014079bf60 R11: 0000000000000001 R12: ffffef984d01e6c0 [ 4131.502647] R13: ffffa0014079bf60 R14: ffffa00142403680 R15: ffffef984d01e6e0 [ 4131.502649] FS: 00007f558223b280(0000) GS:ffffa001428c0000(0000) knlGS:0000000000000000 [ 4131.502650] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 4131.502651] CR2: 000055ef4b528e98 CR3: 00000002187c6006 CR4: 00000000001606e0

Thank you, Zoran

ZoranStojsavljevic commented 5 years ago

I also tried this on my Debian Buster VM (not only on my host Fedora 29). Here are the results:

user@unassigned-hostname:~/ebbchar$ ls -al
total 36
drwxr-xr-x  2 user user 4096 Apr  4 15:38 .
drwxr-xr-x 17 user user 4096 Apr  4 15:08 ..
-rw-r--r--  1 user user   91 Apr  4 15:40 99-ebbchar.rules
-rw-r--r--  1 user user 8745 Apr  4 15:40 ebbchar.c
-rwxr-xr-x  1 user user  113 Apr  4 15:40 loop.sh
-rw-r--r--  1 user user  194 Apr  4 15:40 Makefile
-rw-r--r--  1 user user 1787 Apr  4 15:40 testebbchar.c
user@unassigned-hostname:~/ebbchar$ make all
make -C /lib/modules/4.19.0-4-amd64/build/ M=/home/user/ebbchar modules
make[1]: Entering directory '/usr/src/linux-headers-4.19.0-4-amd64'
  CC [M]  /home/user/ebbchar/ebbchar.o
  Building modules, stage 2.
  MODPOST 1 modules
  CC      /home/user/ebbchar/ebbchar.mod.o
  LD [M]  /home/user/ebbchar/ebbchar.ko
make[1]: Leaving directory '/usr/src/linux-headers-4.19.0-4-amd64'
cc testebbchar.c -o test
user@unassigned-hostname:~/ebbchar$ pwd
/home/user/ebbchar
user@unassigned-hostname:~/ebbchar$ su -
Password: 
root@unassigned-hostname:~# cd /home/user/ebbchar

root@unassigned-hostname:/home/user/ebbchar# insmod ebbchar.ko

[ 2175.653263] ebbchar: loading out-of-tree module  taints kernel.
[ 2175.653333] ebbchar: module verification failed: signature and/or required key missing - tainting kernel
[ 2175.655296] EBBChar: Initializing the EBBChar LKM
[ 2175.655305] EBBChar: registered correctly with major number 245
[ 2175.655345] EBBChar: device class registered correctly
[ 2175.658055] EBBChar: device class created correctly

root@unassigned-hostname:/home/user/ebbchar# rmmod ebbchar.ko And Oooooops happened here:

[ 3540.895649] ------------[ cut here ]------------
[ 3540.895653] refcount_t: underflow; use-after-free.
[ 3540.895728] WARNING: CPU: 1 PID: 3626 at lib/refcount.c:187 refcount_sub_and_test_checked+0x58/0x70
[ 3540.895730] Modules linked in: ebbchar(OE-) nls_ascii nls_cp437 vfat fat crct10dif_pclmul crc32_pclmul efi_pstore joydev ghash_clmulni_intel intel_rapl_perf serio_raw efivars pcspkr snd_intel8x0 vboxvideo(C) ttm snd_ac97_codec ac97_bus drm_kms_helper sg snd_pcm battery snd_timer drm snd soundcore evdev vboxguest ac efivarfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic fscrypto ecb sd_mod sr_mod cdrom crc32c_intel ohci_pci ahci libahci libata ehci_pci ohci_hcd ehci_hcd aesni_intel aes_x86_64 crypto_simd cryptd glue_helper scsi_mod psmouse usbcore i2c_piix4 usb_common e1000 video button
[ 3540.895853] CPU: 1 PID: 3626 Comm: rmmod Tainted: G         C OE     4.19.0-4-amd64 #1 Debian 4.19.28-2
[ 3540.895855] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 3540.895861] RIP: 0010:refcount_sub_and_test_checked+0x58/0x70
[ 3540.895865] Code: dc 31 c0 48 83 c4 08 c3 0f b6 05 99 23 d3 00 84 c0 75 ee 48 c7 c7 f8 cf a8 b4 88 44 24 07 c6 05 83 23 d3 00 01 e8 c2 36 cb ff <0f> 0b 0f b6 44 24 07 eb c4 31 c0 c3 66 66 2e 0f 1f 84 00 00 00 00
[ 3540.895868] RSP: 0018:ffffabcc4415fea0 EFLAGS: 00010286
[ 3540.895873] RAX: 0000000000000000 RBX: ffff934fd3036a18 RCX: 0000000000000006
[ 3540.895876] RDX: 0000000000000007 RSI: 0000000000000092 RDI: ffff934fdbb166a0
[ 3540.895878] RBP: 0000000000000000 R08: 000000000000023b R09: 0000000000000004
[ 3540.895881] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000800
[ 3540.895884] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 3540.895889] FS:  00007f0c2bfa6200(0000) GS:ffff934fdbb00000(0000) knlGS:0000000000000000
[ 3540.895892] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3540.895895] CR2: 0000555b4feecd19 CR3: 000000011af98004 CR4: 00000000000606e0
[ 3540.895904] Call Trace:
[ 3540.895952]  kobject_put+0x23/0x1b0
[ 3540.895980]  ebbchar_exit+0x2d/0xea3 [ebbchar]
[ 3540.895996]  __x64_sys_delete_module+0x190/0x2c0
[ 3540.896022]  do_syscall_64+0x53/0x100
[ 3540.896039]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 3540.896044] RIP: 0033:0x7f0c2c0c7137
[ 3540.896049] Code: 73 01 c3 48 8b 0d 59 0d 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 29 0d 0c 00 f7 d8 64 89 01 48
[ 3540.896052] RSP: 002b:00007ffcc14eeba8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[ 3540.896058] RAX: ffffffffffffffda RBX: 000056443fa22790 RCX: 00007f0c2c0c7137
[ 3540.896061] RDX: 000000000000000a RSI: 0000000000000800 RDI: 000056443fa227f8
[ 3540.896064] RBP: 0000000000000000 R08: 00007ffcc14edb21 R09: 0000000000000000
[ 3540.896066] R10: 00007f0c2c138ae0 R11: 0000000000000206 R12: 00007ffcc14eedd0
[ 3540.896069] R13: 00007ffcc14efef2 R14: 000056443fa22260 R15: 000056443fa22790
[ 3540.896074] ---[ end trace b7a015b20a0892a9 ]---
[ 3540.896977] EBBChar: Goodbye from the LKM!

Then, it started working correctly (few times inserted by CLI):

[ 3617.680208] EBBChar: Initializing the EBBChar LKM
[ 3617.680214] EBBChar: registered correctly with major number 245
[ 3617.680233] EBBChar: device class registered correctly
[ 3617.681609] EBBChar: device class created correctly
[ 3628.456320] EBBChar: Goodbye from the LKM!
[ 3643.261899] EBBChar: Initializing the EBBChar LKM
[ 3643.261903] EBBChar: registered correctly with major number 245
[ 3643.261917] EBBChar: device class registered correctly
[ 3643.262768] EBBChar: device class created correctly
[ 3646.017039] EBBChar: Goodbye from the LKM!
root@unassigned-hostname:/home/user/ebbchar#

While running the script loop.sh: root@unassigned-hostname:/home/user/ebbchar# cat loop.sh

#!/bin/bash
for (( c=1; c<=5000; c++ ))
do
    sudo insmod ebbchar.ko
    echo $c
    sudo rmmod ebbchar
done

The following happens:

82
83
84
85
86
87
insmod: ERROR: could not insert module ebbchar.ko: Unknown symbol in module
88
rmmod: ERROR: Module ebbchar is not currently loaded
insmod: ERROR: could not insert module ebbchar.ko: Unknown symbol in module
89
rmmod: ERROR: Module ebbchar is not currently loaded
insmod: ERROR: could not insert module ebbchar.ko: Unknown symbol in module
90
rmmod: ERROR: Module ebbchar is not currently loaded
insmod: ERROR: could not insert module ebbchar.ko: Unknown symbol in module
91

Seems that some device drivers' function calls changed from 2015. to 2019. ... Or?

Thank you, Zoran

derekmolloy commented 5 years ago

Hi Zoran,

Thanks for your great work on this. I haven't looked at this code in quite some time and I haven't had the time to keep it up to date with kernel releases. It was written as an educational piece rather than a robust module, so I didn't spend much time on debugging or analysing it. That being said, if you can identify what is going wrong then I would be very happy to modify it.

Kind regards, Derek.

On Thu, Apr 4, 2019 at 5:17 PM Zoran Stojsavljevic notifications@github.com wrote:

I also tried this on my Debian Buster VM (not only on my host Fedora 29). Here are the results: user@unassigned-hostname:/ebbchar$ ls -al total 36 drwxr-xr-x 2 user user 4096 Apr 4 15:38 . drwxr-xr-x 17 user user 4096 Apr 4 15:08 .. -rw-r--r-- 1 user user 91 Apr 4 15:40 99-ebbchar.rules -rw-r--r-- 1 user user 8745 Apr 4 15:40 ebbchar.c -rwxr-xr-x 1 user user 113 Apr 4 15:40 loop.sh -rw-r--r-- 1 user user 194 Apr 4 15:40 Makefile -rw-r--r-- 1 user user 1787 Apr 4 15:40 testebbchar.c user@unassigned-hostname:/ebbchar$ make all make -C /lib/modules/4.19.0-4-amd64/build/ M=/home/user/ebbchar modules make[1]: Entering directory '/usr/src/linux-headers-4.19.0-4-amd64' CC [M] /home/user/ebbchar/ebbchar.o Building modules, stage 2. MODPOST 1 modules CC /home/user/ebbchar/ebbchar.mod.o LD [M] /home/user/ebbchar/ebbchar.ko make[1]: Leaving directory '/usr/src/linux-headers-4.19.0-4-amd64' cc testebbchar.c -o test user@unassigned-hostname:/ebbchar$ pwd /home/user/ebbchar user@unassigned-hostname:/ebbchar$ su - Password: root@unassigned-hostname:~# cd /home/user/ebbchar root@unassigned-hostname:/home/user/ebbchar# insmod ebbchar.ko

[ 2175.653263] ebbchar: loading out-of-tree module taints kernel. > [ 2175.653333] ebbchar: module verification failed: signature and/or required key missing - tainting kernel [ 2175.655296] EBBChar: Initializing the EBBChar LKM [ 2175.655305] EBBChar: registered correctly with major number 245 [ 2175.655345] EBBChar: device class registered correctly [ 2175.658055] EBBChar: device class created correctly

root@unassigned-hostname:/home/user/ebbchar# rmmod ebbchar.ko And Oooooops hapenned here:

[ 3540.895649] ------------[ cut here ]------------ [ 3540.895653] refcount_t: underflow; use-after-free. [ 3540.895728] WARNING: CPU: 1 PID: 3626 at lib/refcount.c:187 refcount_sub_and_test_checked+0x58/0x70 [ 3540.895730] Modules linked in: ebbchar(OE-) nls_ascii nls_cp437 vfat fat crct10dif_pclmul crc32_pclmul efi_pstore joydev ghash_clmulni_intel intel_rapl_perf serio_raw efivars pcspkr snd_intel8x0 vboxvideo(C) ttm snd_ac97_codec ac97_bus drm_kms_helper sg snd_pcm battery snd_timer drm snd soundcore evdev vboxguest ac efivarfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic fscrypto ecb sd_mod sr_mod cdrom crc32c_intel ohci_pci ahci libahci libata ehci_pci ohci_hcd ehci_hcd aesni_intel aes_x86_64 crypto_simd cryptd glue_helper scsi_mod psmouse usbcore i2c_piix4 usb_common e1000 video button [ 3540.895853] CPU: 1 PID: 3626 Comm: rmmod Tainted: G C OE 4.19.0-4-amd64

1 https://github.com/derekmolloy/exploringBB/issues/1 Debian 4.19.28-2

[ 3540.895855] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 3540.895861] RIP: 0010:refcount_sub_and_test_checked+0x58/0x70 [ 3540.895865] Code: dc 31 c0 48 83 c4 08 c3 0f b6 05 99 23 d3 00 84 c0 75 ee 48 c7 c7 f8 cf a8 b4 88 44 24 07 c6 05 83 23 d3 00 01 e8 c2 36 cb ff

<0f> 0b 0f b6 44 24 07 eb c4 31 c0 c3 66 66 2e 0f 1f 84 00 00 00 00 [ 3540.895868] RSP: 0018:ffffabcc4415fea0 EFLAGS: 00010286 [ 3540.895873] RAX: 0000000000000000 RBX: ffff934fd3036a18 RCX: 0000000000000006 [ 3540.895876] RDX: 0000000000000007 RSI: 0000000000000092 RDI: ffff934fdbb166a0 [ 3540.895878] RBP: 0000000000000000 R08: 000000000000023b R09: 0000000000000004 [ 3540.895881] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000800 [ 3540.895884] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 3540.895889] FS: 00007f0c2bfa6200(0000) GS:ffff934fdbb00000(0000) knlGS:0000000000000000 [ 3540.895892] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3540.895895] CR2: 0000555b4feecd19 CR3: 000000011af98004 CR4: 00000000000606e0 [ 3540.895904] Call Trace: [ 3540.895952] kobject_put+0x23/0x1b0 [ 3540.895980] ebbchar_exit+0x2d/0xea3 [ebbchar] [ 3540.895996] __x64_sys_delete_module+0x190/0x2c0 [ 3540.896022] do_syscall_64+0x53/0x100 [ 3540.896039] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 3540.896044] RIP: 0033:0x7f0c2c0c7137 [ 3540.896049] Code: 73 01 c3 48 8b 0d 59 0d 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 29 0d 0c 00 f7 d8 64 89 01 48 [ 3540.896052] RSP: 002b:00007ffcc14eeba8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0 [ 3540.896058] RAX: ffffffffffffffda RBX: 000056443fa22790 RCX: 00007f0c2c0c7137 [ 3540.896061] RDX: 000000000000000a RSI: 0000000000000800 RDI: 000056443fa227f8 [ 3540.896064] RBP: 0000000000000000 R08: 00007ffcc14edb21 R09: 0000000000000000 [ 3540.896066] R10: 00007f0c2c138ae0 R11: 0000000000000206 R12: 00007ffcc14eedd0 [ 3540.896069] R13: 00007ffcc14efef2 R14: 000056443fa22260 R15: 000056443fa22790 [ 3540.896074] ---[ end trace b7a015b20a0892a9 ]--- [ 3540.896977] EBBChar: Goodbye from the LKM! Then, it started working correctly (few times inserted by CLI): [ 3617.680208] EBBChar: Initializing the EBBChar LKM [ 3617.680214] EBBChar: registered correctly with major number 245 [ 3617.680233] EBBChar: device class registered correctly [ 3617.681609] EBBChar: device class created correctly [ 3628.456320] EBBChar: Goodbye from the LKM! [ 3643.261899] EBBChar: Initializing the EBBChar LKM [ 3643.261903] EBBChar: registered correctly with major number 245 [ 3643.261917] EBBChar: device class registered correctly [ 3643.262768] EBBChar: device class created correctly [ 3646.017039] EBBChar: Goodbye from the LKM! root@unassigned-hostname:/home/user/ebbchar# While running the script loop.sh: root@unassigned-hostname:/home/user/ebbchar# cat loop.sh #!/bin/bash for (( c=1; c<=5000; c++ )) do sudo insmod ebbchar.ko echo $c sudo rmmod ebbchar done The following happens: 82 83 84 85 86 87 insmod: ERROR: could not insert module ebbchar.ko: Unknown symbol in module 88 rmmod: ERROR: Module ebbchar is not currently loaded insmod: ERROR: could not insert module ebbchar.ko: Unknown symbol in module 89 rmmod: ERROR: Module ebbchar is not currently loaded insmod: ERROR: could not insert module ebbchar.ko: Unknown symbol in module 90 rmmod: ERROR: Module ebbchar is not currently loaded insmod: ERROR: could not insert module ebbchar.ko: Unknown symbol in module 91 Seems that some device drivers' function calls changed from 2015. to 2019. ... Or? Thank you, Zoran — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub , or mute the thread .
ZoranStojsavljevic commented 5 years ago

Hello Derek,

Just to be on the safe side, and check the base (kernel itself and module framework), i did two basic tests:

Test #1: I did cut/paste very simplistic driver Hello World with only init and exit functions, to be sure about kernel module framework itself. Here is the minimalistic driver's C code for the test:

/*  
 *  hello_test.c - Demonstrating the module_init() and module_exit() macros.
 *  This is preferred over using init_module() and cleanup_module().
 */
#include <linux/module.h>   /* Needed by all modules */
#include <linux/kernel.h>   /* Needed for KERN_INFO */
#include <linux/init.h>     /* Needed for the macros */

MODULE_LICENSE("GPL");                             ///< The license type -- this affects available functionality
MODULE_DESCRIPTION("A simple Linux test driver");  ///< The description -- see modinfo
MODULE_VERSION("0.01");                            ///< A version number to inform users

static int __init hello_test_init(void) {
    printk(KERN_INFO "Hello_Test, world\n");
    return 0;
}

static void __exit hello_test_exit(void) {
    printk(KERN_INFO "Goodbye_Test, world\n");
}

With essentially the same script:

#!/bin/bash
for (( c=1; c<=5000; c++ ))
do
    sudo insmod hello_test.ko
    echo $c
    sudo rmmod hello_test
done

I have tested it on both: Linux fedora29-ssd 5.0.5-200.fc29.x86_64 #1 SMP Wed Mar 27 20:58:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux Linux unassigned-hostname 4.19.0-4-amd64 #1 SMP Debian 4.19.28-2 (2019-03-15) x86_64 GNU/Linux Does work without any problems on both Debian Buster VM and Fedora 29 host

Test #2: I have picked the very complex in-module-tree driver (can.ko), and wrote another script for test:

#!/bin/bash
for (( c=1; c<=2000; c++ ))
do
    sudo modprobe can
    echo $c
    sudo rmmod can
done

For both: Linux fedora29-ssd 5.0.5-200.fc29.x86_64 #1 SMP Wed Mar 27 20:58:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux Linux unassigned-hostname 4.19.0-4-amd64 #1 SMP Debian 4.19.28-2 (2019-03-15) x86_64 GNU/Linux Does work without any problems on both Debian Buster VM and Fedora 29 host


So, the conclusion is: yes, something is wrong with your reference driver.

I will try to debug your reference driver, but since I have lot of jobs on my plate, I might not do this immediately.

Please, all, stay tuned! :-)

Zoran

ZoranStojsavljevic commented 5 years ago

OK, Derek,

I got the quick solution idea... Since I used can.ko for complicated testing, I decided to glance through the code. Since I am using a true Linus vanilla kernel.org for such kinds of source code checking, I went there, and looked into the following: /home/vuser/projects/kernel-git/linux/net/can

And opened af-can.c to understand its _init and _exit functions. After reading it, I got much better idea, and found/spotted bug from the very first look into your _exit function:

/** @brief The LKM cleanup function
 *  Similar to the initialization function, it is static. The __exit macro notifies that if this
 *  code is used for a built-in driver (not a LKM) that this function is not required.
 */
static void __exit ebbchar_exit(void){
   device_destroy(ebbcharClass, MKDEV(majorNumber, 0));     // remove the device
   // class_unregister(ebbcharClass);                       // unregister the device class <<=== BUG ===
   class_destroy(ebbcharClass);                             // remove the device class
   unregister_chrdev(majorNumber, DEVICE_NAME);             // unregister the major number
   printk(KERN_INFO "EBBChar: Goodbye from the LKM!\n");
}

I have tested your driver several times on both: Linux fedora29-ssd 5.0.5-200.fc29.x86_64 #1 SMP Wed Mar 27 20:58:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux Linux unassigned-hostname 4.19.0-4-amd64 #1 SMP Debian 4.19.28-2 (2019-03-15) x86_64 GNU/Linux

Using my loop.sh script, everything works seamlessly.

Now, if you allow me few comments to be introduced here! [1] WARNING: please, do note that this driver may have other bugs which are yet to be found! [2] This bug was (probably) introduced from the very beginning of this project! [3] Derek, you should create a new file in the root of this repo, which will contain fixed bugs (and also you need to change release version of the driver, after fixing this bug! Do note that 301 people cloned the initial release of your driver with this bug included!

Cheers, Zoran Stojsavljevic

zamnut commented 4 years ago

Hi Derek

I tested the ebbchar driver example with your test program and would consistently get a segfault. I traced the problem to this bit of code:

static ssize_t dev_write(struct file filep, const char buffer, size_t len, loff_t *offset){ sprintf(message, "%s(%zu letters)", buffer, len); // appending received string with its length size_of_message = strlen(message); // store the length of the stored message printk(KERN_INFO "EBBChar: Received %zu characters from the user\n", len); return len; }

specifically the sprint line.

I rewrote the function to be:

static ssize_t dev_write(struct file filep, const char __user buffer, size_t len, loff_t *offset){ //sprintf(message, "%s(%zu letters)", buffer, len); // appending received string with its length unsigned long resp; resp = copy_from_user(message, buffer, len); sprintf(message+len, "(%zu letters)",len); size_of_message = strlen(message); // store the length of the stored message printk(KERN_INFO "EBBChar: Received %zu characters from the user\n", len); return len; } The kernel module works now. I am new to c and kernel programming, so I am not totally clear why the sprintf function did not work in the original. Hope this helps.

Tom

zamnut commented 4 years ago

I see that I should have made the resp variable static. Ah well. Like I said I am new to kernel development.

TheUnlimited64 commented 1 year ago

Hi Derek

I tested the ebbchar driver example with your test program and would consistently get a segfault. I traced the problem to this bit of code:

static ssize_t dev_write(struct file filep, const char buffer, size_t len, loff_t *offset){ sprintf(message, "%s(%zu letters)", buffer, len); // appending received string with its length size_of_message = strlen(message); // store the length of the stored message printk(KERN_INFO "EBBChar: Received %zu characters from the user\n", len); return len; }

specifically the sprint line.

I rewrote the function to be:

static ssize_t dev_write(struct file filep, const char __user buffer, size_t len, loff_t *offset){ //sprintf(message, "%s(%zu letters)", buffer, len); // appending received string with its length unsigned long resp; resp = copy_from_user(message, buffer, len); sprintf(message+len, "(%zu letters)",len); size_of_message = strlen(message); // store the length of the stored message printk(KERN_INFO "EBBChar: Received %zu characters from the user\n", len); return len; } The kernel module works now. I am new to c and kernel programming, so I am not totally clear why the sprintf function did not work in the original. Hope this helps.

Tom

Its because if you for example send "a", then len will be 1. So buffer[0] is allowed, but buffer[1] is not allowed to be used, but buffer[1] would contain the \0, that sprintf relys on, so it gets an segfault due to accessing memory its not allowed to