utsaslab / crashmonkey

CrashMonkey: tools for testing file-system reliability (OSDI 18)
Apache License 2.0

insmod ERROR: “disk_wrapper.ko: Cannot allocate memory” #109

Closed junghan0611 closed 6 years ago

junghan0611 commented 6 years ago

I'm trying to test CrashMonkey, but I get an error when it tries to insert a kernel module. I also changed the kernel version on some other PCs, but I get the same result. Could you please look into this problem?


root@junghan-nuc:~/crashmonkey/build# ./c_harness -f /dev/vda1 -d /dev/cow_ram0 -t ext2 tests/rename_root_to_sub.so -v

running 0x7ffc10bb66f8
========== PHASE 0: Setting up CrashMonkey basics ==========
Inserting RAM disk module
Loading test case
Loading permuter
Updating dirty_expire_time_centisecs to 3000

========== PHASE 1: Creating base disk image ==========
Formatting test drive
mke2fs 1.42.13 (17-May-2015)
Discarding device blocks: done
Creating filesystem with 10240 1k blocks and 2560 inodes
Filesystem UUID: 0637732c-1f79-4ed0-84ca-125bca2fb70a
Superblock backups stored on blocks:
    8193

Allocating group tables: done
Writing inode tables: done
Writing superblocks and filesystem accounting information: done

Mounting test file system for pre-test setup
Running pre-test setup
Unmounting test file system after pre-test setup
Making new snapshot
cloning device /dev/cow_ram0

========== PHASE 2: Recording user workload ==========
Clearing caches
Inserting wrapper module into kernel
insmod: ERROR: could not insert module ../build/disk_wrapper.ko: Cannot allocate memory
Error inserting kernel wrapper module
rmmod: ERROR: Module cow_brd is in use
Unable to remove cow_brd device
root@junghan-nuc:~/crashmonkey/build#

ashmrtn commented 6 years ago

Hi junghan0611, is there any extra information besides this present in dmesg? The debug prefix that the disk_wrapper uses in dmesg is 'hwm', so doing dmesg | grep hwm will hopefully give more hints as to the issue.

This could also be a problem where the system cannot get a flag device (the errors that you can return in the kernel are limited, so they may not match the real cause exactly). Can you confirm that /dev/vda1 exists if you do ls /dev? If there is a /dev/vda, you may want to just use that, as I usually pass an entire block device as the flag device. No data will be written to this device; it is only used as a source from which to copy some queue configuration information, so you don't have to worry about losing data on it.
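The existence check suggested above can be scripted before invoking the harness. This is only a minimal sketch: the helper names is_block_device and pick_flag_device are hypothetical and not part of CrashMonkey, and the device names in the usage note are just common candidates.

```shell
#!/bin/sh
# Hypothetical helper (not part of CrashMonkey): succeeds only if the
# given path exists and is a block special file.
is_block_device() {
    [ -b "$1" ]
}

# Hypothetical helper: print the first argument that is a block device.
# Returns non-zero if none of the candidates qualifies.
pick_flag_device() {
    for dev in "$@"; do
        if is_block_device "$dev"; then
            printf '%s\n' "$dev"
            return 0
        fi
    done
    return 1
}
```

For example, pick_flag_device /dev/vda /dev/sda prints whichever of the two exists as a block device, and the result could then be passed to c_harness with -f.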

junghan0611 commented 6 years ago

Here are the debug messages.


1.. run crashmonkey
root@junghan-nuc:~/crashmonkey_git/build# ./c_harness -f /dev/vda -d /dev/cow_ram0 -t ext2 tests/rename_root_to_sub.so -v
running 0x7ffd6139fa78
========== PHASE 0: Setting up CrashMonkey basics ==========
Inserting RAM disk module
Loading test case
Loading permuter
Updating dirty_expire_time_centisecs to 3000

========== PHASE 1: Creating base disk image ==========
Formatting test drive
mke2fs 1.42.13 (17-May-2015)
Discarding device blocks: done                            
Creating filesystem with 10240 1k blocks and 2560 inodes
Filesystem UUID: 10a1c4d7-7a80-4c60-9d6f-10f96cdad313
Superblock backups stored on blocks: 
    8193

Allocating group tables: done                            
Writing inode tables: done                            
Writing superblocks and filesystem accounting information: done

Mounting test file system for pre-test setup
Running pre-test setup
Unmounting test file system after pre-test setup
Making new snapshot
cloning device /dev/cow_ram0

========== PHASE 2: Recording user workload ==========
Clearing caches
Inserting wrapper module into kernel
insmod: ERROR: could not insert module ../build/disk_wrapper.ko: Cannot allocate memory
Error inserting kernel wrapper module
rmmod: ERROR: Module cow_brd is in use
Unable to remove cow_brd device

2.. dmesg
root@junghan-nuc:~/crashmonkey_git/build# tail /var/log/kern.log
Apr 10 22:34:52 junghan-nuc NetworkManager[860]: <info>  [1523367292.8068] WWAN hardware radio set enabled
Apr 10 22:35:35 junghan-nuc kernel: [   61.395767] cow_brd: loading out-of-tree module taints kernel.
Apr 10 22:35:35 junghan-nuc kernel: [   61.395791] cow_brd: module verification failed: signature and/or required key missing - tainting kernel
Apr 10 22:35:35 junghan-nuc kernel: [   61.396219] cow_brd: module loaded with 1 disks and 1 snapshots
Apr 10 22:35:35 junghan-nuc kernel: [   61.406761] EXT4-fs (cow_ram0): mounting ext2 file system using the ext4 subsystem
Apr 10 22:35:35 junghan-nuc kernel: [   61.406892] EXT4-fs (cow_ram0): mounted filesystem without journal. Opts: 
Apr 10 22:35:35 junghan-nuc kernel: [   61.477895] c_harness (2915): drop_caches: 3
Apr 10 22:35:35 junghan-nuc kernel: [   61.489192] hwm: Hello World from module
Apr 10 22:35:35 junghan-nuc kernel: [   61.489193] hwm: Wrapping device /dev/cow_ram_snapshot1_0 with flags device /dev/vda
Apr 10 22:35:35 junghan-nuc kernel: [   61.489200] hwm: unable to grab device to clone flags
root@junghan-nuc:~/crashmonkey_git/build# 

3.. check /dev/ ? 
root@junghan-nuc:~/crashmonkey_git/build# ls /dev/vda
ls: cannot access '/dev/vda': No such file or directory
root@junghan-nuc:~/crashmonkey_git/build# 

4.. try to rmmod cow_brd
root@junghan-nuc:~/crashmonkey_git/build# lsmod | grep cow_brd
cow_brd                16384  1
root@junghan-nuc:~/crashmonkey_git/build# 

root@junghan-nuc:~/crashmonkey_git/build# rmmod cow_brd
rmmod: ERROR: Module cow_brd is in use
root@junghan-nuc:~/crashmonkey_git/build# 

5.. retry after first run  
root@junghan-nuc:~/crashmonkey_git/build# ./c_harness -f /dev/vda -d /dev/cow_ram0 -t ext2 tests/rename_root_to_sub.so -v
running 0x7ffc7bd33598
========== PHASE 0: Setting up CrashMonkey basics ==========
Error starting socket to listen on 98

root@junghan-nuc:~/crashmonkey_git/build# ./c_harness -f /dev/vda -d /dev/cow_ram0 -t ext2 tests/rename_root_to_sub.so -v
running 0x7fff35df1e78
========== PHASE 0: Setting up CrashMonkey basics ==========
Inserting RAM disk module
insmod: ERROR: could not insert module ../build/cow_brd.ko: File exists
Error inserting RAM disk module
root@junghan-nuc:~/crashmonkey_git/build# 

ashmrtn commented 6 years ago

It looks like there's no block device at /dev/vda that CrashMonkey can get the queue flags from. You need to pass a valid block device to CrashMonkey so that it can open that block device in the kernel and copy the queue flags. The easiest thing to do is to pass it the boot device for the machine. This is usually either /dev/vda or /dev/sda, depending on whether or not it is a virtual machine with VirtIO disk drivers. To check which block devices you actually have on your machine, you can run ls /dev and look for entries like vd* or sd*. If you want to pass in the root device (this only works if you are not running from a live disk or thumb drive, as then the root device would be a ramdisk), you can do cat /etc/mtab and look for an entry like

/dev/<device name>  /  <fs type>...
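As a rough sketch of that lookup, the function below prints the first /dev/... entry mounted at / in an mtab-style file. The name root_device is hypothetical, and the sketch assumes the usual whitespace-separated mtab fields (device, mount point, file system type, and so on). Note that the result may be a partition rather than a whole disk.

```shell
#!/bin/sh
# Hypothetical helper: print the device mounted at "/" according to an
# mtab-style file (defaults to /etc/mtab).
root_device() {
    mtab="${1:-/etc/mtab}"
    # Field 1 is the device, field 2 the mount point. Entries whose
    # source is not under /dev (e.g. "rootfs") are skipped.
    awk '$2 == "/" && $1 ~ /^\/dev\// { print $1; exit }' "$mtab"
}
```

Calling root_device with no argument reads /etc/mtab; if it prints nothing, no /dev-backed root entry was found, which would be consistent with the live disk case mentioned above.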

Unfortunately, there still appears to be a small bug in CrashMonkey: if it fails to get the flag device, you will not be able to remove the kernel module. In this case, you should just restart the machine, as that will result in the kernel module being removed.

As for the Error starting socket to listen on 98 message, that appears only because CrashMonkey didn't exit via a code path that does all the usual system cleanup, so there is still a socket file at /tmp/crash_monkey_harness. You can either sudo rm this file, or just run CrashMonkey a second time so that it fails with the Error starting socket to listen on 98 message and removes the socket itself.
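The manual cleanup can be wrapped in a small function. This is a sketch only: the name cleanup_harness_socket is hypothetical, the default path is the one quoted above, and removing a file under /tmp that another user created may still require sudo.

```shell
#!/bin/sh
# Hypothetical helper: remove a leftover harness socket if one exists.
# The default path is the socket file mentioned in this thread.
cleanup_harness_socket() {
    sock="${1:-/tmp/crash_monkey_harness}"
    # -e matches any file type, including sockets; -L also catches a
    # dangling symlink at that path.
    if [ -e "$sock" ] || [ -L "$sock" ]; then
        rm -f -- "$sock" && echo "removed $sock"
    else
        echo "no stale socket at $sock"
    fi
}
```

Running cleanup_harness_socket before c_harness should avoid the extra failed run that would otherwise be needed to clear the socket.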