report two Ace bugs - GitHub Issues

Hitatm commented 5 years ago

I am interested in CrashMonkey, but I found some obvious bugs. Maybe you can fix them.

1. At copy_diff.sh:27:

     if [ -e build/diff* ]

When two or more files match diff*, the glob expands to several arguments, the test fails, and the shell script exits incorrectly.
Suggested fix: if [ -n "$(ls build/ | grep diff)" ]

2. In ace.py, the type check on the variable file_names is incomplete:

    if isinstance(file_names, basestring):
        ...
    else:
        file_name = file_names[0]   # when file_names is None, this fails, so nested seq-1 cannot be generated

I tried "python ace.py -l 1 -n True -d false", and ACE always reports a NoneType error.
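The two checks described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names `has_diff_files` and `normalize_file_names` are hypothetical and are not taken from copy_diff.sh or ace.py, the `build` directory and `diff*` pattern come from the report, and the sketch uses Python 3, where `str` replaces the `basestring` seen in the snippet. What ACE actually does with a string value is not shown in this thread, so the sketch only normalizes the input.

```python
import glob
import os


def has_diff_files(build_dir="build", pattern="diff*"):
    """Return True if at least one path in build_dir matches pattern.

    Hypothetical helper: a glob returns a list, so two or more
    matches are handled the same way as a single match.
    """
    return len(glob.glob(os.path.join(build_dir, pattern))) > 0


def normalize_file_names(file_names):
    """Return file_names as a list, accepting None, a string, or a sequence.

    Hypothetical helper, not ACE's real logic: it only shows a type
    check that also covers the None case from the report.
    """
    if file_names is None:
        return []                      # nothing supplied: empty list, no indexing
    if isinstance(file_names, str):    # Python 3 counterpart of basestring
        return [file_names]
    return list(file_names)
```

A caller can then test `if names:` before reading `names[0]`, which avoids indexing into None.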

Hitatm commented 5 years ago

To reproduce the first bug: run your seq2 test cases for a few seconds, interrupt them, and then rerun. Some test cases will report "Could not run test".

ashmrtn commented 5 years ago

From the description of bug 1, it sounds like an artifact of how CrashMonkey exits when interrupted. My guess is that CrashMonkey hasn't cleaned up some of the resources it uses, so the second time you try to run the test it fails because of the leftover resources. I know CrashMonkey does not exit cleanly when interrupted. The resources that cause it to fail are the cowBrd and DiskWrapper kernel modules still inserted in the kernel and a socket with CrashMonkey in the name in /tmp. To confirm this is the issue, after interrupting the test harness, look in /tmp for a socket from CrashMonkey and use lsmod to see whether the cowBrd or DiskWrapper kernel modules are still inserted.
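The check described above could be scripted roughly as below. This is a sketch under assumptions, not part of CrashMonkey: the module names `cow_brd` and `disk_wrapper` are taken from the dmesg output later in this thread, the socket is matched by a case-insensitive "crashmonkey" substring in its file name, and the function names are hypothetical. It reads /proc/modules, which is the same information lsmod prints.

```python
import os
import stat

# Assumption: the modules are listed under the names seen in dmesg in this thread.
CRASHMONKEY_MODULES = ("cow_brd", "disk_wrapper")


def leftover_modules(proc_modules_text, names=CRASHMONKEY_MODULES):
    """Return the names from `names` that appear in /proc/modules-style text."""
    loaded = {
        line.split()[0]                # first column is the module name
        for line in proc_modules_text.splitlines()
        if line.strip()
    }
    return [name for name in names if name in loaded]


def leftover_sockets(tmp_dir="/tmp", needle="crashmonkey"):
    """Return socket files in tmp_dir whose name contains needle (case-insensitive)."""
    found = []
    for name in os.listdir(tmp_dir):
        if needle not in name.lower():
            continue
        path = os.path.join(tmp_dir, name)
        try:
            if stat.S_ISSOCK(os.lstat(path).st_mode):
                found.append(path)
        except OSError:
            pass                       # file vanished or is unreadable: skip it
    return sorted(found)


def report():
    """Print what is still left behind on this machine (Linux only)."""
    with open("/proc/modules") as f:
        print("modules still loaded:", leftover_modules(f.read()))
    print("sockets left in /tmp:", leftover_sockets())
```

Calling `report()` right after interrupting the harness shows whether either kind of leftover is present; removing them (for example with rmmod) is left to the user.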

As for bug 2, I'm not really familiar with the inner workings of ACE, so I'm not sure what the issue is off the top of my head.

Hitatm commented 5 years ago

Thanks @ashmrtn for your detailed response. For bug 2, I found that I can use the script under crashmonkey/ace/specific_generator_scripts.
In addition, I noticed that you have done a lot of work on cowBrd and DiskWrapper. I have read your paper (B3) in detail and tried to apply your driver to my Android phone, but my phone's Linux kernel (Google's AOSP; the current version is 4.9 and the next is 4.14) does not match your driver. I'm not familiar with the block layer. I tried, but couldn't make it work :(. Could you please adapt your driver to 4.9 and 4.14? If you could help, I would really appreciate it :)

ashmrtn commented 5 years ago

I can see if I have some extra time in the next few weeks to try porting CrashMonkey to 4.9. My guess would be it's some mismatch in the names of variables or the build process, but I don't know if it'll be a quick fix or not since I've never tried running it on an Android device.

So that I have a little more background information, how are you building CrashMonkey for Android? The Makefile currently in the repo uses Linux kernel headers to compile the kernel modules in the project, so I would assume something similar for Android would be required, though I've never actually tried it.

Could you also give me some more information about exactly what is not working so that I might have a better idea of what the problem may be (e.g. build errors, some error while executing, etc.)? Output from these errors would also be helpful, and maybe a quick link/guide/overview of how you are building stuff, if you don't mind, since I've never worked with Android kernels before :)

Hitatm commented 5 years ago

Sorry @ashmrtn, I did not make it clear. I'm trying to test my filesystem under mainline Linux on a PC (QEMU). Since the Android kernel is 4.9, I plan to move my filesystem to Linux 4.9 and run some crash-consistency tests. My kernel version is 4.9.0-rc8+, and there are no build errors. I can also insmod cowBrd and DiskWrapper correctly.

Running the following commands produces no errors:
sudo insmod cow_brd.ko;
sudo insmod disk_wrapper.ko target_device_path=/dev/cow_ram_snapshot1_0   flags_device_path=/dev/vdb
dmesg | tail

[  580.753693] cow_brd: module unloaded
[  591.843479] cow_brd: module loaded with 1 disks and 1 snapshots
[  883.808620] hwm: Hello World from module
[  883.809523] hwm: Wrapping device /dev/cow_ram_snapshot1_0 with flags device /dev/vdb
[  883.811676] hwm: working with queue with:
[  883.811676]  flags 0xf02a00
[  883.814521] hwm: initialized

Then I rmmod the two drivers and run:
sudo python xfsMonkey.py -f /dev/vdb -d /dev/cow_ram0 -t f2fs -e 102400 -u build/tests/seq1/
It gets stuck immediately. Here are the test log and the trace messages.

Recorded workload:
bio # time               sector             size
0     2679.261549        0                  0
    flags 0x8000000000000000: checkpoint
1     2679.299274        0x1000             0x3000
    flags 0x3070            : sync, meta, prio, flush, read ahead,
2     2679.318784        0x1                0
    flags 0x8000000000000000: checkpoint
========== PHASE 3: Running tests based on recorded data ==========
Writing data out to each Checkpoint and checking with fsck

I guess the filesystem was destroyed when the workload was replayed, but I have no idea how to fix it.

trace

cow_brd: module verification failed: signature and/or required key missing - tainting kernel
cow_brd: module loaded with 1 disks and 20 snapshots
c_harness (1976): drop_caches: 3
hwm: Hello World from module
hwm: Wrapping device /dev/cow_ram_snapshot1_0 with flags device /dev/vdb
hwm: working with queue with:
 flags 0xf02a00
hwm: initialized
hwm: clearing data logs
hwm: turning on data logging
hwm: bio rw of size 12288 headed for 0x200000 (sector 0x1000) has flags:
hwm: making checkpoint in log
hwm: turning off data logging
hwm: no log entry here
hwm: Cleaning up bye!
------------[ cut here ]------------
WARNING: CPU: 6 PID: 1976 at fs/f2fs/node.c:1159 __get_node_page.part.33+0x220/0x420
Modules linked in: cow_brd(OE) [last unloaded: disk_wrapper]
CPU: 6 PID: 1976 Comm: c_harness Tainted: G           OE   4.9.0-rc8+ #8
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
 ffffacda446cfc28 ffffffffbb41ebaf 0000000000000000 0000000000000000
 ffffacda446cfc68 ffffffffbb0a13db 00000487afdb0800 ffffd84448d2bcc0
 ffff9e11afdb0800 0000000000000003 ffff9e11b05a5d48 0000000000000000
Call Trace:
 [<ffffffffbb41ebaf>] dump_stack+0x63/0x84
 [<ffffffffbb0a13db>] __warn+0xcb/0xf0
 [<ffffffffbb0a150d>] warn_slowpath_null+0x1d/0x20
 [<ffffffffbb36d710>] __get_node_page.part.33+0x220/0x420
 [<ffffffffbb36d922>] get_node_page.part.34+0x12/0x20
 [<ffffffffbb36f20b>] get_node_page+0x1b/0x20
 [<ffffffffbb350f9e>] f2fs_iget+0x11e/0x800
 [<ffffffffbb35c03a>] f2fs_fill_super+0x82a/0x1060
 [<ffffffffbb24e4dc>] mount_bdev+0x17c/0x1b0
 [<ffffffffbb35b810>] ? f2fs_commit_super+0xf0/0xf0
 [<ffffffffbb358485>] f2fs_mount+0x15/0x20
 [<ffffffffbb24efb2>] mount_fs+0x32/0x160
 [<ffffffffbb26ca3d>] vfs_kern_mount.part.18+0x5d/0xf0
 [<ffffffffbb26ef60>] do_mount+0x520/0xc30
 [<ffffffffbb246c9a>] ? __check_object_size+0xba/0x1f0
 [<ffffffffbb2203cf>] ? kmem_cache_alloc_trace+0x15f/0x1c0
 [<ffffffffbb26e82c>] ? copy_mount_options+0x2c/0x220
 [<ffffffffbb26f988>] SyS_mount+0x98/0xe0
 [<ffffffffbb85bd77>] entry_SYSCALL_64_fastpath+0x1a/0xa9
---[ end trace 32fbab667e323f53 ]---
F2FS-fs (cow_ram_snapshot1_0): Failed to read root inode
------------[ cut here ]------------

ashmrtn commented 5 years ago

Thanks for the extra information!

That is actually an interesting dmesg log that you got back. Based on the dmesg output, I think you are correct that the file system somehow gets destroyed.

I'll see if I have some time this weekend to look at this more closely.

Hitatm commented 5 years ago

OK, thanks.

jayashreemohan29 commented 5 years ago

@Hitatm Thanks for identifying the nit in the shell script. It has been fixed in #127.

Hitatm commented 5 years ago

> @Hitatm Thanks for identifying the nit in the shell script. It has been fixed in #127.

OK, nice, thanks @jayashreemohan29.

vijay03 commented 5 years ago

@Hitatm, @ashmrtn ported CrashMonkey to 4.9 here: #128. Can you test your code with that?

Hitatm commented 5 years ago

> @Hitatm, @ashmrtn ported CrashMonkey to 4.9 here: #128. Can you test your code with that?

Sorry, I've just noticed your message. Thanks @vijay03 and @ashmrtn, I will run the test and give you feedback.

Hitatm commented 5 years ago

I tested seq-1 on my VM, and it works well. Thanks for your excellent work, @vijay03 @ashmrtn. (By the way, at xfsMonkey.py:94, adding the -p option to the mkdir command would handle reruns after an abnormal exit better. @jayashreemohan29)
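If that directory were created from Python rather than through a shell command, the idempotent behavior of mkdir -p is available through `os.makedirs` with `exist_ok=True` (Python 3.2 or later). This is a minimal sketch, not the actual code at xfsMonkey.py:94; the helper name `ensure_dir` is hypothetical.

```python
import os


def ensure_dir(path):
    """Create path and any missing parents; do nothing if it already exists."""
    os.makedirs(path, exist_ok=True)
    return path
```

Calling `ensure_dir` a second time on the same path succeeds silently, which is the property wanted when a previous run exited abnormally and left the directory behind.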

Hitatm commented 5 years ago

I mean the driver modules produce no kernel WARNING, but 3 bugs are reported. It might be my own fault.

Running test #281 : j-lang34 : Failed test
DIFF: Content Mismatch /foo
        Expected File Size = 32768
        Actual File Size = 36864
Running test #283 : j-lang26 : Failed test
DIFF: Content Mismatch /foo
        Expected File Size = 32768
        Actual File Size = 65536
Running test #312 : j-lang46 : Failed test
DIFF: Content Mismatch /foo
        Expected File Size = 32768
        Actual File Size = 36864

jayashreemohan29 commented 5 years ago

@Hitatm These are bugs. Can you tell me which file system and kernel version you are running these tests on? If you take a look at the workload files corresponding to the bugs you have reported, they are three different scenarios of the fallocate syscall with the FALLOC_FL_KEEP_SIZE flag set. In all three cases, although the file size should have been unmodified after fallocate, CrashMonkey is reporting that the file size has changed. CrashMonkey found a similar bug on F2FS (https://github.com/utsaslab/crashmonkey/blob/master/newBugs.md#bug-9--fallocate-beyond-the-eof-recovers-to-an-incorrect-file-size), which corresponds to j-lang26.cpp. But the other two bugs you are reporting seem new.

Hitatm commented 5 years ago

I moved the f2fs from Android P to Linux 4.9.0-rc8 and tested it on a QEMU VM. After I applied the f2fs bug 9 patch, cases j-lang26 and j-lang34 pass. Thanks, @jayashreemohan29. As for the j-lang46 case, it may be something I introduced myself; I am working on it.

utsaslab / crashmonkey

report two Ace bugs #126
