sysprog21 / simplefs

A simple native file system for Linux kernel
Other
362 stars 91 forks source link

Fix sb_read and umount cache not be freed #53

Closed RoyWFHuang closed 1 month ago

RoyWFHuang commented 3 months ago

Using kmemleak to detect the leak, I found the the command "ls" and "mv" will make the cache leak

here are the leak informations:

0 [<ffffffffa5d2f98e>] __alloc_pages+0x24e
1 [<ffffffffa5d2f98e>] __alloc_pages+0x24e
2 [<ffffffffa5d51d3e>] alloc_pages+0x9e
3 [<ffffffffa5cbd4de>] __page_cache_alloc+0x7e
4 [<ffffffffa5cc1342>] pagecache_get_page+0x152
5 [<ffffffffa5dedec8>] grow_dev_page+0x48
6 [<ffffffffa5deebac>] __getblk_gfp+0xbc
7 [<ffffffffa5deed01>] __bread_gfp+0x11
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ <-- seems __bread() leak.
8 [<ffffffffc098427b>] ftrace_trampoline+0x427b
9 [<ffffffffc09845a7>] ftrace_trampoline+0x45a7
10 [<ffffffffa5dafaed>] vfs_mkdir+0xad
11 [<ffffffffa5db3368>] do_mkdirat+0x128
12 [<ffffffffa5db351c>] __x64_sys_mkdir+0x4c
13 [<ffffffffa5a04d64>] x64_sys_call+0x94
14 [<ffffffffa67c2ce6>] do_syscall_64+0x56
15 [<ffffffffa68000df>] entry_SYSCALL_64_after_hwframe+0x67
jserv commented 3 months ago

I defer to @HotMercury for confirmation.

HotMercury commented 3 months ago

Hi @RoyWFHuang There is something wrong on my environment to use kmemleak, I will check later.

RoyWFHuang commented 3 months ago

Hi @RoyWFHuang There is something wrong on my environment to use kmemleak, I will check later.

@HotMercury error: attempt to use poisoned "strlcpy" Is that the problem you encounter?

If that, you can go to bpftool/libbpf/ and update it to latest version

HotMercury commented 3 months ago

Thank you for your suggestions. I believe your logic in checking is correct, but I am using kmemleak, and it does not show any leak errors. Please give me more about testing environment like kernel version and below information.

  BPFTOOL  bpftool/bootstrap/bpftool
...                        libbfd: [ on  ]
...               clang-bpf-co-re: [ on  ]
...                          llvm: [ on  ]
...                        libcap: [ on  ]

In my test, there is no error

$ sudo ./kmodleak simplefs -v
module 'simplefs' loaded
module 'simplefs' unloaded

3 stacks with outstanding allocations:
8192 bytes in 2 allocations from stack
    addr = 0xa33 size = 4096
    addr = 0xc2b size = 4096
    0 [<ffffffff9284e7e4>] __alloc_pages+0x264
    1 [<ffffffff9284e7e4>] __alloc_pages+0x264
    2 [<ffffffff928859b1>] alloc_pages_mpol+0x91
    3 [<ffffffff92885f14>] folio_alloc+0x64
    4 [<ffffffff927b5734>] filemap_alloc_folio+0xf4
    5 [<ffffffff927bac5b>] __filemap_get_folio+0x14b
    6 [<ffffffff929411f2>] __getblk_slow+0xb2
    7 [<ffffffff92941591>] __bread_gfp+0x81
    8 [<ffffffffc0710b59>] init_iso9660_fs+0x6b49
    9 [<ffffffff928ff674>] iterate_dir+0x114
    10 [<ffffffff928fff94>] __x64_sys_getdents64+0x84
    11 [<ffffffff92405b73>] x64_sys_call+0x1b43
    12 [<ffffffff9361b78f>] do_syscall_64+0x7f
    13 [<ffffffff9380012b>] entry_SYSCALL_64_after_hwframe+0x73
208 bytes in 2 allocations from stack
    addr = 0xffff898c35e6c8f0 size = 104
    addr = 0xffff898c1d3add00 size = 104
    0 [<ffffffff9285c8a3>] kmem_cache_alloc+0x253
    1 [<ffffffff9285c8a3>] kmem_cache_alloc+0x253
    2 [<ffffffff9293d6be>] alloc_buffer_head+0x1e
    3 [<ffffffff9293ebb5>] folio_alloc_buffers+0x95
    4 [<ffffffff92941238>] __getblk_slow+0xf8
    5 [<ffffffff92941591>] __bread_gfp+0x81
    6 [<ffffffffc0710b59>] init_iso9660_fs+0x6b49
    7 [<ffffffff928ff674>] iterate_dir+0x114
    8 [<ffffffff928fff94>] __x64_sys_getdents64+0x84
    9 [<ffffffff92405b73>] x64_sys_call+0x1b43
    10 [<ffffffff9361b78f>] do_syscall_64+0x7f
    11 [<ffffffff9380012b>] entry_SYSCALL_64_after_hwframe+0x73
192 bytes in 1 allocations from stack
    addr = 0xffff898c015ef180 size = 192
    0 [<ffffffff9285c55b>] kmem_cache_alloc_lru+0x25b
    1 [<ffffffff9285c55b>] kmem_cache_alloc_lru+0x25b
    2 [<ffffffff929046e4>] __d_alloc+0x34
    3 [<ffffffff9290493a>] d_alloc+0x1a
    4 [<ffffffff92907c7a>] d_alloc_parallel+0x5a
    5 [<ffffffff928f2b0c>] __lookup_slow+0x5c
    6 [<ffffffff928f2e0c>] lookup_one_unlocked+0x9c
    7 [<ffffffff928f2ebd>] lookup_positive_unlocked+0x1d
    8 [<ffffffff92aa673d>] debugfs_lookup+0x5d
    9 [<ffffffff92aa679f>] debugfs_lookup_and_remove+0xf
    10 [<ffffffff9285fd69>] debugfs_slab_release+0x19
    11 [<ffffffff927fee5c>] kmem_cache_destroy+0x11c
    12 [<ffffffffc070c625>] init_iso9660_fs+0x2615
    13 [<ffffffffc0710fe9>] init_iso9660_fs+0x6fd9
    14 [<ffffffff925f6943>] __do_sys_delete_module.isra.0+0x1a3
    15 [<ffffffff925f6af2>] __x64_sys_delete_module+0x12
    16 [<ffffffff92406320>] x64_sys_call+0x22f0
    17 [<ffffffff9361b78f>] do_syscall_64+0x7f
    18 [<ffffffff9380012b>] entry_SYSCALL_64_after_hwframe+0x73
done
RoyWFHuang commented 3 months ago

You can test the example in kmodleak, you will see that

1 stacks with outstanding allocations:
128 bytes in 1 allocations from stack
addr = 0xffff8c4204ae7300 size = 128
0 [<ffffffffa5d61b11>] kmem_cache_alloc_trace+0x261
1 [<ffffffffa5d61b11>] kmem_cache_alloc_trace+0x261
2 [<ffffffffc09ae02b>] 
bpf_prog_cc9504c91d44b81c_kmodleak__mm_page_free+0x127f
3 [<ffffffffa5a03a06>] do_one_initcall+0x46
4 [<ffffffffa5b9ec12>] do_init_module+0x52
5 [<ffffffffa5ba062b>] load_module+0xb2b
6 [<ffffffffa5a955b0>] __kretprobe_trampoline+0x0
7 [<ffffffffa5ba0a08>] __x64_sys_finit_module+0x18
8 [<ffffffffa5a06793>] x64_sys_call+0x1ac3
9 [<ffffffffa67c2ce6>] do_syscall_64+0x56
10 [<ffffffffa68000df>] entry_SYSCALL_64_after_hwframe+0x67
done

The checker will not show "error" or "leak" (like valgrind), it only shows the allocating path (that's the eBpf actually doing) So if you see that something show out, that means some leaking in the program

And as your comments, there are 3 leaking in it

7 [<ffffffff92941591>] __bread_gfp+0x81
5 [<ffffffff92941591>] __bread_gfp+0x81
11 [<ffffffff927fee5c>] kmem_cache_destroy+0x11c

Another "11 [] kmem_cache_destroy+0x11c", The problem bothers me, Could you give me you mail that we can discuss this?

jserv commented 3 months ago

@HotMercury, can you confirm the effectiveness of this proposed change?

HotMercury commented 2 months ago

Hi @RoyWFHuang You can use rcu_barrier() or a kernel thread to solve the problem where simplefs_destroy_inode_cache does not release in time before unmounting simplefs.

jserv commented 2 months ago

You can use rcu_barrier() or a kernel thread to solve the problem where simplefs_destroy_inode_cache does not release in time before unmounting simplefs.

You should paste the proposed changes for dedicated discussions.

jserv commented 1 month ago

Make the git commit messages informative. In particular, mention the way how you figure out the potential leaks.

@RoyWFHuang, can you improve the git commit messages?

RoyWFHuang commented 1 month ago

Make the git commit messages informative. In particular, mention the way how you figure out the potential leaks.

@RoyWFHuang, can you improve the git commit messages?

Yes, just wait a few days, I need some time to conclude it.

jserv commented 1 month ago

Thank @RoyWFHuang for contributing!