Closed james-cook closed 2 years ago
Using information from:
Go ahead and open a separate issue for the assertion failure. It would be helpful if you could run rmlint in gdb (gdb --args rmlint ...) and print a backtrace. It seems like it should actually be impossible without some kind of corruption, so building with ASAN would also be useful (CFLAGS='-fsanitize=address' LDFLAGS='-fsanitize=address' scons DEBUG=1).
This is the run with the recompiled rmlint, using the same command on the same directories and files:
(gdb) run
Starting program: /usr/bin/rmlint --progress -S dma -s -1TB --keep-all-tagged DIR1 // DIR2
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
[New Thread 0xb638c380 (LWP 5790)]
[New Thread 0xb59ff380 (LWP 5791)]
▕░░░░░░░░░░░░░░░░░░░░░░░░░▏ Traversing (25834 usable files / 4013 + 2 ignored files / folders)
[Thread 0xb638c380 (LWP 5790) exited]
**
ERROR:lib/pathtricia.c:80:rm_node_check_inode: assertion failed: (node->inode != RM_NO_INODE)
Thread 1 "rmlint" received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0xb6a4df14 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0xb6a39230 in __GI_abort () at abort.c:79
#2 0xb6cbc8a8 in g_assertion_message () at /lib/arm-linux-gnueabihf/libglib-2.0.so.0
#3 0xb6cbc948 in g_assertion_message_expr () at /lib/arm-linux-gnueabihf/libglib-2.0.so.0
#4 0x00020378 in rm_node_check_inode ()
#5 0x00020550 in rm_node_get_inode ()
#6 0x0002082c in rm_file_parent_inode ()
#7 0x00020850 in rm_file_cmp_samefile ()
#8 0x00020a4c in rm_file_cmp_samefile_full ()
#9 0xb6c8e01c in () at /lib/arm-linux-gnueabihf/libglib-2.0.so.0
(gdb) bt full
#0 0xb6a4df14 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
set = {__val = {0 <repeats 27 times>, 100, 1, 0, 57, 7}}
pid = <optimized out>
tid = <optimized out>
#1 0xb6a39230 in __GI_abort () at abort.c:79
save_stage = 1
act =
{__sigaction_handler = {sa_handler = 0x10, sa_sigaction = 0x10}, sa_mask = {__val = {0, 0, 492432, 557472, 4246540800, 117, 492432, 3204439296, 557472, 117, 0, 509800, 1, 509800, 3066799548, 3067431920, 3070224744, 3067434900, 0, 3070224744, 93, 1962934272, 3066674540, 509800, 0, 0, 4246540800, 509800, 509800, 3067435364, 94, 3070224744}}, sa_flags = 344292, sa_restorer = 0xbeffdd64}
sigs = {__val = {32, 0 <repeats 31 times>}}
#2 0xb6cbc8a8 in g_assertion_message () at /lib/arm-linux-gnueabihf/libglib-2.0.so.0
#3 0xb6cbc948 in g_assertion_message_expr () at /lib/arm-linux-gnueabihf/libglib-2.0.so.0
#4 0x00020378 in rm_node_check_inode ()
#5 0x00020550 in rm_node_get_inode ()
#6 0x0002082c in rm_file_parent_inode ()
#7 0x00020850 in rm_file_cmp_samefile ()
#8 0x00020a4c in rm_file_cmp_samefile_full ()
#9 0xb6c8e01c in () at /lib/arm-linux-gnueabihf/libglib-2.0.so.0
(gdb)
ASAN:
Compiling with the flags shown:
sudo CFLAGS='-fsanitize=address' LDFLAGS='-fsanitize=address' scons DEBUG=1 --prefix=/usr install
leads to an error when I run the program in gdb:
(gdb) run
Starting program: /usr/bin/rmlint --progress -S dma -s -1TB --keep-all-tagged DIR1 // DIR2
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
==7199==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.
[Inferior 1 (process 7199) exited with code 01]
(gdb)
Head of the compile "log":
Scons: Reading SConscript files ...
>> Appending custom build flags : -fsanitize=address
>> Appending custom link flags : -fsanitize=address
Checking whether the C compiler works... yes
Checking for git revision... (cached) yes
Checking for pkg-config... (cached) yes
ASAN generates its own reports so gdb isn't necessary. Seems like part of rmlint may have been built without ASAN. Try a clean rebuild with those flags:
$ scons -c
$ export CFLAGS='-fsanitize=address' LDFLAGS='-fsanitize=address'
$ scons config
$ scons DEBUG=1
$ sudo -E scons DEBUG=1 --prefix=/usr install
If you get the same error you can work around the issue with LD_PRELOAD=/usr/lib/libasan.so rmlint ... . If ASAN reports nothing besides leaks you could also try valgrind on a clean build without ASAN (valgrind rmlint ...).
Investigating...
Just FYI, this is from the build output (and not only when building with the sanitize flags):
scons DEBUG=1
s/timestamp.c
Compiling ==> lib/formats/uniques.c
Compiling ==> lib/fts/fts.c
Building manpage from rst...
Using sphinx-build binary: /usr/bin/sphinx-build
Linking Static Library ==> librmlint.a
Ranlib Library ==> librmlint.a
Linking Program ==> rmlint
/usr/bin/ld: librmlint.a(reflink.o): in function `rm_dedupe_main':
reflink.c:(.text+0x249c): warning: lchmod is not implemented and will always fail
Cannot import `sphinx_bootstrap_theme`; falling back to `nature`.
^ This is no error, will cause only slightly different html output.
Zipping manpage...
scons: done building targets.
Not sure whether the lchmod warning (rm_dedupe_main in reflink.c) is important.
The Raspberry Pi does not have libasan at the location you mentioned.
I found it at /usr/lib/gcc/arm-linux-gnueabihf/8/libasan.so. It is the only libasan.so under /usr, so I assume it is the correct libasan for gcc 8.
Is it OK to just preload the dynamic library and run as shown below, or must I install libasan5 explicitly?
LD_PRELOAD=/usr/lib/gcc/arm-linux-gnueabihf/8/libasan.so rmlint --progress -S dma -s -1TB --keep-all-tagged DIR1 // DIR2
=================================================================
==10047==ERROR: AddressSanitizer: heap-buffer-overflow on address 0xb2a03b80 at pc 0x00082e00 bp 0xb05fc20c sp 0xb05fc204
READ of size 8 at 0xb2a03b80 thread T2 (pool)
#0 0x82dff in rm_file_new (/usr/bin/rmlint+0x82dff)
#1 0x4d017 in rm_traverse_file (/usr/bin/rmlint+0x4d017)
#2 0x4f503 in rm_traverse_directory (/usr/bin/rmlint+0x4f503)
#3 0x36ad7 in rm_mds_factory (/usr/bin/rmlint+0x36ad7)
0xb2a03b80 is located 8 bytes to the right of 88-byte region [0xb2a03b20,0xb2a03b78)
allocated by thread T2 (pool) here:
#0 0xb6a8bbbb in __interceptor_malloc (/usr/lib/gcc/arm-linux-gnueabihf/8/libasan.so+0xe1bbb)
#1 0x79c57 in fts_alloc (/usr/bin/rmlint+0x79c57)
Thread T2 (pool) created by T0 here:
#0 0xb69f59c7 in pthread_create (/usr/lib/gcc/arm-linux-gnueabihf/8/libasan.so+0x4b9c7)
#1 0xb66be523 (/lib/arm-linux-gnueabihf/libglib-2.0.so.0+0x9c523)
SUMMARY: AddressSanitizer: heap-buffer-overflow (/usr/bin/rmlint+0x82dff) in rm_file_new
Shadow bytes around the buggy address:
0x36540720: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 fa
0x36540730: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 fa
0x36540740: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 05
0x36540750: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 fa
0x36540760: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 fa
=>0x36540770:[fa]fa fa fa fd fd fd fd fd fd fd fd fd fd fd fa
0x36540780: fa fa fa fa fd fd fd fd fd fd fd fd fd fd fd fa
0x36540790: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 fa
0x365407a0: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 fa
0x365407b0: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 fa
0x365407c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==10047==ABORTING
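For anyone reading the report above, the addresses can be cross-checked by hand. The following is only a minimal sketch in C, assuming the usual 32-bit ASan mapping `shadow = (addr >> 3) + 0x20000000`. The offset is inferred from the numbers in this report, not taken from rmlint or from the ASan sources, and all `demo_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one shadow byte covers 8 application bytes (per the legend),
 * and the shadow offset is 0x20000000 (inferred from this report). */
#define DEMO_SHADOW_SCALE  3u
#define DEMO_SHADOW_OFFSET UINT32_C(0x20000000)

/* Map an application address to the shadow byte that describes it. */
static uint32_t demo_shadow_addr(uint32_t app_addr) {
    return (app_addr >> DEMO_SHADOW_SCALE) + DEMO_SHADOW_OFFSET;
}

/* Size of a half-open region [begin, end). */
static uint32_t demo_region_size(uint32_t begin, uint32_t end) {
    return end - begin;
}

/* Distance of an address past the end of a region. */
static uint32_t demo_bytes_past_end(uint32_t addr, uint32_t end) {
    return addr - end;
}
```

With the values from the report, the region [0xb2a03b20,0xb2a03b78) is 88 bytes long, the faulting address 0xb2a03b80 lies 8 bytes past its end, and its shadow byte is at 0x36540770, which is the row marked with `=>` and the byte shown as `[fa]`.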
This doesn't look like the original error though (?) - I don't see the initial traversal output on screen. And, as you mention, these are only leaks.
I am able to reproduce the heap-buffer-overflow report with a 32-bit x86 build, but not on x86_64, so I suspect an error in the RM_PLATFORM_32-related code. Will debug further, thanks.
Try patching lib/config.h.in like this and rebuilding:
diff --git a/lib/config.h.in b/lib/config.h.in
index e9a5a3c0..30fda4e2 100644
--- a/lib/config.h.in
+++ b/lib/config.h.in
@@ -57,6 +57,7 @@
 #define LLI G_GINT64_FORMAT
+#include <stdint.h> /* for UINTPTR_MAX */
 #define RM_PLATFORM_32 (UINTPTR_MAX == 0xffffffff)
 #define RM_PLATFORM_64 (UINTPTR_MAX == 0xffffffffffffffff)
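A possible explanation for why the missing include matters, assuming RM_PLATFORM_32 and RM_PLATFORM_64 are tested in `#if` directives (this is not confirmed in the thread): inside a preprocessor condition, an identifier that is not defined as a macro is replaced by 0. Without `<stdint.h>`, both tests would then quietly evaluate to false instead of causing a compile error. A small sketch in C, where `DEMO_MISSING_MAX` is a hypothetical stand-in for `UINTPTR_MAX` when the header has not been included, and all `demo_`/`DEMO_` names are made up for illustration:

```c
#include <assert.h>
#include <stdint.h> /* defines UINTPTR_MAX */

/* Hypothetical stand-in: DEMO_MISSING_MAX is never defined, which mimics
 * UINTPTR_MAX in a translation unit that did not include <stdint.h>. */
#define DEMO_BROKEN_32 (DEMO_MISSING_MAX == 0xffffffff)
#define DEMO_BROKEN_64 (DEMO_MISSING_MAX == 0xffffffffffffffff)

/* Same tests, but against the real UINTPTR_MAX. */
#define DEMO_FIXED_32 (UINTPTR_MAX == 0xffffffff)
#define DEMO_FIXED_64 (UINTPTR_MAX == 0xffffffffffffffff)

/* Returns 32 or 64, or 0 when neither test matched. */
static int demo_broken_bits(void) {
#if DEMO_BROKEN_32
    return 32;
#elif DEMO_BROKEN_64
    return 64;
#else
    return 0; /* undefined identifier became 0, so 0 == 0xffffffff is false */
#endif
}

/* Returns 32 or 64, or 0 when neither test matched. */
static int demo_fixed_bits(void) {
#if DEMO_FIXED_32
    return 32;
#elif DEMO_FIXED_64
    return 64;
#else
    return 0;
#endif
}
```

Here `demo_broken_bits()` returns 0 on both 32-bit and 64-bit targets, while `demo_fixed_bits()` reports the pointer width. Building with GCC's `-Wundef` makes the broken variant visible as a warning.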
I can confirm that the patch fixes the assertion failure on my platform. Thanks :)
Closing. Please re-open if needed.
Version: latest develop branch (--version shows 2.10.1)
Platform: Raspberry Pi 4 with 4GB RAM
I went back and recompiled 2.10.1 master to check: it compiles and runs without error with the same command, the same directories, and the same files.
Note: This is a placeholder for the failure and investigation. Hopefully I will have more time next week to recompile and run as advised here: https://github.com/sahib/rmlint/issues/547#issuecomment-1019547768