paulfloyd / freebsd_valgrind

Git repo used to Upstream the FreeBSD Port of Valgrind
GNU General Public License v2.0

FreeBSD 14.0 descr_belowsp is failing #207

Closed paulfloyd closed 7 months ago

paulfloyd commented 7 months ago

The difference is that the address description no longer says that the address is in the guard page.

describing 0x1ffafff880 1500 bytes below a local var
==93086== Thread 2:
==93086== Invalid read of size 1
==93086==    at 0x202498: bad_things_till_guard_page (descr_belowsp.c:74)
==93086==    by 0x20200F: child_fn_0 (descr_belowsp.c:113)
==93086==    by 0x486CA74: ??? (in /lib/libthr.so.3)
==93086==  Address 0x1ffafffccf is on thread 2's stack
==93086==  401 bytes below stack pointer

But the memory map is

93086       0x1ffadff000       0x1ffae00000 ---    0    0   0   0 ----- gd 
93086       0x1ffae00000       0x1ffafe0000 ---    0    0   0   0 ----- gd 
93086       0x1ffafe0000       0x1ffb000000 rw-    2    2   1   0 ---D- sw 

0x1ffafffccf looks OK to me - it is above 0x1ffafe0000, so it is inside the rw- mapping and not in a guard page.

The offset into that mapping is 0x1fccf or 130255.
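
The lookup can be reproduced with a few lines of Python. This is only a sketch: the three ranges are copied from the procstat output above, and the helper name `find_mapping` is made up for illustration.

```python
# Ranges copied from the procstat -v output above: (start, end, protection, type).
MAPPINGS = [
    (0x1ffadff000, 0x1ffae00000, "---", "gd"),
    (0x1ffae00000, 0x1ffafe0000, "---", "gd"),
    (0x1ffafe0000, 0x1ffb000000, "rw-", "sw"),
]

def find_mapping(addr, mappings):
    """Return the (start, end, prot, kind) entry containing addr, or None."""
    for start, end, prot, kind in mappings:
        if start <= addr < end:  # the end address of each entry is exclusive
            return (start, end, prot, kind)
    return None

fault_addr = 0x1ffafffccf
start, end, prot, kind = find_mapping(fault_addr, MAPPINGS)
offset = fault_addr - start
print(f"{fault_addr:#x} is in {start:#x}-{end:#x} ({prot}, {kind})")
print(f"offset into mapping: {offset:#x} = {offset}")
```

The address falls in the third entry, which is the only readable and writable one of the three.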

Also strange that there are two guard pages.

Standalone 14.0

94233     0x7fffdfdfd000     0x7fffdfdfe000 ---    0    0   0   0 ----- gd 
94233     0x7fffdfdfe000     0x7fffdfdff000 ---    0    0   0   0 ----- gd 
94233     0x7fffdfdff000     0x7fffdfe1e000 rw-   31   31   1   0 ---D- sw 
94233     0x7fffdfe1e000     0x7fffdfe3e000 rw-   32   32   1   0 ---D- sw 
94233     0x7fffdfe3e000     0x7fffdfe5e000 rw-   32   32   1   0 ---D- sw 
94233     0x7fffdfe5e000     0x7fffdfe7e000 rw-   32   32   1   0 ---D- sw 
94233     0x7fffdfe7e000     0x7fffdfe9e000 rw-   32   32   1   0 ---D- sw 
94233     0x7fffdfe9e000     0x7fffdfebe000 rw-   32   32   1   0 ---D- sw 
94233     0x7fffdfebe000     0x7fffdfede000 rw-   32   32   1   0 ---D- sw 
94233     0x7fffdfede000     0x7fffdfefe000 rw-   32   32   1   0 ---D- sw 
94233     0x7fffdfefe000     0x7fffdff1e000 rw-   32   32   1   0 ---D- sw 
94233     0x7fffdff1e000     0x7fffdff3e000 rw-   32   32   1   0 ---D- sw 
94233     0x7fffdff3e000     0x7fffdff5e000 rw-   32   32   1   0 ---D- sw 
94233     0x7fffdff5e000     0x7fffdff7e000 rw-   32   32   1   0 ---D- sw 
94233     0x7fffdff7e000     0x7fffdff9e000 rw-   32   32   1   0 ---D- sw 
94233     0x7fffdff9e000     0x7fffdffbe000 rw-   32   32   1   0 ---D- sw 
94233     0x7fffdffbe000     0x7fffdffde000 rw-   32   32   1   0 ---D- sw 
94233     0x7fffdffde000     0x7fffdfffe000 rw-   32   32   1   0 ---D- sw

Looking at 13.2

==12039== Thread 2:
==12039== Invalid read of size 1
==12039==    at 0x2026B8: bad_things_till_guard_page (descr_belowsp.c:74)
==12039==    by 0x20222F: child_fn_0 (descr_belowsp.c:113)
==12039==    by 0x486DA79: ??? (in /lib/libthr.so.3)
==12039==  Address 0x1ffafffccf is on thread 2's stack
==12039==  401 bytes below stack pointer

So that's the same faulting address 0x1ffafffccf.

And the address map

12039       0x1ffadff000       0x1ffafe0000 ---    0    0   0   0 ----- gd 
12039       0x1ffafe0000       0x1ffb000000 rw-    2    2   1   0 ---D- df 

And I don't understand that either. The rw- mapping is only 128k.

13.2 standalone

12063     0x7fffdfdfd000     0x7fffdfdfe000 ---    0    0   0   0 ----- gd 
12063     0x7fffdfdfe000     0x7fffdfe1e000 rw-   32   32   1   0 ---D- df 

$1 = 0x7fffdfdfdfbf <error: Cannot access memory at address 0x7fffdfdfdfbf>

That stack is only 128K ???

Ah no, it's been growing.

12063     0x7fffdfdfd000     0x7fffdfdfe000 ---    0    0   0   0 ----- gd 
12063     0x7fffdfdfe000     0x7fffdfe1e000 rw-   32   32   1   0 ---D- df 
12063     0x7fffdfe1e000     0x7fffdfe3e000 rw-   32   32   1   0 ---D- df 
12063     0x7fffdfe3e000     0x7fffdfe5e000 rw-   32   32   1   0 ---D- df 
12063     0x7fffdfe5e000     0x7fffdfe7e000 rw-   32   32   1   0 ---D- df 
12063     0x7fffdfe7e000     0x7fffdfe9e000 rw-   32   32   1   0 ---D- df 
12063     0x7fffdfe9e000     0x7fffdfebe000 rw-   32   32   1   0 ---D- df 
12063     0x7fffdfebe000     0x7fffdfede000 rw-   32   32   1   0 ---D- df 
12063     0x7fffdfede000     0x7fffdfefe000 rw-   32   32   1   0 ---D- df 
12063     0x7fffdfefe000     0x7fffdff1e000 rw-   32   32   1   0 ---D- df 
12063     0x7fffdff1e000     0x7fffdff3e000 rw-   32   32   1   0 ---D- df 
12063     0x7fffdff3e000     0x7fffdff5e000 rw-   32   32   1   0 ---D- df 
12063     0x7fffdff5e000     0x7fffdff7e000 rw-   32   32   1   0 ---D- df 
12063     0x7fffdff7e000     0x7fffdff9e000 rw-   32   32   1   0 ---D- df 
12063     0x7fffdff9e000     0x7fffdffbe000 rw-   32   32   1   0 ---D- df 
12063     0x7fffdffbe000     0x7fffdffde000 rw-   32   32   1   0 ---D- df 
12063     0x7fffdffde000     0x7fffdfffe000 rw-   32   32   1   0 ---D- df

From 0x7fffdfdfe000 to 0x7fffdfffe000 is 16 entries of 128K each. That's 2M, which I think is the default thread stack size.

OK that all makes sense.
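
For the record, the arithmetic can be checked like this. It is a small sketch using only the addresses from the second 13.2 standalone listing above; the variable names are mine.

```python
KIB = 1024

# Lowest and highest addresses of the rw- stack area in the listing above.
stack_low = 0x7fffdfdfe000
stack_high = 0x7fffdfffe000

# Size of one listed entry (the first rw- line).
entry_size = 0x7fffdfe1e000 - 0x7fffdfdfe000

total = stack_high - stack_low
print(entry_size // KIB, "KiB per entry")
print(total // entry_size, "entries")
print(total // (KIB * KIB), "MiB in total")
```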

paulfloyd commented 7 months ago

--sanity-level=3 also fails with both 13.2 and 14

../../vg-in-place --sanity-level=3 ./descr_belowsp
==12173== Memcheck, a memory error detector
==12173== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==12173== Using Valgrind-3.23.0.GIT and LibVEX; rerun with -h for copyright info
==12173== Command: ./descr_belowsp
==12173== 
describing 0x1ffc000350 1500 bytes below a local var
==12173==  Address 0x1ffc000350 is on thread 1's stack
==12173==  1392 bytes below stack pointer
--12173:0: aspacem segment mismatch: V's seg 1st, kernel's 2nd:
--12173:0: aspacem  58: anon 1ffadff000-1ffaffffff 2101248 rw--- SmFixed d=0x000 i=0       o=0       (-1,-1) (none)
--12173:0: aspacem ...: .... 1ffadff000-1ffafdffff 1970176 ---.. ....... d=0x000 i=0       o=0       (.) m=. (none)
--12173:0: aspacem sync check at m_aspacemgr/aspacemgr-linux.c:2282 (Bool vgPlain_am_notify_client_mmap(Addr, SizeT, UInt, UInt, Int, Off64T)): FAILED
--12173:0: aspacem 
--12173:0: aspacem Valgrind: FATAL: aspacem assertion failed:
--12173:0: aspacem   VG_(am_do_sync_check) (__PRETTY_FUNCTION__,__FILE__,__LINE__)
--12173:0: aspacem   at m_aspacemgr/aspacemgr-linux.c:2282 (Bool vgPlain_am_notify_client_mmap(Addr, SizeT, UInt, UInt, Int, Off64T))
--12173:0: aspacem Exiting now.

paulfloyd commented 7 months ago

Running a single-threaded app seems OK. The test also works OK on Linux.

In truss

mmap(0x0,4096,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34966913024 (0x82430c000)
munmap(0x820f61000,12288)                        = 0 (0x0)
mmap(0x0,147456,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34981048320 (0x825087000)
mmap(0x8009cd000,2101248,PROT_READ|PROT_WRITE,MAP_STACK,-1,0x0) = 34993930240 (0x825cd0000)
mprotect(0x825cd0000,4096,PROT_NONE)             = 0 (0x0)
new thread
mmap(0x0,135168,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 35003871232 (0x82664b000)
SIGNAL 11 (SIGSEGV) code=SEGV_ACCERR trapno=12 addr=0x825cd0fbf

In Valgrind

SYSCALL[1387,1]( 74) sys_mprotect ( 0x203000, 4096, 1 )[sync] --> Success(0x0) 

segfault

SYSCALL[1387,1](477) sys_mmap ( 0x0, 4096, 3, 4098, 4294967295, 0x0) --> [pre-success] Success(0x4841000) 
SYSCALL[1387,1]( 73) sys_munmap ( 0x483f000, 8192 )[sync] --> Success(0x0) 
SYSCALL[1387,1](477) sys_mmap ( 0x0, 147456, 3, 4098, 4294967295, 0x0) --> [pre-success] Success(0x588a000) 
SYSCALL[1387,1](477) sys_mmap ( 0x1ffadff000, 2101248, 3, 1024, 4294967295, 0x0)
--1387:0: aspacem segment mismatch: V's seg 1st, kernel's 2nd:
--1387:0: aspacem  60: anon 1ffadff000-1ffaffffff 2101248 rw--- SmFixed d=0x000 i=0       o=0       (-1,-1) (none)
--1387:0: aspacem ...: .... 1ffadff000-1ffafdffff 1970176 ---.. ....... d=0x000 i=0       o=0       (.) m=. (none)
--1387:0: aspacem sync check at m_aspacemgr/aspacemgr-linux.c:2282 (Bool vgPlain_am_notify_client_mmap(Addr, SizeT, UInt, UInt, Int, Off64T)): FAILED
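
The numeric arguments in the Valgrind syscall trace can be matched against the symbolic ones that truss prints. The sketch below decodes them; the constant values are a subset of FreeBSD `<sys/mman.h>` written down from memory, so treat them as assumptions and check the header. The helper names `decode` and `as_signed32` are made up for illustration.

```python
# Subset of FreeBSD <sys/mman.h> values (assumed; verify against the header).
PROT_BITS = {0x1: "PROT_READ", 0x2: "PROT_WRITE", 0x4: "PROT_EXEC"}
MAP_BITS = {
    0x0001: "MAP_SHARED",
    0x0002: "MAP_PRIVATE",
    0x0010: "MAP_FIXED",
    0x0400: "MAP_STACK",
    0x1000: "MAP_ANON",
}

def decode(value, names):
    """Render a bitmask as NAME|NAME, keeping any unknown bits as hex."""
    parts = [name for bit, name in sorted(names.items()) if value & bit]
    known = sum(bit for bit in names if value & bit)
    if value & ~known:
        parts.append(hex(value & ~known))
    return "|".join(parts) if parts else "0"

def as_signed32(value):
    """Interpret an unsigned 32-bit value as signed (used for the fd argument)."""
    return value - (1 << 32) if value >= (1 << 31) else value

# sys_mmap ( 0x0, 4096, 3, 4098, 4294967295, 0x0 )
print(decode(3, PROT_BITS), decode(4098, MAP_BITS), as_signed32(4294967295))
# sys_mmap ( 0x1ffadff000, 2101248, 3, 1024, 4294967295, 0x0 )
print(decode(3, PROT_BITS), decode(1024, MAP_BITS), as_signed32(4294967295))
```

With these values the last mmap, the one that triggers the sync check failure, is the MAP_STACK request, which lines up with the MAP_STACK call in the truss output.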

paulfloyd commented 7 months ago

I've started making a few changes. The problem is that a FreeBSD stack mmap results in two mappings.
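
The sizes in the aspacem mismatch message fit that picture. Here is a rough check using only numbers from the logs above. Reading the remainder as the initial read/write part of the stack is my interpretation, based on the 128K rw- entries in the procstat output.

```python
requested_len = 2101248  # length passed to the MAP_STACK mmap

# What Valgrind recorded: one rw segment covering the whole request.
# aspacem prints inclusive end addresses.
v_start, v_end = 0x1ffadff000, 0x1ffaffffff
# What the kernel reported at the same start address: a no-access segment.
k_start, k_end = 0x1ffadff000, 0x1ffafdffff

v_len = v_end - v_start + 1
k_len = k_end - k_start + 1
remainder = v_len - k_len

print("Valgrind segment matches request:", v_len == requested_len)
print("kernel no-access segment:", k_len, "bytes")
print("remainder:", remainder, "bytes =", remainder // 1024, "KiB")
print("remainder starts at", hex(k_end + 1))
```

The remainder starts at 0x1ffafe0000, which is where the rw- entry begins in the procstat output for the Valgrind run.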

@todo update this with some experimental results.

ASLR complicates matters a bit.

paulfloyd commented 7 months ago

See also

https://bugs.kde.org/show_bug.cgi?id=481203

paulfloyd commented 7 months ago

And I think that this is a kernel bug.

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=277382

paulfloyd commented 7 months ago

I added a comment to the code for a very hacky possible workaround. And just setting sysctl security.bsd.stack_guard_page=0 seems to get back the old behaviour.

paulfloyd commented 7 months ago

To get these traces I modified m_addrinfo.c after about line 240 to put in something like VG_(system)("procstat -v [pid]") (with a VG_(sprintf) and a VG_(getpid) to build up the actual string).

Then I ran truss -o truss.log ./vg-in-place ./descr_belowsp

to give

==6808== Invalid read of size 1
==6808==    at 0x202558: bad_things_till_guard_page (descr_belowsp.c:74)
==6808==    by 0x20209F: child_fn_0 (descr_belowsp.c:120)
==6808==    by 0x486DA74: ??? (in /lib/libthr.so.3)
==6808==  Address 0x1ffeffeccf is on thread 2's stack
==6808==  401 bytes below stack pointer

and

 6808       0x100512f000       0x1005131000 ---    0    0   1   0 CN--- sw 
 6808       0x1ffedfe000       0x1ffedff000 ---    0    0   0   0 ----- gd 
 6808       0x1ffedff000       0x1ffee00000 ---    0    0   0   0 ----- gd 
 6808       0x1ffee00000       0x1ffee1f000 rw-   31   31   1   0 CN-D- sw 
 6808       0x1ffee1f000       0x1ffee3f000 rw-   32   32   1   0 CN-D- sw 

and truss

mmap(0x1ffedfe000,2101248,PROT_READ|PROT_WRITE,MAP_FIXED|MAP_STACK,-1,0x0) = 137420070912 (0x1ffedfe000)
mmap(0x1004d69000,16384,PROT_READ|PROT_WRITE|PROT_EXEC,MAP_PRIVATE|MAP_FIXED|MAP_ANON,-1,0x0) = 68800647168 (0x1004d69000)
mmap(0x1004d6d000,16384,PROT_READ|PROT_WRITE|PROT_EXEC,MAP_PRIVATE|MAP_FIXED|MAP_ANON,-1,0x0) = 68800663552 (0x1004d6d000)
mprotect(0x1ffedfe000,4096,PROT_NONE)            = 0 (0x0)