paulfloyd / freebsd_valgrind

Git repo used to Upstream the FreeBSD Port of Valgrind
GNU General Public License v2.0

Numerous Helgrind failures on FreeBSD 12.2 #145

Closed paulfloyd closed 3 years ago

paulfloyd commented 3 years ago

Taking just one example:

--- pth_barrier1.stderr.exp 2020-07-07 08:34:29.755177000 +0200
+++ pth_barrier1.stderr.out 2020-11-16 07:09:15.628715000 +0100
@@ -14,6 +14,22 @@
    by 0x........: barriers_and_races (pth_barrier.c:92)
    by 0x........: main (pth_barrier.c:122)

+----------------------------------------------------------------
+
+Possible data race during read of size 8 at 0x........ by thread #x
+Locks held: none
+   at 0x........: threadfunc (pth_barrier.c:54)
+   by 0x........: mythread_wrapper (hg_intercepts.c:...)
+   ...
+
+This conflicts with a previous write of size 8 by thread #x
+Locks held: none
+   ...
+   by 0x........: threadfunc (pth_barrier.c:54)
+   by 0x........: mythread_wrapper (hg_intercepts.c:...)
+   ...
+ Address 0x........ is in a rw- mapped file /usr/home/paulf/scratch/valgrind/drd/tests/pth_barrier segment
+
 ---Thread-Announcement------------------------------------------

 Thread #x is the program's root thread

With luck this is either an intercept or a suppression problem. Otherwise it is going to be harder to fix.

paulfloyd commented 3 years ago

The cause doesn't seem obvious. Running a load of traces with

../../vg-in-place --tool=helgrind -v -v -v -v -d -d -d -d --trace-redir=yes ../../drd/tests/pth_barrier 2 1 1

and then comparing 12.2 with 12.1, I see that 12.2 has changed its mmaps a bit (one extra file rw section is mapped). This looks interesting. The diff above contains

Next thing I tried was running the test app under ktrace. There are some differences in the order of the calls to fork, thr_* and _umtx_op.

paulfloyd commented 3 years ago

So, switching to bar_trivial as it is simpler.

readelf -a says that we have

000000204128 000900000007 R_X86_64_JUMP_SLOT 0000000000000000 pthread_barrier_wait + 0

and the first error message refers to that same address:

==46108== Address 0x204128 is in a rw- mapped file /usr/home/paulf/scratch/valgrind/helgrind/tests/bar_trivial segment

Possibly we're not correctly handling 2 RW LOAD segments (Linux only has 1).
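One way to check the "2 RW LOAD segments" hypothesis outside of Valgrind is to read the program headers directly. The following is a minimal sketch, not what Valgrind itself does: it assumes a little-endian ELF64 image (as on amd64), and the helper names `load_segments`, `rw_load_segments` and `segment_containing` are made up for illustration.

```python
import struct

PT_LOAD = 1                        # loadable segment
PF_X, PF_W, PF_R = 0x1, 0x2, 0x4   # execute / write / read permission bits

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # ELF64 file header, little-endian
PHDR = struct.Struct("<IIQQQQQQ")          # ELF64 program header, little-endian


def load_segments(image: bytes):
    """Return (vaddr, memsz, flags) for every PT_LOAD program header."""
    fields = EHDR.unpack_from(image, 0)
    ident, phoff, phentsize, phnum = fields[0], fields[5], fields[9], fields[10]
    # ident[4] == 2 means ELFCLASS64, ident[5] == 1 means little-endian.
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("expected a little-endian ELF64 image")
    segments = []
    for i in range(phnum):
        (p_type, p_flags, _offset, vaddr,
         _paddr, _filesz, memsz, _align) = PHDR.unpack_from(image, phoff + i * phentsize)
        if p_type == PT_LOAD:
            segments.append((vaddr, memsz, p_flags))
    return segments


def rw_load_segments(segments):
    """Keep only the segments whose permissions are exactly rw-."""
    return [s for s in segments if s[2] & (PF_R | PF_W | PF_X) == (PF_R | PF_W)]


def segment_containing(segments, addr):
    """Return the first segment whose address range covers addr, or None."""
    for vaddr, memsz, flags in segments:
        if vaddr <= addr < vaddr + memsz:
            return (vaddr, memsz, flags)
    return None
```

Reading the bar_trivial binary into `image` and calling `rw_load_segments(load_segments(image))` would show how many rw- LOAD segments the executable has, and `segment_containing(..., 0x204128)` would show which of them holds the JUMP_SLOT address from the error message.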

paulfloyd commented 3 years ago

Fixed with this commit:

https://github.com/paulfloyd/freebsd_valgrind/commit/d78c85a5058ffbf080e5099d092e779cb013c9e2