paulfloyd / freebsd_valgrind

Git repo used to Upstream the FreeBSD Port of Valgrind
GNU General Public License v2.0

Out by one error in line info for helgrind/tests/locked_vs_unlocked3 #134

Open paulfloyd opened 4 years ago

paulfloyd commented 4 years ago

(similar issue for locked_vs_unlocked2)

Overview: main() creates two threads, fn1 and fn2. fn1 double locks a recursive mutex and writes to a global. fn2 doesn't lock and writes to the same variable.

I added an extra expected file for this, as the difference is minor.

The original expected has

at 0x........: child_fn1 (locked_vs_unlocked3.c:28)

with the source code

27:   r= pthread_mutex_lock(&mx);  assert(!r);
28:   x = 1;

But clang gives

at 0x........: child_fn1 (locked_vs_unlocked3.c:27)

The testcase doesn't use any Valgrind options (other than -q) so this isn't a local/global/inline issue.

@todo run clang and gcc outside of regtest to get full messages. I expect the addresses will be different so that won't help.

On Linux:

==5436== Possible data race during write of size 4 at 0x4040C8 by thread #2
==5436== Locks held: none
==5436==    at 0x4012B3: child_fn2 (locked_vs_unlocked3.c:38)
==5436==    by 0x40394FD: mythread_wrapper (hg_intercepts.c:406)
==5436==    by 0x4A2BDD4: start_thread (in /usr/lib64/libpthread-2.17.so)
==5436==    by 0x4D3DEAC: clone (in /usr/lib64/libc-2.17.so)
==5436==
==5436== This conflicts with a previous write of size 4 by thread #3
==5436== Locks held: 1, at address 0x4040A0
==5436==    at 0x40122A: child_fn1 (locked_vs_unlocked3.c:28)
==5436==    by 0x40394FD: mythread_wrapper (hg_intercepts.c:406)
==5436==    by 0x4A2BDD4: start_thread (in /usr/lib64/libpthread-2.17.so)
==5436==    by 0x4D3DEAC: clone (in /usr/lib64/libc-2.17.so)
==5436==  Address 0x4040c8 is 0 bytes inside data symbol "x"

On FreeBSD:

==2608== Possible data race during write of size 4 at 0x204360 by thread #2
==2608== Locks held: none
==2608==    at 0x201D5A: child_fn2 (locked_vs_unlocked3.c:38)
==2608==    by 0x485A8D6: mythread_wrapper (hg_intercepts.c:406)
==2608==    by 0x486D82A: ??? (in /lib/libthr.so.3)
==2608== 
==2608== This conflicts with a previous write of size 4 by thread #3
==2608== Locks held: 1, at address 0x204368
==2608==    at 0x201C51: child_fn1 (locked_vs_unlocked3.c:27)
==2608==    by 0x485A8D6: mythread_wrapper (hg_intercepts.c:406)
==2608==    by 0x486D82A: ??? (in /lib/libthr.so.3)
==2608==  Address 0x204360 is 0 bytes inside data symbol "x"

paulfloyd commented 2 years ago

There are lots of debug flags to turn on, starting with the one in UInt VG_(get_StackTrace_wrk) ( ThreadId tid_if_known,

Well, actually one per function above per platform. Well, it's something. There is also --trace-symtab -d -d -d

In m_debuginfo/debuginfo.c there are debug flags in Bool VG_(lookup_symbol_SLOW)(DiEpoch ep, consider_vars_in_frame, analyse_deps and di_get_stack_blocks_at_ip

Does tsan report the same error, and with the right line number?

m_debuginfo/storage.c

#if defined(VGO_freebsd)
   if (sym->size == 0) sym->size = 1;
#endif

Fishy! This forces zero-sized symbols not to be ignored. Does it change anything?

DiLoc::lineno seems to be what I'm looking for

In void ML_(addLineInfo) ( struct _DebugInfo* di, there is static const Bool debug, which I set to True. That seems to be for code, not data.

On Linux with GCC I get

src ix 1 /home/pafloyd/scratch/valgrind/helgrind/tests locked_vs_unlocked3.c line 28 0x40122a-0x401234

FreeBSD clang:

src ix 14 /usr/home/paulf/scratch/valgrind/helgrind/tests locked_vs_unlocked3.c line 27 0x201c35-0x201c44
src ix 14 /usr/home/paulf/scratch/valgrind/helgrind/tests locked_vs_unlocked3.c line 27 0x201c44-0x201c47
src ix 14 /usr/home/paulf/scratch/valgrind/helgrind/tests locked_vs_unlocked3.c line 27 0x201c47-0x201c7e
src ix 14 /usr/home/paulf/scratch/valgrind/helgrind/tests locked_vs_unlocked3.c line 28 0x201c7e-0x201c89

Dump of assembler code for function child_fn1:
   0x0000000000201be0 <+0>:     push   %rbp
   0x0000000000201be1 <+1>:     mov    %rsp,%rbp
   0x0000000000201be4 <+4>:     sub    $0x10,%rsp
   0x0000000000201be8 <+8>:     mov    %rdi,-0x8(%rbp)
   0x0000000000201bec <+12>:    movabs $0x204368,%rdi
   0x0000000000201bf6 <+22>:    call   0x2020a0 <pthread_mutex_lock@plt>
   0x0000000000201bfb <+27>:    mov    %eax,-0xc(%rbp)
   0x0000000000201bfe <+30>:    cmpl   $0x0,-0xc(%rbp)
   0x0000000000201c02 <+34>:    jne    0x201c0d <child_fn1+45>
   0x0000000000201c08 <+40>:    jmp    0x201c35 <child_fn1+85>
   0x0000000000201c0d <+45>:    movabs $0x200818,%rdi
   0x0000000000201c17 <+55>:    movabs $0x20082a,%rsi
   0x0000000000201c21 <+65>:    mov    $0x1a,%edx
   0x0000000000201c26 <+70>:    movabs $0x200822,%rcx
   0x0000000000201c30 <+80>:    call   0x2020b0 <__assert@plt>
start line 27
=> 0x0000000000201c35 <+85>:    movabs $0x204368,%rdi
   0x0000000000201c3f <+95>:    call   0x2020a0 <pthread_mutex_lock@plt>
   0x0000000000201c44 <+100>:   mov    %eax,-0xc(%rbp)
   0x0000000000201c47 <+103>:   cmpl   $0x0,-0xc(%rbp)
   0x0000000000201c4b <+107>:   jne    0x201c56 <child_fn1+118>
Message says error is here
   0x0000000000201c51 <+113>:   jmp    0x201c7e <child_fn1+158>
   0x0000000000201c56 <+118>:   movabs $0x200818,%rdi
   0x0000000000201c60 <+128>:   movabs $0x20082a,%rsi
   0x0000000000201c6a <+138>:   mov    $0x1b,%edx
   0x0000000000201c6f <+143>:   movabs $0x200822,%rcx
   0x0000000000201c79 <+153>:   call   0x2020b0 <__assert@plt>
start line 28, error is here
   0x0000000000201c7e <+158>:   movl   $0x1,0x204360
   0x0000000000201c89 <+169>:   movabs $0x204368,%rdi
   0x0000000000201c93 <+179>:   call   0x2020c0 <pthread_mutex_unlock@plt>
   0x0000000000201c98 <+184>:   mov    %eax,-0xc(%rbp)
   0x0000000000201c9b <+187>:   cmpl   $0x0,-0xc(%rbp)
   0x0000000000201c9f <+191>:   jne    0x201caa <child_fn1+202>
   0x0000000000201ca5 <+197>:   jmp    0x201cd2 <child_fn1+242>
   0x0000000000201caa <+202>:   movabs $0x200818,%rdi
   0x0000000000201cb4 <+212>:   movabs $0x20082a,%rsi
   0x0000000000201cbe <+222>:   mov    $0x1d,%edx
   0x0000000000201cc3 <+227>:   movabs $0x200822,%rcx

Interesting. When the mutex lock succeeds, there is a conditional jump to the assert code, which isn't taken, and then an unconditional jump to the assignment.

So in terms of execution, the reported address (the jmp at 0x201c51) is the instruction executed just before the assignment at 0x201c7e.

GCC just emits a single conditional jump.

The ball is back in the Helgrind camp. Why is it saying error at 0x0000000000201c51 rather than 0x0000000000201c7e?
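
To make the line-number side of this concrete, here is a small sketch that looks up an address in the FreeBSD clang ranges from the line info dump above. It is not Valgrind's lookup code: struct line_range and lookup_line are hypothetical names, and treating each range as half-open (start included, end excluded) is an assumption. It only shows that, given those ranges, 0x201c51 falls under line 27 and 0x201c7e under line 28, so the line number is consistent with the address being reported.

```c
#include <stddef.h>

/* One row of a line table: addresses in [start, end) map to `line`. */
struct line_range {
    unsigned long start;
    unsigned long end;
    int line;
};

/* Ranges copied from the FreeBSD clang line info dump in this issue. */
static const struct line_range ranges[] = {
    { 0x201c35UL, 0x201c44UL, 27 },
    { 0x201c44UL, 0x201c47UL, 27 },
    { 0x201c47UL, 0x201c7eUL, 27 },
    { 0x201c7eUL, 0x201c89UL, 28 },
};

/* Hypothetical helper: returns the source line covering addr,
   or -1 if none of the ranges above contains it. */
int lookup_line(unsigned long addr)
{
    size_t i;
    for (i = 0; i < sizeof ranges / sizeof ranges[0]; i++) {
        if (addr >= ranges[i].start && addr < ranges[i].end)
            return ranges[i].line;
    }
    return -1;
}
```

With this table, lookup_line(0x201c51) gives 27 and lookup_line(0x201c7e) gives 28, which matches the two annotations in the disassembly.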

paulfloyd commented 2 years ago

Does splitting

r= pthread_mutex_lock(&mx); assert(!r);

onto separate lines change anything?

No.