paulfloyd / freebsd_valgrind

Git repo used to Upstream the FreeBSD Port of Valgrind
GNU General Public License v2.0
15 stars 4 forks source link

none/tests/tls hangs #188

Closed paulfloyd closed 2 years ago

paulfloyd commented 2 years ago

FreeBSD 13.1, amd64

Seems intermittent

The testcase has main creating 2x 7 threads, all with the same function tls_ptr, but with data containing a pointer to function (of which there are 7).

The output is something like

paulf@freebsd:~/valgrind/none/tests $ ../../.in_place/none-amd64-freebsd --tool=none ./tls
==2316== Nulgrind, the minimal Valgrind tool
==2316== Copyright (C) 2002-2017, and GNU GPL'd, by Nicholas Nethercote.
==2316== Using Valgrind-3.20.0.GIT and LibVEX; rerun with -h for copyright info
==2316== Command: ./tls
==2316== 
tls_ptr: case "race" has mismatch: *ip=1 here=0
tls_ptr: case "race" has mismatch: *ip=2 here=1
tls_ptr: case "race" has mismatch: *ip=3 here=1
tls_ptr: case "race" has mismatch: *ip=4 here=2
tls_ptr: case "race" has mismatch: *ip=5 here=2
tls_ptr: case "race" has mismatch: *ip=6 here=3
Killed

(requires manual killing, not responding to ctrl-c)

I have no idea if the function/data being tested changes anything (e.g., is it always the global variable test that hangs, or is it randomly any of the seven and only due to the use of signals and threads).

With plenty of tracing

SYSCALL[4081,15](240) ... [async] --> Success(0x0) 
SYSCALL[4081,15](240) sys_nanosleep ( 0x200890, 0x0 ) --> [async] ... 
SYSCALL[4081,2](240) ... [async] --> Success(0x0) 
SYSCALL[4081,2](  4) sys_write ( 1, 0x4ea9000, 49 ) --> [async] ... 
tls_ptr: case "race" has mismatch: *ip=10 here=5
SYSCALL[4081,2](  4) ... [async] --> Success(0x31) 
SYSCALL[4081,2](240) sys_nanosleep ( 0x200890, 0x0 ) --> [async] ... 

ktrace / kdump

  2288 none-amd64-freebsd CALL  write(0x1,0x4ea9000,0x30)
  2288 none-amd64-freebsd GIO   fd 1 wrote 48 bytes
       "tls_ptr: case "race" has mismatch: *ip=4 here=2
       "
  2288 none-amd64-freebsd RET   write 48/0x30
  2288 none-amd64-freebsd CALL  sigprocmask(SIG_SETMASK,0x40396fdb0,0)
  2288 none-amd64-freebsd RET   sigprocmask 0
  2288 none-amd64-freebsd CALL  thr_self(0x40396fca8)
  2288 none-amd64-freebsd RET   thr_self 0
  2288 none-amd64-freebsd CALL  read(0x6f57,0x40396fcfe,0x1)
  2288 none-amd64-freebsd GIO   fd 28503 read 1 byte
       "K"
  2288 none-amd64-freebsd RET   read 1
  2288 none-amd64-freebsd CALL  thr_self(0x40396fcb8)
  2288 none-amd64-freebsd RET   thr_self 0
  2288 none-amd64-freebsd CALL  write(0x6f58,0x40396fd06,0x1)
  2288 none-amd64-freebsd GIO   fd 28504 wrote 1 byte
       "L"
  2288 none-amd64-freebsd RET   write 1
  2288 none-amd64-freebsd CALL  sigprocmask(SIG_SETMASK,0x40396fdf0,0x40396fdb0)
  2288 none-amd64-freebsd RET   sigprocmask 0
  2288 none-amd64-freebsd CALL  nanosleep(0x200890,0)
  2288 none-amd64-freebsd RET   nanosleep 0
  2288 none-amd64-freebsd CALL  sigprocmask(SIG_SETMASK,0x403a73db0,0)
  2288 none-amd64-freebsd RET   sigprocmask 0
  2288 none-amd64-freebsd CALL  thr_self(0x403a73ca8)
  2288 none-amd64-freebsd RET   thr_self 0
  2288 none-amd64-freebsd CALL  read(0x6f57,0x403a73cfe,0x1)
  2288 none-amd64-freebsd RET   nanosleep 0
  2288 none-amd64-freebsd CALL  sigprocmask(SIG_SETMASK,0x40445fdb0,0)
  2288 none-amd64-freebsd RET   sigprocmask 0
  2288 none-amd64-freebsd CALL  thr_self(0x40445fca8)
  2288 none-amd64-freebsd RET   thr_self 0
  2288 none-amd64-freebsd CALL  read(0x6f57,0x40445fcfe,0x1)
  2288 none-amd64-freebsd RET   nanosleep 0
  2288 none-amd64-freebsd CALL  sigprocmask(SIG_SETMASK,0x404563db0,0)
  2288 none-amd64-freebsd RET   sigprocmask 0
  2288 none-amd64-freebsd CALL  thr_self(0x404563ca8)
  2288 none-amd64-freebsd RET   thr_self 0
  2288 none-amd64-freebsd CALL  read(0x6f57,0x404563cfe,0x1)
  2288 none-amd64-freebsd RET   nanosleep 0
  2288 none-amd64-freebsd CALL  sigprocmask(SIG_SETMASK,0x404667db0,0)
  2288 none-amd64-freebsd RET   sigprocmask 0
  2288 none-amd64-freebsd CALL  thr_self(0x404667ca8)
  2288 none-amd64-freebsd RET   thr_self 0
  2288 none-amd64-freebsd CALL  read(0x6f57,0x404667cfe,0x1)
  2288 none-amd64-freebsd RET   nanosleep 0
  2288 none-amd64-freebsd CALL  sigprocmask(SIG_SETMASK,0x40476bdb0,0)
  2288 none-amd64-freebsd RET   sigprocmask 0
  2288 none-amd64-freebsd CALL  thr_self(0x40476bca8)
  2288 none-amd64-freebsd RET   thr_self 0
  2288 none-amd64-freebsd CALL  read(0x6f57,0x40476bcfe,0x1)
  2288 none-amd64-freebsd RET   nanosleep 0
  2288 none-amd64-freebsd CALL  sigprocmask(SIG_SETMASK,0x40486fdb0,0)
  2288 none-amd64-freebsd RET   sigprocmask 0
  2288 none-amd64-freebsd CALL  thr_self(0x40486fca8)
  2288 none-amd64-freebsd RET   thr_self 0
  2288 none-amd64-freebsd CALL  read(0x6f57,0x40486fcfe,0x1)
  2288 none-amd64-freebsd RET   nanosleep 0
  2288 none-amd64-freebsd CALL  sigprocmask(SIG_SETMASK,0x404973db0,0)
  2288 none-amd64-freebsd RET   sigprocmask 0
  2288 none-amd64-freebsd CALL  thr_self(0x404973ca8)
  2288 none-amd64-freebsd RET   thr_self 0
  2288 none-amd64-freebsd CALL  read(0x6f57,0x404973cfe,0x1)
  2288 none-amd64-freebsd RET   nanosleep 0
  2288 none-amd64-freebsd CALL  sigprocmask(SIG_SETMASK,0x403b77db0,0)
  2288 none-amd64-freebsd RET   sigprocmask 0
  2288 none-amd64-freebsd CALL  thr_self(0x403b77ca8)
  2288 none-amd64-freebsd RET   thr_self 0
  2288 none-amd64-freebsd CALL  read(0x6f57,0x403b77cfe,0x1)
  2288 none-amd64-freebsd RET   nanosleep 0
  2288 none-amd64-freebsd CALL  sigprocmask(SIG_SETMASK,0x403c7bdb0,0)
  2288 none-amd64-freebsd RET   sigprocmask 0
  2288 none-amd64-freebsd CALL  thr_self(0x403c7bca8)
  2288 none-amd64-freebsd RET   thr_self 0
  2288 none-amd64-freebsd CALL  read(0x6f57,0x403c7bcfe,0x1)
  2288 none-amd64-freebsd RET   nanosleep 0
  2288 none-amd64-freebsd CALL  sigprocmask(SIG_SETMASK,0x40404fdb0,0)
  2288 none-amd64-freebsd RET   sigprocmask 0
  2288 none-amd64-freebsd CALL  thr_self(0x40404fca8)
  2288 none-amd64-freebsd RET   thr_self 0
  2288 none-amd64-freebsd CALL  read(0x6f57,0x40404fcfe,0x1)
  2288 none-amd64-freebsd RET   nanosleep 0
  2288 none-amd64-freebsd CALL  sigprocmask(SIG_SETMASK,0x404153db0,0)
  2288 none-amd64-freebsd RET   sigprocmask 0
  2288 none-amd64-freebsd CALL  thr_self(0x404153ca8)
  2288 none-amd64-freebsd RET   thr_self 0
  2288 none-amd64-freebsd CALL  read(0x6f57,0x404153cfe,0x1)
  2288 none-amd64-freebsd RET   nanosleep 0
  2288 none-amd64-freebsd CALL  sigprocmask(SIG_SETMASK,0x404257db0,0)
  2288 none-amd64-freebsd RET   sigprocmask 0
  2288 none-amd64-freebsd CALL  thr_self(0x404257ca8)
  2288 none-amd64-freebsd RET   thr_self 0
  2288 none-amd64-freebsd CALL  read(0x6f57,0x404257cfe,0x1)
  2288 none-amd64-freebsd RET   nanosleep 0
  2288 none-amd64-freebsd CALL  sigprocmask(SIG_SETMASK,0x40435bdb0,0)
  2288 none-amd64-freebsd RET   sigprocmask 0
  2288 none-amd64-freebsd CALL  thr_self(0x40435bca8)
  2288 none-amd64-freebsd RET   thr_self 0
  2288 none-amd64-freebsd CALL  read(0x6f57,0x40435bcfe,0x1)
  2288 none-amd64-freebsd GIO   fd 28503 read 1 byte
       "L"
  2288 none-amd64-freebsd RET   read 1
  2288 none-amd64-freebsd RET   nanosleep 0
  2288 none-amd64-freebsd CALL  sigprocmask(SIG_SETMASK,0x40396fdb0,0)
  2288 none-amd64-freebsd RET   sigprocmask 0
  2288 none-amd64-freebsd CALL  thr_self(0x40396fca8)
  2288 none-amd64-freebsd RET   thr_self 0
  2288 none-amd64-freebsd CALL  read(0x6f57,0x40396fcfe,0x1)
  2288 none-amd64-freebsd RET   _umtx_op -1 errno 4 Interrupted system call
  2288 none-amd64-freebsd PSIG  SIGKILL SIG_DFL code=SI_NOINFO
paulfloyd commented 2 years ago

On FreeBSD 13.0 compiling the testcase with clang 13 doesn't reproduce the issue.

And 13.1 with clang 11 does reproduce.

So this is a Valgrind / OS / libc / libthr problem

I'll try cutting down the testcase, first trimming the types of testcase - just T(race) to start with.

paulfloyd commented 2 years ago

I tried running the TC with truss alone / under Valgrind. The truss/Valgrind log I'm leaving for the moment as it is too long and the diffs are illegible. I'll try again another time, filtering out sigprocmask, thr_self, read and write.

The log running alone under truss shows quite a few differences.

13.0 for thread creation does pid: thr_new(0x7fffffffe180,0x68) = 0 (0x0) pid: <new thread 101826> pid: sigfastblock(0x1,0x800a12538) = 0 (0x0) pid: _umtx_op(0x80026fc80,UMTX_OP_RW_RDLOCK,0x0,0x0,0x0) = 0 (0x0) pid: mmap(0x7fffdfbfc000,2101248,PROT_READ|PROT_WRITE,MAP_STACK,-1,0x0) = 140736947273728 (0x7fffdfbfc000) pid: mprotect(0x7fffdfbfc000,4096,PROT_NONE) = 0 (0x0)

13.1 does pid: thr_new(0x7fffffffe8f0,0x68) = 0 (0x0) pid: mmap(0x7fffdf9fb000,2101248,PROT_READ|PROT_WRITE,MAP_STACK,-1,0x0) = 140736945172480 (0x7fffdf9fb000) pid: mprotect(0x7fffdf9fb000,4096,PROT_NONE) = 0 (0x0)

No sigfastblock, no umtx rw rdlock

I don't think that that makes a difference.

On 13.0 I see

pid: _umtx_op(0x8006b0008,UMTX_OP_MUTEX_WAIT,0x0,0x0,0x0) = 0 (0x0) pid: _umtx_op(0x8006b0008,UMTX_OP_MUTEX_WAKE2,0x0,0x0,0x0) = 0 (0x0)

(sometimes but not always together like that). There are 9 WAITs and 9 WAKE2s (not sure idf they are 100% reproducible).

I don't see any pattern or what is using the mutex (just 1 as far as I can tell).

Also I have no clue as to why 13.1 does not use this mutex.

Next I tried disabling buffering of output (see comment in #189 )

SYSCALL[3098,15](240) sys_nanosleep ( 0x200890, 0x0 )--3098:4:  gdbsrv       VG_(gdbserver_report_syscall) before 1 sysno 240 tid 15
 --> [async] ... 
ALL
--3098--   SCHED[13]: releasing lock (VG_(client_syscall)[async]) -> VgTs_WaitSys
--3098--   SCHED[12]:  acquired lock (VG_(client_syscall)[async])
--3098--   SCHED[12]: TRC: SYSCALL
--3098--   SCHED[12]: releasing lock (VG_(client_syscall)[async]) -> VgTs_WaitSys
--3098--   SCHED[15]:  acquired lock (VG_(client_syscall)[async])
--3098--   SCHED[15]: TRC: SYSCALL
--3098--   SCHED[15]: releasing lock (VG_(client_syscall)[async]) -> VgTs_WaitSys
--3098--   SCHED[2]:  acquired lock (VG_(client_syscall)[asySYSCALL[3098,2](240) ... [async] --> Success(0x0)--3098:4:  gdbsrv       VG_(gdbserver_report_syscall) before 0 sysno 240 tid 2

SYSCALL[3098,2](  4) sys_write ( 1, 0x4ea9000, 48 )--3098:4:  gdbsrv       VG_(gdbserver_report_syscall) before 1 sysno 4 tid 2
 --> [async] ... 
tls_ptr: case "race" has mismatch: *ip=4 here=2
SYSCALL[3098,2](  4) ... [async] --> Success(0x30)--3098:4:  gdbsrv       VG_(gdbserver_report_syscall) before 0 sysno 4 tid 2

SYSCALL[3098,2](240) sys_nanosleep ( 0x200890, 0x0 )--3098:4:  gdbsrv       VG_(gdbserver_report_syscall) before 1 sysno 240 tid 2
 --> [async] ... 
Killed

It's hanging in a call to nanosleep. That's also what I saw in gdb.

tls_ptr: case "race" has mismatch: *ip=16 here=8
^C
Thread 1 received signal SIGINT, Interrupt.
Sent by kernel.
vgModuleLocal_do_syscall_for_client_WRK () at m_syswrap/syscall-amd64-freebsd.S:135
135             setc    0(%rsp)         /* stash returned carry flag */
(gdb) 

That made me worry a bit about the change I made for resuming syscalls when the call to save the carry flag is being made. Is that really thread-safe, even for Valgrind that is only executing one thread at a time.

I can then single step to

#1  0x00000000380bff4c in vgPlain_do_syscall (sysno=28503, sysno@entry=3, a1=17220295950, a2=1, a2@entry=17220295950, a3=0, a4=a4@entry=0,
    a5=0, a6=0, a7=0, a8=0) at m_syscall.c:1153
#2  0x00000000380becee in vgPlain_read (fd=<optimized out>, buf=0x0, buf@entry=0x40268dd0e, count=count@entry=1) at m_libcfile.c:315
#3  0x0000000038141454 in vgModuleLocal_sema_down (sema=0x4023705d0, as_LL=0 '\000') at m_scheduler/sema.c:107
#4  0x000000003813d50d in vgPlain_acquire_BigLock_LL (who=<optimized out>) at m_scheduler/scheduler.c:425
#5  vgPlain_acquire_BigLock (tid=tid@entry=1, who=0x380064ef "VG_(client_syscall)[async]") at m_scheduler/scheduler.c:349
#6  0x00000000381448a3 in vgPlain_client_syscall (tid=tid@entry=1, trc=<optimized out>, trc@entry=73) at m_syswrap/syswrap-main.c:2353
#7  0x0000000038140340 in handle_syscall (tid=tid@entry=1, trc=trc@entry=73) at m_scheduler/scheduler.c:1211
#8  0x000000003813e463 in vgPlain_scheduler (tid=tid@entry=1) at m_scheduler/scheduler.c:1529
#9  0x0000000038150874 in thread_wrapper (tidW=<optimized out>) at m_syswrap/syswrap-freebsd.c:109
#10 run_a_thread_NORETURN (tidW=1) at m_syswrap/syswrap-freebsd.c:163
#11 0x0000000000000000 in ?? ()

So this is trying to get the semaphore the_BigLock, hanging in the read.

How is it possible to be using 100% CPU in a read like that?

I'm beginning to think that this is some sort of interaction between the guest nanosleep and Valgrind's scheduler lock.

Is it possible that when the scheduler is getting the lock the nanosleep interrups it and then we get stuck in some sort of syscall restart pingpong? Or somehow the semaphore gets into a

If I reduce the nanosleep then the TC seems to run reliably. I'll try twiddling with the nanosleep of 13.0 to see if I can cause the problem.

This also hangs on 14.0 but not 12.2 or 12.3

paulfloyd commented 2 years ago

I ran the TC under ktrace, like this

ktrace -di -t cinpstw ../../vg-in-place --tool=none ./tls

For the hang that I observed it go a bit over half way. 14 threads get created, and they each loop 10 times with a nanosleep each, for a total of 140 nanosleeps. The other source of syscalls from the guest are the 19 printf from the first two threads that inncrementing a plain global variable.

A bit before the end there is

  9340 none-amd64-freebsd GIO   fd 28504 wrote 1 byte
       "X"
  9340 none-amd64-freebsd RET   write 1
  9340 none-amd64-freebsd CALL  sigprocmask(SIG_SETMASK,0x40396fdf0,0x40396fdb0)
  9340 none-amd64-freebsd RET   sigprocmask 0
  9340 none-amd64-freebsd CALL  nanosleep(0x200890,0)

Then there is a chunk of returning nanosleeps and attemps to read the BigLock pipe

  9340 none-amd64-freebsd CSW   resume kernel "nanslp"
  9340 none-amd64-freebsd RET   nanosleep 0
  9340 none-amd64-freebsd CALL  sigprocmask(SIG_SETMASK,0x40476bdb0,0)
  9340 none-amd64-freebsd RET   sigprocmask 0
  9340 none-amd64-freebsd CALL  thr_self(0x40476bca8)
  9340 none-amd64-freebsd RET   thr_self 0
  9340 none-amd64-freebsd CALL  read(0x6f57,0x40476bcfe,0x1)
  9340 none-amd64-freebsd CSW   stop kernel "pipelk"
  9340 none-amd64-freebsd CSW   resume kernel "nanslp"
  9340 none-amd64-freebsd RET   nanosleep 0
  9340 none-amd64-freebsd CALL  sigprocmask(SIG_SETMASK,0x40486fdb0,0)
  9340 none-amd64-freebsd RET   sigprocmask 0
  9340 none-amd64-freebsd CALL  thr_self(0x40486fca8)
  9340 none-amd64-freebsd RET   thr_self 0
  9340 none-amd64-freebsd CALL  read(0x6f57,0x40486fcfe,0x1)
  9340 none-amd64-freebsd CSW   stop kernel "pipelk"
  9340 none-amd64-freebsd CSW   resume kernel "nanslp"
  9340 none-amd64-freebsd RET   nanosleep 0

When everything goes pear-shaped there is

  9340 none-amd64-freebsd GIO   fd 28503 read 1 byte
       "X"
  9340 none-amd64-freebsd RET   read 1
  9340 none-amd64-freebsd CSW   stop user "ast"
  9340 none-amd64-freebsd CSW   resume kernel "pipelk"
  9340 none-amd64-freebsd CSW   stop kernel "piperd"
  9340 none-amd64-freebsd CSW   resume kernel "pipelk"
  9340 none-amd64-freebsd CSW   stop kernel "piperd"

So someone has read from the pipe and has the lockj.

After that there is an "ast" asynchronous system trap, an involuntary context switch.. After that the Valgrind scheduler seems to have more or less stopped - I see one attempt to read the pipe. Other than that just a lot of kernel context switches between pipe lock and pipe read.

On FreeBSD 13.1 I also see some "ast"s but they always get paired with a resume.

From this perspective it looks like Valgrind + this testcase is triggering a kernel issue where the "ast" is getting stuck.

paulfloyd commented 2 years ago

Running FreeBSD 13.1 in a 2CPU VM seems to fix the problem. In my opinion, a kernel bug.

See this thread

https://docs.freebsd.org/cgi/getmsg.cgi?fetch=4438867+0+current/freebsd-hackers

paulfloyd commented 2 years ago

I just logged a bug

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=264441