will explicit-address shmat() be supported?

GoogleCodeExporter commented 9 years ago

After a successful try at running a large application with ASAN I made an 
attempt at TSAN.  TSAN throws a NULL dereference SEGV on the first access to a 
shared memory segment mapped with an explicit address and I have the distinct 
impression this case is either not supported or has not yet been considered.  
Segments are successfully mapped in either the 0x6000_0000_0000 or 
0x7000_0000_0000 ranges which are not (at least initially) occupied by TSAN.  
With ASAN segment base addresses were biased and successfully mapped in the 
0x5000_0000_0000 range.

Can provide details and a concise test-case if a desire to pursue the issue 
exists.  If not please document the lack of support--have found no mention of 
it.  Have long preferred explicit mapping addresses though the conventional 
wisdom is to let the OS decide where to place them.

This app works with Valgrind and Helgrind, though here Valgrind core is 
compiled such that it occupies 0x7_0000_0000 and the application's normal 
preference for shmat() mappings is left unadjusted, occupying the range 
0x3_0000_0000 to 0x6_FFFF_FFFF.

GCC 4.9.1

Original issue reported on code.google.com by starligh...@gmail.com on 28 Oct 2014 at 1:48

GoogleCodeExporter commented 9 years ago

You should be able to mmap data at 0x7e80-0x7f00 under tsan. Does it work for 
you?

Original comment by dvyu...@google.com on 28 Oct 2014 at 2:22
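
For illustration, a minimal sketch of attaching a System V segment at an explicit address inside the range suggested above. It assumes that "0x7e80" is shorthand for 0x7e80_0000_0000 on x86-64 Linux. The helper name `attach_at` and the macro `EXPLICIT_BASE` are hypothetical and not taken from the reporter's application.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Assumption: "0x7e80" in the advice means 0x7e80_0000_0000 (x86-64 Linux). */
#define EXPLICIT_BASE ((void *)0x7e8000000000UL)

/* Attach segment `shmid` at exactly `addr`; returns NULL on failure.
 * Without SHM_RND the address must be a multiple of SHMLBA. */
static void *attach_at(int shmid, void *addr) {
    void *p = shmat(shmid, addr, 0);
    if (p == (void *)-1) {
        perror("shmat");
        return NULL;
    }
    assert(p == addr); /* an explicit attach must land where requested */
    return p;
}
```

A caller would obtain the segment id from `shmget()` and then call `attach_at(id, EXPLICIT_BASE)`, picking further addresses inside the same range for additional segments.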

GoogleCodeExporter commented 9 years ago

Indeed, it does.  Thanks!

Original comment by starligh...@gmail.com on 28 Oct 2014 at 2:50

GoogleCodeExporter commented 9 years ago

Came upon a glitch now:  Seems a display app
that attaches one of the segments without specifying
the mapping address traps about half of the time
it is invoked:

[pid  1629] shmget(0x7048082, 0, SHM_HUGETLB|0) = 9240582
[pid  1629] shmat(9240582, 0, SHM_RDONLY) = ?
[pid  1629] shmctl(9240582, IPC_STAT, 0x7fff640fc510) = 0
[pid  1629] --- SIGSEGV (Segmentation fault) @ 0 (0) ---
[pid  1630] +++ killed by SIGSEGV +++

Other half of the time it works.  Should I open
a different issue for this or continue on this
one?  Not a big deal to me but I can try to
help figure it out.

Original comment by starligh...@gmail.com on 28 Oct 2014 at 4:44

GoogleCodeExporter commented 9 years ago

kernel is CentOS 6.5 2.6.32-431.11.2.el6.x86_64

Original comment by starligh...@gmail.com on 28 Oct 2014 at 4:45

GoogleCodeExporter commented 9 years ago

What does gdb say? Is it NULL deref? Where?

p.s. I will be away for next two weeks.

Original comment by dvyu...@google.com on 28 Oct 2014 at 4:48

GoogleCodeExporter commented 9 years ago

when run with 'gdb' it always works--bit of a pain

can't seem to get a core--tried TSAN_OPTIONS=abort=1

this is not a big-deal issue (works fine with non-TSAN version
of the display utility, which is single-thread anyway)
so enjoy your time off

Original comment by starligh...@gmail.com on 28 Oct 2014 at 4:55

GoogleCodeExporter commented 9 years ago

dspl[1573]: segfault at 800040 ip 00007fafa31b2513 sp 00007fff05270f50 error 4 in libtsan.so.0.0.0[7fafa3185000+a8000]
dspl[1579]: segfault at 800040 ip 00007f95e83a9513 sp 00007fff37ba6520 error 4 in libtsan.so.0.0.0[7f95e837c000+a8000]
dspl[1590]: segfault at 800040 ip 00007f0081af0513 sp 00007fffae3f42b0 error 4 in libtsan.so.0.0.0[7f0081ac3000+a8000]

do get the above in 'dmesg', perhaps I should turn off ASLR to find the line 
number?

Original comment by starligh...@gmail.com on 28 Oct 2014 at 4:59
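
A possible shortcut that avoids touching ASLR: subtracting the bracketed mapping base from the `ip` value gives an offset into libtsan that does not depend on where the library was loaded. The sketch below does that arithmetic for one dmesg record, assuming the record is on a single line and that the bracketed base is the library's load address. The function name `segv_module_offset` is made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Parse a kernel "segfault at ..." record and compute ip - mapping base.
 * Returns 0 and stores the offset on success, -1 if the line does not parse
 * or the ip lies outside the reported mapping. */
static int segv_module_offset(const char *line, unsigned long long *offset) {
    unsigned long long ip, base, size;
    assert(line != NULL && offset != NULL);

    const char *ip_field = strstr(line, " ip ");
    const char *map_field = strrchr(line, '['); /* last '[' opens base+size */
    if (ip_field == NULL || map_field == NULL)
        return -1;
    if (sscanf(ip_field, " ip %llx", &ip) != 1)
        return -1;
    if (sscanf(map_field, "[%llx+%llx]", &base, &size) != 2)
        return -1;
    if (ip < base || ip - base >= size)
        return -1;

    *offset = ip - base;
    return 0;
}
```

All three records above reduce to the same offset, 0x2d513, which is consistent with a single faulting instruction. Under the same assumption, that offset could be handed to a tool such as `addr2line -e` on the libtsan shared object.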

GoogleCodeExporter commented 9 years ago

after a bunch of whacking I think this is correct:

(gdb) disassemble 0x7FFFF6F33DF3,+10
Dump of assembler code from 0x7ffff6f33df3 to 0x7ffff6f33dfd:

   0x00007ffff6f33df3 <__interceptor_longjmp(__sanitizer::uptr*, int)+35>:     data32 callq 0x7ffff6f06500 <__tls_get_addr@plt>

Original comment by starligh...@gmail.com on 28 Oct 2014 at 5:24

GoogleCodeExporter commented 9 years ago

I set '/proc/sys/kernel/randomize_va_space' to 0 and it works every time

Original comment by starligh...@gmail.com on 28 Oct 2014 at 5:31
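
If changing the system-wide setting is undesirable, randomization can usually be switched off for one process tree instead. A rough sketch using the Linux `personality()` call follows; the helper name `disable_aslr_for_exec` is hypothetical, and some sandboxes (seccomp filters, for example) may refuse the call.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/personality.h>

/* Set ADDR_NO_RANDOMIZE on the calling process so that a program exec'd
 * afterwards gets a non-randomized layout.
 * Returns 0 on success, -1 on failure (errno set by personality()). */
static int disable_aslr_for_exec(void) {
    /* 0xffffffff queries the current persona without changing it. */
    int persona = personality(0xffffffff);
    if (persona == -1)
        return -1;
    if (personality((unsigned long)persona | ADDR_NO_RANDOMIZE) == -1)
        return -1;
    /* Post-condition: the flag is now part of the persona. */
    assert(personality(0xffffffff) & ADDR_NO_RANDOMIZE);
    return 0;
}
```

A small launcher would call this and then `execvp()` the display utility. `setarch -R` from util-linux offers the same effect from the command line.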

GoogleCodeExporter commented 9 years ago

Regarding shmat w/o explicit address: if it's important to you, I will ask you 
to provide a minimal reproducer for the crash. And if it's not important, I think 
we don't need to waste any time on it.
You indicated that it's not very important to you, right? Can we close this 
issue now?

Original comment by dvyu...@google.com on 17 Nov 2014 at 2:48
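
As a starting point for such a reproducer, here is a skeleton of the call sequence seen in the strace output: attach read-only at a kernel-chosen address and touch the segment. It is only a sketch and not the reporter's test case. It creates its own private segment instead of looking up key 0x7048082, leaves out SHM_HUGETLB, and the function name `attach_anywhere_readonly` is invented here. To exercise TSAN it would need to be built with `-fsanitize=thread` and run repeatedly with ASLR enabled.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Create a private segment, attach it read-only wherever the kernel
 * chooses, read the first byte, then detach and remove the segment.
 * Returns 0 on success, -1 on any failure. */
static int attach_anywhere_readonly(size_t size) {
    assert(size > 0);
    int id = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    if (id == -1) {
        perror("shmget");
        return -1;
    }

    int rc = -1;
    const char *p = shmat(id, NULL, SHM_RDONLY); /* NULL: no explicit address */
    if (p == (void *)-1) {
        perror("shmat");
    } else {
        rc = (p[0] == 0) ? 0 : -1; /* new segments are zero-filled */
        if (shmdt(p) == -1) {
            perror("shmdt");
            rc = -1;
        }
    }
    if (shmctl(id, IPC_RMID, NULL) == -1) {
        perror("shmctl");
        rc = -1;
    }
    return rc;
}
```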

GoogleCodeExporter commented 9 years ago

Before we close the issue, please note
that the explicit address problem was
resolved by following your initial
advice to map in the 0x7e80-0x7f00
range.

What's left is perhaps a second issue,
which is random failure of a non-explicit
(i.e. system assigned) mapping of a segment
by a small single-thread display utility.
I can try to create a test case since it's
not complicated.  Was trying to see how
interested in that issue you are and
whether a separate issue should be created
for it.

Original comment by starligh...@gmail.com on 19 Nov 2014 at 2:34

GoogleCodeExporter commented 9 years ago

If you create a standalone repro for the shmat issue, I will definitely look 
into it.
Please file a separate issue for it. Closing this one.
Thanks

Original comment by dvyu...@google.com on 19 Nov 2014 at 7:57

Changed state: Fixed

theRockLiu / thread-sanitizer, issue #82