cwgreene / thread-sanitizer

Automatically exported from code.google.com/p/thread-sanitizer
0 stars 0 forks source link

building firefox with TSan support leads to an unusable browser #92

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
[Apologies for the bugreport here if this is the wrong place; nothing on the 
LLVM bugzilla looked appropriate.]

What steps will reproduce the problem?
1. Build according to the instructions at 
https://developer.mozilla.org/en-US/docs/Thread_Sanitizer#Building_Firefox
2. Run simple tests from the srcdir with "mach mochitest-plain --quiet 
layout/style/test/"

What is the expected output? What do you see instead?

The tests should run successfully, albeit reporting a number of races.  They do 
run successfully with a non-TSan build.

With a TSan build, numerous JavaScript errors are reported on the browser's 
stdout.  These JavaScript errors prevent the tests from running.  I suspect 
some sort of miscompilation or similar, as the version of TSan distributed with 
clang 3.3 works.  I am unable to test with 3.4, 3.5, or 3.6, as they either 
error out while compiling Firefox or give similar results to the above.

What version of the product are you using? On what operating system?

I have observed this on x86-64 Linux, with a version of the sources ~svn 
r230634.

Original issue reported on code.google.com by nfr...@mozilla.com on 8 Apr 2015 at 8:38

GoogleCodeExporter commented 9 years ago
> The tests should run successfully, albeit reporting a number of races.

Are the reported races true races? Or you believe they are false positives.
If they are true races, then file an issue in firefox tracker.
If they false positives, then please explain why.

Original comment by dvyu...@google.com on 9 Apr 2015 at 7:19

GoogleCodeExporter commented 9 years ago
Sorry, that was poorly worded.  The reported races would be true races, and we 
are tracking them:

https://bugzilla.mozilla.org/show_bug.cgi?id=929478

Original comment by nfr...@mozilla.com on 9 Apr 2015 at 9:44

GoogleCodeExporter commented 9 years ago
What is then the issue here? Do you suspect some kind of miscompilation with 
tsan?

Does the program work when compiled with the same clang with different levels 
of optimization but without -fsanitize=thread? Such failures can also be caused 
by uses of unitialized memory, out of bounds accesses and other kinds of 
undefined behaviors. Tsan significantly affects generated code and so can 
provoke latent bugs.

Original comment by dvyu...@google.com on 9 Apr 2015 at 9:55

GoogleCodeExporter commented 9 years ago
Yes, I do suspect some kind of miscompilation with TSan.

The program works just fine with the same clang at the same level of 
optimizations or higher without -fsanitize=thread.  I admit to not having 
tested with throwing -fPIC or -fpie into the mix, but I am doubtful those flags 
will change very much.

Original comment by nfr...@mozilla.com on 9 Apr 2015 at 10:03

GoogleCodeExporter commented 9 years ago
How do these JavaScript errors look like? Is it like a variable should be equal 
to value X but under tsan is equal to Y? Try to debug when and why it gets the 
wrong value.

Original comment by dvyu...@google.com on 9 Apr 2015 at 10:13

GoogleCodeExporter commented 9 years ago
It looks like what is happening is that we have JavaScript helper threads that 
request a stack just smaller than the minimum size TSan enforces.  TSan bumps 
that size up, but requisitions most of that stack for its own purposes.  We 
then run code that does a primitive form of stack overflow checking:

http://mxr.mozilla.org/mozilla-central/source/js/src/jsfriendapi.h#1001

on those threads, but the limits of the stack are significantly different than 
what they expect, because TSan is already using so much of the stack.  So the 
maximum amount of recursion allowed is much smaller, and causes problems when 
attempting to parse reasonable JavaScript code.

I understand where TSan twiddles the amount of stack space requested, but it's 
not completely clear to me where the stack is actually adjusted for the new 
thread to reserve the allocated space for TSan's purposes.  So that's a hole in 
the logic above.

We've worked around this in:

https://bugzilla.mozilla.org/show_bug.cgi?id=1153244

by just doubling the amount of stack required for these helper threads when 
we're compiling with TSan, but it'd be good to have some sort of fix on the 
TSan side (maybe adding TSan_required_space to requested_stack_space if the two 
are relatively close, instead of using 
max(TSan_required_space,requested_stack_space)?) as well.

Original comment by nfr...@mozilla.com on 15 Apr 2015 at 3:49

GoogleCodeExporter commented 9 years ago
Glad you debugged the issue and made tsan work.
If you want us to fix it in tsan, please provide the following details: size of 
stack that you request, size of stack that your program uses under tsan 
(minimal requested stack size value that makes the program work).

Original comment by dvyu...@google.com on 20 Apr 2015 at 10:52

GoogleCodeExporter commented 9 years ago
We normally request a 512kB stack.

Why does it matter what size stack makes things actually work?  Don't you 
more-or-less have to assume that the program will use nearly all its requested 
stack size anyway?

Original comment by nfr...@mozilla.com on 27 Apr 2015 at 3:09

GoogleCodeExporter commented 9 years ago
> Why does it matter what size stack makes things actually work?  Don't you 
more-or-less have to assume that the program will use nearly all its requested 
stack size anyway?

It is all a bit tricky.
First, it is difficult to calculate what would be TLS size without tsan (asan, 
msan). Then, we don't know the exact increase of stack frames (it can be 
negligible as well as huge). Then, most people just use default size, and it is 
more than enough, in such case increasing stack size will just waste memory.
The current logic is based on some simple heuristics. There are not much 
projects that set thread size. So I want to know the firefox numbers.

Also, do you use pthread_attr_setstacksize or pthread_attr_setstack? If you use 
pthread_attr_setstack, we then won't be able to adjust size, because you give 
the actual memory for stack.

Original comment by dvyu...@google.com on 29 Apr 2015 at 4:40

GoogleCodeExporter commented 9 years ago
We use pthread_attr_setstacksize prior to creating the thread.

I don't know what the exact minimum size is yet; I can figure out about how 
much it takes to be able to start the browser, but figuring out what might be 
enough to run a sufficient number of tests will take a while.

Original comment by nfr...@mozilla.com on 29 Apr 2015 at 5:07