torognes closed this issue 4 years ago
It is possible that this is a false positive reported by helgrind due to the use of POSIX condition variables. Paragraph 3 under section 7.5 in the helgrind documentation encourages the use of POSIX semaphores instead of condition variables because helgrind is unable to fully control the latter and may report false positives.
The problem may only appear when compiling for coverage testing because the program runs much slower than usual and perhaps not in parallel.
I'll try to rewrite the code to use semaphores instead of condition variables to see if that helps. Others have reported better performance with semaphores as well.
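For reference, a minimal sketch of what signalling with POSIX semaphores instead of condition variables could look like. This is not swarm's actual threading code: `worker_state`, `worker`, `wait_on` and `run_with_semaphores` are hypothetical names, and the "work" is a stand-in multiplication. It assumes a platform with unnamed POSIX semaphores (e.g. Linux).

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>
#include <semaphore.h>

// Hypothetical state shared between a master and one worker thread.
struct worker_state {
  sem_t work_ready;  // posted by the master when input is available
  sem_t work_done;   // posted by the worker when the result is stored
  int input;
  int result;
};

// sem_wait can be interrupted by a signal; retry in that case.
static void wait_on(sem_t * s) {
  while (sem_wait(s) == -1 && errno == EINTR) {
  }
}

static void * worker(void * arg) {
  worker_state * st = static_cast<worker_state *>(arg);
  wait_on(&st->work_ready);     // block until the master hands over work
  st->result = st->input * 2;   // stand-in for the real computation
  sem_post(&st->work_done);     // tell the master the result is ready
  return nullptr;
}

int run_with_semaphores(int input) {
  worker_state st;
  st.input = input;
  st.result = 0;
  int rc = sem_init(&st.work_ready, 0, 0);  // thread-shared, count 0
  assert(rc == 0);
  rc = sem_init(&st.work_done, 0, 0);
  assert(rc == 0);

  pthread_t tid;
  rc = pthread_create(&tid, nullptr, worker, &st);
  assert(rc == 0);
  (void) rc;

  sem_post(&st.work_ready);     // wake the worker
  wait_on(&st.work_done);       // wait for the result
  pthread_join(tid, nullptr);

  sem_destroy(&st.work_ready);
  sem_destroy(&st.work_done);
  return st.result;
}
```

Calling `run_with_semaphores(21)` hands the value to the worker and waits for the doubled result. Because a semaphore keeps a count, a post that happens before the matching wait is not lost, so no mutex and predicate loop is needed as with a condition variable.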
It seems like the problems persist even if not using condition variables. They may be caused directly by the modifications made by the compiler when preparing for coverage testing.
We could compile swarm twice, once normally for ordinary testing and once for coverage testing. The first time we fail if any of the tests fail. The second time we just ignore failed tests.
Alternatively, we could just disable the helgrind tests.
I vote for that option. I think it is safe to assume that the helgrind warnings we get are a side effect of the code coverage instrumentation, as they do not occur when testing our release binary. We can always enable helgrind tests again when we want to validate modifications of the threading implementation.
Ok, let's keep the helgrind tests disabled for now.
Some of the swarm tests that use helgrind, a part of valgrind, fail when swarm is compiled for coverage testing (more specifically with the `-fprofile-arcs` option). The tests that run swarm with nonzero d (e.g. `-d 1` or `-d 2`) and more than one thread (e.g. `-t 2`) fail due to reports of possible data race conditions, like below (with `-d 1 -t 2`):