Closed BishrutSubedi closed 3 weeks ago
CI failure caused by issue https://github.com/eclipse-uprotocol/up-conan-recipes/issues/9
@BishrutSubedi - It looks like the change in 38cb0ba worked. The tests that failed are sensitive to memory allocation throughput, which is reduced when valgrind is running. I think it might be safe to disable those tests when running valgrind. What do you think?
@BishrutSubedi - It looks like the change in 38cb0ba worked. The tests that failed are sensitive to memory allocation throughput, which is reduced when valgrind is running. I think it might be safe to disable those tests when running valgrind. What do you think?
Sounds good to me
I've been poking at this a little more and I'm not sure how to interpret results out of it. For example, I have a run where I added an intentional memory leak. The valgrind output from the log looks like this:
==2705== Memcheck, a memory error detector
==2705== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==2705== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==2705== Command: ctest -j -E UuidBuilderTest.WithIndependentState|UuidBuilderTest.CounterIncrementWithinTimestampTick|TestUuidBuilder.CounterIncrements
==2705== Parent PID: 2699
==2705==
==2705==
==2705== HEAP SUMMARY:
==2705== in use at exit: 72,704 bytes in 1 blocks
==2705== total heap usage: 182,638 allocs, 182,637 frees, 90,297,980 bytes allocated
==2705==
==2705== 72,704 bytes in 1 blocks are still reachable in loss record 1 of 1
==2705== at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==2705== by 0x431875: ??? (in /usr/local/bin/ctest)
==2705== by 0x11C93CC: ??? (in /usr/local/bin/ctest)
==2705== by 0x4988E1B: __libc_start_main@@GLIBC_2.34 (libc-start.c:375)
==2705== by 0x431BA2: ??? (in /usr/local/bin/ctest)
==2705== by 0x1FFEFFF6C7: ???
==2705== by 0x1B: ???
==2705== by 0x3: ???
==2705== by 0x1FFEFFFC4E: ???
==2705== by 0x1FFEFFFC54: ???
==2705== by 0x1FFEFFFC57: ???
==2705== by 0x1FFEFFFC5A: ???
==2705==
{
<insert_a_suppression_name_here>
Memcheck:Leak
match-leak-kinds: reachable
fun:malloc
obj:/usr/local/bin/ctest
obj:/usr/local/bin/ctest
fun:__libc_start_main@@GLIBC_2.34
obj:/usr/local/bin/ctest
obj:*
obj:*
obj:*
obj:*
obj:*
obj:*
obj:*
}
==2705== LEAK SUMMARY:
==2705== definitely lost: 0 bytes in 0 blocks
==2705== indirectly lost: 0 bytes in 0 blocks
==2705== possibly lost: 0 bytes in 0 blocks
==2705== still reachable: 72,704 bytes in 1 blocks
==2705== suppressed: 0 bytes in 0 blocks
==2705==
==2705== For lists of detected and suppressed errors, rerun with: -s
==2705== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
When I run one of the test binaries under valgrind directly, I get this instead:
==9291== Memcheck, a memory error detector
==9291== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==9291== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
==9291== Command: ./CallbackConnectionTest
==9291==
==9291==
==9291== HEAP SUMMARY:
==9291== in use at exit: 108 bytes in 27 blocks
==9291== total heap usage: 498 allocs, 471 frees, 150,404 bytes allocated
==9291==
==9291== 4 bytes in 1 blocks are definitely lost in loss record 1 of 27
==9291== at 0x4846FA3: operator new(unsigned long) (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==9291== by 0x126FF8: uprotocol::utils::callbacks::CalleeHandle<void>::CalleeHandle(std::shared_ptr<uprotocol::utils::callbacks::Connection<void> >, std::shared_ptr<std::function<void ()> >, std::optional<std::function<void (uprotocol::utils::callbacks::CallerHandle<void>)> >&&, uprotocol::utils::callbacks::Connection<void>::PrivateConstructToken) (CallbackConnection.h:262)
==9291== by 0x12DCCE: uprotocol::utils::callbacks::Connection<void>::establish(std::function<void ()>&&, std::optional<std::function<void (uprotocol::utils::callbacks::CallerHandle<void>)> >&&) (CallbackConnection.h:109)
==9291== by 0x11AF10: (anonymous namespace)::CallbackTest_EstablishDoesNotThrow_Test::TestBody() (CallbackConnectionTest.cpp:46)
==9291== by 0x178674: void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (gtest.cc:2612)
==9291== by 0x16FDFC: void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (gtest.cc:2648)
==9291== by 0x146E8B: testing::Test::Run() (gtest.cc:2687)
==9291== by 0x147A5C: testing::TestInfo::Run() (gtest.cc:2836)
==9291== by 0x1484D5: testing::TestSuite::Run() (gtest.cc:3015)
==9291== by 0x159659: testing::internal::UnitTestImpl::RunAllTests() (gtest.cc:5920)
==9291== by 0x179993: bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) (gtest.cc:2612)
==9291== by 0x171358: bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) (gtest.cc:2648)
==9291==
...
==9291==
==9291== LEAK SUMMARY:
==9291== definitely lost: 108 bytes in 27 blocks
==9291== indirectly lost: 0 bytes in 0 blocks
==9291== possibly lost: 0 bytes in 0 blocks
==9291== still reachable: 0 bytes in 0 blocks
==9291== suppressed: 0 bytes in 0 blocks
==9291==
==9291== For lists of detected and suppressed errors, rerun with: -s
==9291== ERROR SUMMARY: 27 errors from 27 contexts (suppressed: 0 from 0)
In both cases, I modified the process to use the Debug build type just to see if adding debug symbols helped.
I've been poking at this a little more
With the current setup, the valgrind logs will have the names of test that failed. For more detailed breakdown of lines that failed, I run the failed test locally directly under valgrind.
If build with build type debug, we get the exact lines that caused failure. In release, we get only the top level function call of where it failed, but the subsequent lines are hidden.
We can think this as summary of what failed, and investigate locally to see the result.
One way to get this detail result in CI would be to collect tests, under test dir, and run it directly with valgrind and concatenate the results
This looks good for me. Since there are some valgrind findings already, I have added #196 as a follow-up to log those issues and #197 to block PRs on valgrind failures once all findings are addressed.
This PR adds dynamic analysis check i.e. valgrind to run in all unit test in CI
It also adds a text file to exclude any test to run under valgrind and it's tools
To skip test from running, add it in format
<test binary name>.<testsuite name>.<test name>