paulfloyd / freebsd_valgrind

Git repo used to Upstream the FreeBSD Port of Valgrind
GNU General Public License v2.0
15 stars 4 forks source link

memcheck/tests/lks is failing [clang] #89

Closed paulfloyd closed 2 years ago

paulfloyd commented 4 years ago

The behaviour does not depend on the compiler used to build Valgrind. Only the compiler used for the testcase matters.

This is an issue with client requests. The testcase generates some leaks then uses client request to display a summary. With a GCC compiled testcase this produces

   definitely lost: 48 bytes in 3 blocks
   indirectly lost: 32 bytes in 2 blocks
     possibly lost: 0 bytes in 0 blocks
   still reachable: 64 bytes in 4 blocks
        suppressed: 96 bytes in 6 blocks
Rerun with --leak-check=full to see details of leaked memory

leaked:      80 bytes in  5 blocks
dubious:      0 bytes in  0 blocks
reachable:   64 bytes in  4 blocks
suppressed:  96 bytes in  6 blocks

(and this is consistent with the normal summary).

With a clang built testcase the output is

   definitely lost: 32 bytes in 2 blocks
   indirectly lost: 16 bytes in 1 blocks
     possibly lost: 0 bytes in 0 blocks
   still reachable: 80 bytes in 5 blocks
        suppressed: 112 bytes in 7 blocks
Rerun with --leak-check=full to see details of leaked memory

leaked:      48 bytes in  3 blocks
dubious:      0 bytes in  0 blocks
reachable:   80 bytes in  5 blocks
suppressed: 112 bytes in  7 blocks

The total is the same but there is 1 block less in both of definitely/indirectly lost and 1 block more in both of still reachable/suppressed.

leak-cases-full leak-cases-summary leak-cycle look like they could be the same issue. With a full report GCC has two extra items

16 bytes in 1 blocks are indirectly lost in loss record ... of ... at 0x........: malloc (vg_replace_malloc.c:...) by 0x........: mk (leak-cases.c:52) by 0x........: f (leak-cases.c:91) by 0x........: main (leak-cases.c:107)

and

32 (16 direct, 16 indirect) bytes in 1 blocks are definitely lost in loss record ... of ... at 0x........: malloc (vg_replace_malloc.c:...) by 0x........: mk (leak-cases.c:52) by 0x........: f (leak-cases.c:91) by 0x........: main (leak-cases.c:107

Both on line 91 of leak-cases.c

The extra 'suppressed' is probably a 'possible'

nbriggs commented 4 years ago

I've seen the same pattern, with clang, in some other failing tests (the names of which I don't have handy)

paulfloyd commented 4 years ago

The passes on Linux with clang 11

nbriggs commented 4 years ago

I just tried, on FreeBSD 12.1-RELEASE-p5, x86, building the test case (leak-cases) with clang-devel (= 11.0), while valgrind itself was built with clang 9.0. It failed in almost the same way:

$ perl tests/vg_regtest memcheck/tests/lks
lks:             valgrind   --leak-check=full --show-leak-kinds=definite,possible,indirect --errors-for-leak-kinds=definite,possible --suppressions=lks.supp ./leak-cases 
*** lks failed (stderr) ***

== 1 test, 1 stderr failure, 0 stdout failures, 0 stderrB failures, 0 stdoutB failures, 0 post failures ==
memcheck/tests/lks                       (stderr)

$ more memcheck/tests/lks.stderr.diff
--- lks.stderr.exp      2020-04-11 12:53:49.388095000 -0700
+++ lks.stderr.out      2020-05-26 13:32:36.075509000 -0700
@@ -3,16 +3,16 @@

 LEAK SUMMARY:
    definitely lost: 48 bytes in 3 blocks
-   indirectly lost: 32 bytes in 2 blocks
+   indirectly lost: 16 bytes in 1 blocks
      possibly lost: 0 bytes in 0 blocks
    still reachable: 64 bytes in 4 blocks
-        suppressed: 96 bytes in 6 blocks
+        suppressed: 112 bytes in 7 blocks
 Rerun with --leak-check=full to see details of leaked memory

-leaked:      80 bytes in  5 blocks
+leaked:      64 bytes in  4 blocks
 dubious:      0 bytes in  0 blocks
 reachable:   64 bytes in  4 blocks
-suppressed:  96 bytes in  6 blocks
+suppressed: 112 bytes in  7 blocks

 HEAP SUMMARY:
     in use at exit: 240 bytes in 15 blocks
$ 

I'll see if I can build a working valgrind with clang 11.0 and how that compares. I had no success building it with clang 10.0.

paulfloyd commented 4 years ago

I changed leak.h to use a normal leak check rather than a quick one.

Line by line the comparison between GCC and clang is

** Still Reachable **
- Line 70, one leak of 16 bytes, same
- Line 72, two leaks of 15 bytes, same
- Line 81, one leak of 16 bytes, same
- Line 91, one leak of 16 bytes, clang only, comments say this should be 16 bytes direct and 16 bytes indirect

** Indirectly Lost **
- Line 76, one leak of 16 bytes, same
- Line 91, one leak of 16 bytes, GCC only, the Still Reachable above?

** Definitely Lost **
- Line 74, one leak of 16 bytes, same
- Line 76, two leaks of 16d and 16i bytes , same
- Line 91, two leaks of 16d and 16i bytes , GCC only 2nd part of Indirectly Lost above.

So the difference seems to come down to

   p9 = mk(mk(NULL));   // Case 9: 16/1 indirectly lost (counted again below!)
   (p9->next)++;                // 32(16d,16i)/1 definitely lost (double count!)
   p9 = NULL;

GCC sees the allocations as definite direct and indirect leaks.

clang sees them as still reachable and possible. I tried using calloc instead of malloc to try to ensure that the stray pointers aren't in some recycled memory or something like that, but no change.

paulfloyd commented 4 years ago

I can cut the testcase down to

void f(void)
{
   p9 = mk(mk(NULL));   // Case 9: 16/1 indirectly lost (counted again below!)
   //fprintf(stderr, "DEBUG: initial p9->next %p\n", p9->next);
   (p9->next)++;                // 32(16d,16i)/1 definitely lost (double count!)
   //fprintf(stderr, "DEBUG: final p9 %p p9->next %p\n", p9, p9->next);
   p9 = NULL;
}

Unfortunately, uncommenting the fprintfs also causes the output to switch to being the same as GCC.

The clang assembler is

00000000002012e0 <mk>:
  2012e0:   55                      push   %rbp
  2012e1:   48 89 e5                mov    %rsp,%rbp
  2012e4:   48 83 ec 10             sub    $0x10,%rsp
  2012e8:   48 89 7d f8             mov    %rdi,-0x8(%rbp)
  2012ec:   bf 10 00 00 00          mov    $0x10,%edi
  2012f1:   e8 2a 07 00 00          callq  201a20 <malloc@plt>
  2012f6:   48 89 45 f0             mov    %rax,-0x10(%rbp)
  2012fa:   48 8b 45 f8             mov    -0x8(%rbp),%rax
  2012fe:   48 8b 7d f0             mov    -0x10(%rbp),%rdi
  201302:   48 89 07                mov    %rax,(%rdi)
  201305:   48 8b 45 f0             mov    -0x10(%rbp),%rax
  201309:   48 83 c4 10             add    $0x10,%rsp
  20130d:   5d                      pop    %rbp
  20130e:   c3                      retq   
  20130f:   90                      nop    

0000000000201310 <f>:
  201310:   55                      push   %rbp
  201311:   48 89 e5                mov    %rsp,%rbp
  201314:   31 c0                   xor    %eax,%eax
  201316:   89 c7                   mov    %eax,%edi
  201318:   e8 c3 ff ff ff          callq  2012e0 <mk>
  20131d:   48 89 c7                mov    %rax,%rdi
  201320:   e8 bb ff ff ff          callq  2012e0 <mk>
  201325:   48 89 04 25 28 40 20    mov    %rax,0x204028
  20132c:   00 
  20132d:   48 8b 04 25 28 40 20    mov    0x204028,%rax
  201334:   00 
  201335:   48 8b 38                mov    (%rax),%rdi
  201338:   48 83 c7 08             add    $0x8,%rdi
  20133c:   48 89 38                mov    %rdi,(%rax)
  20133f:   48 c7 04 25 28 40 20    movq   $0x0,0x204028
  201346:   00 00 00 00 00 
  20134b:   5d                      pop    %rbp
  20134c:   c3                      retq   
  20134d:   0f 1f 00                nopl   (%rax)

and GCC

0000000000400a75 <mk>:
  400a75:   55                      push   %rbp
  400a76:   48 89 e5                mov    %rsp,%rbp
  400a79:   48 83 ec 20             sub    $0x20,%rsp
  400a7d:   48 89 7d e8             mov    %rdi,-0x18(%rbp)
  400a81:   bf 10 00 00 00          mov    $0x10,%edi
  400a86:   e8 45 fa ff ff          callq  4004d0 <malloc@plt>
  400a8b:   48 89 45 f8             mov    %rax,-0x8(%rbp)
  400a8f:   48 8b 45 f8             mov    -0x8(%rbp),%rax
  400a93:   48 8b 55 e8             mov    -0x18(%rbp),%rdx
  400a97:   48 89 10                mov    %rdx,(%rax)
  400a9a:   48 8b 45 f8             mov    -0x8(%rbp),%rax
  400a9e:   c9                      leaveq 
  400a9f:   c3                      retq   

0000000000400aa0 <f>:
  400aa0:   55                      push   %rbp
  400aa1:   48 89 e5                mov    %rsp,%rbp
  400aa4:   bf 00 00 00 00          mov    $0x0,%edi
  400aa9:   e8 c7 ff ff ff          callq  400a75 <mk>
  400aae:   48 89 c7                mov    %rax,%rdi
  400ab1:   e8 bf ff ff ff          callq  400a75 <mk>
  400ab6:   48 89 05 bb 0a 20 00    mov    %rax,0x200abb(%rip)        # 601578 <p9>
  400abd:   48 8b 05 b4 0a 20 00    mov    0x200ab4(%rip),%rax        # 601578 <p9>
  400ac4:   48 8b 10                mov    (%rax),%rdx
  400ac7:   48 83 c2 08             add    $0x8,%rdx
  400acb:   48 89 10                mov    %rdx,(%rax)
  400ace:   48 c7 05 9f 0a 20 00    movq   $0x0,0x200a9f(%rip)        # 601578 <p9>
  400ad5:   00 00 00 00 
  400ad9:   90                      nop    
  400ada:   5d                      pop    %rbp
  400adb:   c3     
paulfloyd commented 2 years ago

The leak tests pass with clang on FreeBSD 13 amd64. I need to spend some time working though the assembler and also the memcheck leak detection code.

paulfloyd commented 2 years ago

commit 98774bffd21bdeb0e4f2903b8df9083fd93c243d (HEAD -> master, origin/master, origin/HEAD) Author: Paul Floyd pjfloyd@wanadoo.fr Date: Thu May 19 22:40:03 2022 +0200

Clobber ecx for clang x86 leak tests

The assembler for leak-cases.c on x86 with clang for f() ends with

  40198b:       c7 04 24 00 00 00 00    movl   $0x0,(%esp)
  401992:       e8 c9 fe ff ff          call   401860 <mk>
  401997:       89 04 24                mov    %eax,(%esp)
  40199a:       e8 c1 fe ff ff          call   401860 <mk>
  40199f:       a3 74 40 40 00          mov    %eax,0x404074
  4019a4:       a1 74 40 40 00          mov    0x404074,%eax
  4019a9:       8b 08                   mov    (%eax),%ecx
  4019ab:       83 c1 08                add    $0x8,%ecx
  4019ae:       89 08                   mov    %ecx,(%eax)
  4019b0:       c7 05 74 40 40 00 00    movl   $0x0,0x404074
  4019b7:       00 00 00
  4019ba:       83 c4 04                add    $0x4,%esp
  4019bd:       5d                      pop    %ebp
  4019be:       c3                      ret

If I've read that correctly, at the enc ECX contains the pointer
to allocated memory returned by mk() plus 8.

main() doesn't clobber ECX either, so this shows up in the
leak checks.

Clobbering ECX fixes the following testcases on FreeBSD 13.1 x86 with clang 13

< gdbserver_tests/mcblocklistsearch        (stderrB)
< memcheck/tests/leak-cases-full           (stderr)
< memcheck/tests/leak-cases-summary        (stderr)
< memcheck/tests/leak-cycle                (stderr)
< memcheck/tests/leak-tree                 (stderr)
< memcheck/tests/lks                       (stderr)

Now only

== 693 tests, 15 stderr failures, 1 stdout failure, 0 stderrB failures, 0 stdoutB failures, 0 post failures == memcheck/tests/dw4 (stderr) memcheck/tests/leak_cpp_interior (stderr) memcheck/tests/manuel1 (stderr) memcheck/tests/origin5-bz2 (stderr) memcheck/tests/varinfo2 (stderr) memcheck/tests/varinfo5 (stderr) memcheck/tests/varinfo6 (stderr) helgrind/tests/tls_threads (stderr) drd/tests/annotate_smart_pointer (stderr) drd/tests/annotate_trace_memory (stderr) drd/tests/annotate_trace_memory_xml (stderr) drd/tests/atomic_var (stderr) drd/tests/pth_mutex_signal (stderr) none/tests/fdleak_cmsg (stderr) none/tests/manythreads (stdout) none/tests/manythreads (stderr)