Genivia / ugrep

NEW ugrep 6.5: a more powerful, ultra fast, user-friendly, compatible grep. Includes a TUI, Google-like Boolean search with AND/OR/NOT, fuzzy search, hexdumps, searches (nested) archives (zip, 7z, tar, pax, cpio), compressed files (gz, Z, bz2, lzma, xz, lz4, zstd, brotli), pdfs, docs, and more
https://ugrep.com
BSD 3-Clause "New" or "Revised" License
2.56k stars 109 forks source link

valgrind: Source and destination overlap in memcpy - Zthread::decompress() (ugrep.cpp:1205) #389

Closed vt-alt closed 3 months ago

vt-alt commented 3 months ago

I did quick smoke testing of 6.0.0 under valgrind on i586 and it reports the error:

+ valgrind --error-exitcode=2 --track-fds=yes --trace-children=yes --track-origins=yes ugrep -r -z -I -l std::chrono --stats
==3472237== Memcheck, a memory error detector
==3472237== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==3472237== Using Valgrind-3.17.0 and LibVEX; rerun with -h for copyright info
==3472237== Command: ugrep -r -z -I -l std::chrono --stats
==3472237==
==3472237== Thread 15:
==3472237== Source and destination overlap in memcpy(0x8e1f454, 0x8e1f454, 5875)
==3472237==    at 0x48473CD: memcpy (vg_replace_strmem.c:1051)
==3472237==    by 0x12FC01: UnknownInlinedFun (string_fortified.h:29)
==3472237==    by 0x12FC01: UnknownInlinedFun (zstream.hpp:1583)
==3472237==    by 0x12FC01: Zthread::decompress() (ugrep.cpp:1205)
==3472237==    by 0x4B2642F: ??? (in /usr/lib/libstdc++.so.6.0.32)
==3472237==    by 0x4DFCA67: ??? (in /usr/lib/libc.so.6)
==3472237==    by 0x4E846A5: clone (in /usr/lib/libc.so.6)
==3472237==
src/ugrep.cpp

Looks like address (0x8e1f454) is the same for src and dst and the buffer just [needlessly] copied to itself, so this all perhaps harmless, but this hiccups valgrind and prevent its use for smoke testing on i586. Curiously, on x86_64 the condition is not detected.

With such test patch (to prove there is no other overlaps) error disappears:

diff --git a/ugrep/src/zstream.hpp b/ugrep/src/zstream.hpp
index a0dd262..5aa63e9 100644
--- a/ugrep/src/zstream.hpp
+++ b/ugrep/src/zstream.hpp
@@ -1580,7 +1580,8 @@ class zstreambuf : public std::streambuf {
     if (num > len)
       num = len;

-    memcpy(buf, buf_ + cur_, num);
+    if (buf != buf_ + cur_)
+        memcpy(buf, buf_ + cur_, num);

     cur_ += num;
genivia-inc commented 3 months ago

OK, thanks. Indeed, there is no real overlap, but the buffer may point to the same location after get_buffer() in ugrep:1164. So it copies to the same location. This is a bit of a dirty trick (I know what I'm doing) to eliminate unnecessary buffer allocations, since we already have a buffer so we don't need another one.

I prefer to use memmove() instead of a guard in this case:

    // move decompressed data to destination buf, which can be the same as the source buffer after get_buffer()
    memmove(buf, buf_ + cur_, num);
genivia-inc commented 3 months ago

I'll close this as resolved, since this is copy in place is harmless (our tests do not show any issues either) and the next release will have the patch in place.

vt-alt commented 3 months ago

Thank you!