Open hdhoang opened 2 years ago
Looks like they're using their own malloc implementation:
$ readelf -aW clickhouse | rg "Symbol table|\b(malloc|realloc|free)\b"
Symbol table '.dynsym' contains 487796 entries:
173447: 0000000019a72b60 166 FUNC GLOBAL DEFAULT 15 free
235465: 0000000019a6e4c0 194 FUNC GLOBAL DEFAULT 15 malloc
434378: 0000000019a74620 7160 FUNC GLOBAL DEFAULT 15 realloc
So it's expected that it doesn't work. This kind of binary is currently only supported for Rust programs using jemalloc.
If you recompile CH with this disabled, then profiling should work.
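The check above can be scripted. Here is a minimal Python sketch that scans `readelf -sW`-style rows for allocator functions the binary defines itself (section index other than `UND`); when an executable defines these, a preloaded library does not get to intercept them. The function name, the symbol set, and the regular expression are illustrative assumptions, and the sample rows are the three copied from the output above.

```python
import re

# Allocator entry points an LD_PRELOAD-based profiler would want to interpose.
ALLOCATOR_SYMBOLS = {"malloc", "calloc", "realloc", "free"}

# One row of a readelf symbol table: Num: Value Size Type Bind Vis Ndx Name
SYMBOL_ROW = re.compile(
    r"^\s*\d+:\s+[0-9a-fA-F]+\s+\S+\s+(\w+)\s+(\w+)\s+(\w+)\s+(\S+)\s+(\S+)"
)

def defined_allocator_symbols(readelf_output):
    """Return the allocator functions that the binary itself defines."""
    found = set()
    for line in readelf_output.splitlines():
        match = SYMBOL_ROW.match(line)
        if not match:
            continue  # header lines, blank lines, other readelf sections
        sym_type, _bind, _vis, ndx, name = match.groups()
        name = name.split("@")[0]  # drop a version suffix such as @GLIBC_2.2.5
        # "UND" means the symbol is only imported, not defined here.
        if sym_type == "FUNC" and ndx != "UND" and name in ALLOCATOR_SYMBOLS:
            found.add(name)
    return found

# Rows copied from the readelf output in this thread.
SAMPLE = """\
173447: 0000000019a72b60 166 FUNC GLOBAL DEFAULT 15 free
235465: 0000000019a6e4c0 194 FUNC GLOBAL DEFAULT 15 malloc
434378: 0000000019a74620 7160 FUNC GLOBAL DEFAULT 15 realloc
"""

if __name__ == "__main__":
    print(sorted(defined_allocator_symbols(SAMPLE)))
```

An empty result would suggest the binary imports the allocator from a shared library, which is the case preloading can hook.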
much thanks for the prompt response!
I will try rebuilding a CH with linkable malloc. so far their -D ENABLE_TCMALLOC=0 advice gives this under bytehound:
thread '<unnamed>' panicked at 'not yet implemented: _rjem_sallocx', preload/src/api.rs:743:5
we'll try their guide next, and also try rebuilding without both TCMALLOC & their jemalloc
> I will try rebuilding a CH with linkable malloc. so far their -D ENABLE_TCMALLOC=0 advice gives thread '<unnamed>' panicked at 'not yet implemented: _rjem_sallocx', preload/src/api.rs:743:5 under bytehound.
Does this build use jemalloc? Because it looks like it might. It looks like it's calling sallocx, but I haven't implemented that one in Bytehound since Rust's jemallocator crate doesn't use it. It seems that you'll have to recompile CH so that it uses the system allocator instead. (Or sallocx could be implemented inside of Bytehound, but that's more effort.)
i guess so. here's a build with ENABLE_TCMALLOC=0 ENABLE_JEMALLOC=0, it's a new error:
bytehound: c7986 c7987 INF Writing initial header...
bytehound: c7986 c7987 INF Writing wall clock...
bytehound: c7986 c7987 INF Writing uptime...
bytehound: c7986 c7986 WRN Mapping 0x0000000000200000-0x0000000000201000 from '/root/ch-build/clickhouse' doesn't match any PT_LOAD entry
bytehound: c7986 c7986 WRN Duplicate PT_LOAD matches for a single memory region: Region { start: 406020096, end: 407764992, is_read: true, is_write: true, is_executable: false, is_shared: false, file_offset: 403910656, major: 9, minor: 1, inode: 167515591, name: "/root/ch-build/clickhouse" }
bytehound: c7986 c7987 INF Writing environ...
bytehound: c7986 c7986 WRN Match #0: LoadHeader { address: 405954816, file_offset: 403849472, file_size: 61632, memory_size: 61680, alignment: 4096, is_readable: true, is_writable: true, is_executable: false } => (AddressMapping { declared_address: 406016000, actual_address: 406020096, file_offset: 403910656, size: 1744896 }, LoadHeader { address: 405954816, file_offset: 403849472, file_size: 61632, memory_size: 61680, alignment: 4096, is_readable: true, is_writable: true, is_executable: false })
bytehound: c7986 c7986 WRN Match #1: LoadHeader { address: 406020608, file_offset: 403911168, file_size: 1743304, memory_size: 3882184, alignment: 4096, is_readable: true, is_writable: true, is_executable: false } => AddressMapping { declared_address: 406020096, actual_address: 406020096, file_offset: 403910656, size: 1744896 }
bytehound: c7986 c7987 INF Writing maps...
bytehound: c7986 c7987 INF Writing binaries...
bytehound: c7986 c7987 DBG Writing '/usr/lib/x86_64-linux-gnu/libresolv-2.31.so'...
bytehound: c7986 c7987 DBG Writing '/usr/lib/x86_64-linux-gnu/libc-2.31.so'...
fish: “env LD_PRELOAD=libbytehound.so…” terminated by signal SIGSEGV (Address boundary error)
here's how we build it in CH's builder image (i'll link their page here later):
v22.3.10.22-lts, including correct submodules
cd docker/packager/
# change defines here
export CMAKE_FLAGS="-DENABLE_JEMALLOC=0 -DENABLE_TCMALLOC=0"
#
./packager --cache ccache --output-dir /root/ch-build --package-type binary --compiler clang-13 --docker-image-version 39450-amd64
output dir will contain the big clickhouse executable. we forgot to use --build-type debug for this one though.
Okay, that's interesting. It is crashing, which definitely shouldn't happen.
These lines might be related:
bytehound: c7986 c7986 WRN Mapping 0x0000000000200000-0x0000000000201000 from '/root/ch-build/clickhouse' doesn't match any PT_LOAD entry
bytehound: c7986 c7986 WRN Duplicate PT_LOAD matches for a single memory region: Region { start: 406020096, end: 407764992, is_read: true, is_write: true, is_executable: false, is_shared: false, file_offset: 403910656, major: 9, minor: 1, inode: 167515591, name: "/root/ch-build/clickhouse" }
It might be because it's all done under docker (I do somewhat support running under docker, but probably not every corner case), or it's because CH might have used a weird linker and/or linker args when linking itself.
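To make the "Duplicate PT_LOAD matches" warning easier to read, here is a small Python sketch that only redoes the arithmetic on the numbers logged above. It assumes 4096-byte pages (the logged alignment) and that the executable is mapped at its declared addresses. The helper names are made up, and nothing here is a claim about how Bytehound actually matches regions. On these numbers, both headers touch the region's first file page, while only the second header's page-aligned address equals the region start.

```python
PAGE = 4096  # matches the "alignment: 4096" field in the log

# Values copied from the WRN lines above.
region = {"start": 406020096, "end": 407764992, "file_offset": 403910656}
headers = [
    {"address": 405954816, "file_offset": 403849472, "file_size": 61632},
    {"address": 406020608, "file_offset": 403911168, "file_size": 1743304},
]

def page_floor(value):
    """Round down to the start of the containing page."""
    return value - value % PAGE

def page_ceil(value):
    """Round up to the next page boundary."""
    return page_floor(value + PAGE - 1)

def file_pages_overlap(header, reg):
    """Do the file pages backing this header overlap the region's file range?"""
    h_start = page_floor(header["file_offset"])
    h_end = page_ceil(header["file_offset"] + header["file_size"])
    r_start = reg["file_offset"]
    r_end = r_start + (reg["end"] - reg["start"])
    return h_start < r_end and r_start < h_end

def address_matches(header, reg):
    """Does the header's page-aligned address equal the region start?"""
    return page_floor(header["address"]) == reg["start"]

if __name__ == "__main__":
    for i, header in enumerate(headers):
        print(i, file_pages_overlap(header, region), address_matches(header, region))
```

The first header's file data ends at offset 403911104 and the second begins at 403911168, so both fall inside the file page that starts at 403910656. That is consistent with two headers being reported for one mapping.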
You could maybe try to run it with MEMORY_PROFILER_USE_SHADOW_STACK=0 and then, once it crashes, load the coredump with gdb and see where exactly it SIGSEGVs.
> under docker
we only build the binary in docker (base is ubuntu 20.04), to mimic CH's packaging method. we run the resulting file on-host (debian11). i too like to avoid any complication from docker infrastructure, if possible.
here i have a backtrace (under gdb shows the same, shadow-stack=on/off is the same). it's something to do with avx2 memset when defining(?) error codes. http://ix.io/47gW
it doesn't look bytehound-related somehow.
> under gdb
I tried recording with rr too. under it, mmap perf_event_open fails, so i don't think you're interested in an exported record.
> sallocx could be implemented inside of Bytehound
would this be similar in shape to xallocx (checking input, then call real jem_sallocx)? how can we attempt to add it?
thanks in advance!
> would this be similar in shape to xallocx (checking input, then call real jem_sallocx)? how can we attempt to add it?
Yes, call the real function and fix up the size. (The allocations are bigger by default since they also contain a tag which Bytehound uses to track the allocations, but we can't let the original app know about this extra memory, as it then might try to overwrite it.)
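As a rough illustration of that size fix-up, here is a toy Python model of the arithmetic. It is not Bytehound's Rust code. `TAG_SIZE`, the 16-byte rounding, and all function names are hypothetical stand-ins, and the real tag size and layout are internal to Bytehound.

```python
TAG_SIZE = 8  # hypothetical tag size, chosen only for this illustration

def real_allocation_size(requested):
    """Stand-in for the underlying allocator: rounds sizes up to 16 bytes."""
    return (requested + 15) // 16 * 16

def wrapped_malloc(requested):
    """Profiler-side malloc: ask for extra room to hold the tracking tag.

    Returns the real size of the underlying allocation in this toy model.
    """
    return real_allocation_size(requested + TAG_SIZE)

def wrapped_sallocx(real_size):
    """Call through to the real size query, then hide the tag bytes.

    Reporting the unadjusted size would tell the application it may use
    the bytes that hold the tag.
    """
    return real_size - TAG_SIZE

if __name__ == "__main__":
    real = wrapped_malloc(100)    # the underlying allocator sees 108 bytes
    print(real, wrapped_sallocx(real))
```

In this model the application still sees at least the size it requested, but never the bytes reserved for the tag.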
> here i have a backtrace (under gdb shows the same, shadow-stack=on/off is the same). it's something to do with avx2 memset when defining(?) error codes. http://ix.io/47gW
Assuming the backtrace is correct, it's possible it's related. If you look at the lower frames you can see that this crash happens before main runs, which might mean that something weird is happening during static initialization.
> I tried recording with rr too. under it, mmap perf_event_open fails, so i don't think you're interested in an exported record.
perf_event_open is not necessary for correctness; it's just an optimization.
I'm also trying to test bytehound with ClickHouse and found exactly the same issues presented here. @hdhoang did you finally get it working?
sorry, we have no further progress. the CH people fixed our issue with jemalloc diff.
I tried adding sallocx in a PR, but don't want to pursue that further until the bytehound refactor happens (there's a mention in another issue).
(apologies for sparse details, I'll copy them more as i can formulate them concretely)
we're trying out bytehound to analyze a memleak. However, the dat file always comes out at 190MB, and is not updated after CH finishes starting up. Furthermore, the file has 0s runtime, 0B allocation.
We run on debian11, but CH builds are mostly self-contained. There are a few modes of starting up CH; you can try a one-shot command this way:
Later CH versions ignore LD_PRELOAD btw, so we'll have to use 22.3 here.
Flushing debug log:
final sql & bytehound output:
dat file is not updated beyond 2 seconds
and it contains no info
we would love to use bytehound's USR1 signaling to catch the relevant, long-term memleak. do you have any suggestion to gather more info?
we also tried LD_PRELOAD with both bytehound.so & system libjemalloc.so.2 at the same time, but it segfaults/aborts (details TBD). furthermore, in interactive mode (eg clickhouse local without -q), it aborts right away too.
thanks for the tooling!