feuler closed this issue 4 months ago
Never mind. It works when I use GLIBC_2.17 instead of GLIBC_2.2.5.
@feuler Glad you found a solution!
Did you amend nvshare's code or did the problem stem from your configuration?
I changed some lines, since the default libc.so.6 on my aarch64 Linux doesn't have GLIBC_2.2.5, as mentioned in my first post.
Changed line 390 in src/hook.c to: r_dlsym = (dlsym_t*)dlvsym(RTLD_NEXT, "dlsym", "GLIBC_2.17");
Line 974 in src/hook.c to:
asm(".symver dlsym_225, dlsym@@GLIBC_2.17");
And changed every "GLIBC_2.2.5" in src/libnvshare-symbols.ld to "GLIBC_2.17".
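The three edits above hard-code the aarch64 version string. A minimal sketch of how the dlvsym lookup could choose the string per architecture at compile time is below. It is not nvshare's code: the `DLSYM_SYMBOL_VERSION` macro, the `dlsym_fn` typedef and the `lookup_real_dlsym` helper are hypothetical names, and only the two architectures mentioned in this thread are covered.

```c
#define _GNU_SOURCE /* exposes dlvsym() and RTLD_NEXT in <dlfcn.h> */
#include <dlfcn.h>
#include <stddef.h>

/* Symbol version of dlsym per architecture. Both values come from this
 * thread (x86_64: GLIBC_2.2.5, aarch64: GLIBC_2.17); any other port would
 * need its own entry. The macro name is hypothetical. */
#if defined(__aarch64__)
#define DLSYM_SYMBOL_VERSION "GLIBC_2.17"
#elif defined(__x86_64__)
#define DLSYM_SYMBOL_VERSION "GLIBC_2.2.5"
#else
#error "add the glibc symbol version of dlsym for this architecture"
#endif

typedef void *(*dlsym_fn)(void *handle, const char *symbol);

/* Hypothetical helper: look up dlsym under the version selected above.
 * A preloaded hook library would pass RTLD_NEXT; a standalone test
 * program can pass RTLD_DEFAULT. Returns NULL if the lookup fails. */
static dlsym_fn lookup_real_dlsym(void *handle)
{
    return (dlsym_fn)dlvsym(handle, "dlsym", DLSYM_SYMBOL_VERSION);
}
```

The `.symver` directive and the linker script would still need the matching string, so this only covers the first of the three changes.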
Anyway, the scheduler and LD_PRELOAD were working, but nvshare failed to catch and transform all cudaMalloc calls to cudaMallocManaged (I got CUDA out of memory again). In the meantime, I was able to solve my problem by patching the application to use cudaMallocManaged.
I think you were getting out of memory because nvshare by default limits the memory each process can allocate to the physical GPU memory size:
https://github.com/grgalex/nvshare/blob/8a15dd1094a91678bf9cdf07e5951f6b28c01cf0/src/hook.c#L663
So it might be the case that your application was trying to allocate more GPU memory than the physical GPU memory size.
You can try running your application again with the environment variable NVSHARE_ENABLE_SINGLE_OVERSUB=1 set.
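To illustrate the behaviour described above, here is a rough sketch of such a limit check. It is not the code at the linked line of hook.c: the function names, the parameters and the way the variable is parsed (exactly "1" enables it) are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* True only when NVSHARE_ENABLE_SINGLE_OVERSUB is set to exactly "1".
 * How nvshare itself parses the variable is an assumption here. */
static bool single_oversub_enabled(void)
{
    const char *value = getenv("NVSHARE_ENABLE_SINGLE_OVERSUB");
    return value != NULL && strcmp(value, "1") == 0;
}

/* Hypothetical admission check: without oversubscription, a process may
 * not hold more than the physical GPU memory size in total. */
static bool allocation_allowed(size_t already_allocated, size_t requested,
                               size_t physical_bytes, bool oversub)
{
    if (oversub)
        return true; /* limit lifted, the request is passed through */
    if (already_allocated > physical_bytes)
        return false;
    /* Written as a subtraction so the comparison cannot overflow. */
    return requested <= physical_bytes - already_allocated;
}
```

A caller would pass `single_oversub_enabled()` as the last argument; with the variable unset, a request that pushes the total past the physical size is refused, which the application sees as an out-of-memory error.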
The Ubuntu 22.04 aarch64 system I use doesn't have GLIBC_2.2.5. I get this error when trying to run with LD_PRELOAD: [NVSHARE][FATAL]: libnvshare.so: undefined symbol: dlsym, version GLIBC_2.2.5
Is it possible to make it work on aarch64?
strings /usr/lib/aarch64-linux-gnu/libc.so.6 | grep GLIBC
GLIBC_2.17 GLIBC_2.18 GLIBC_2.22 GLIBC_2.23 GLIBC_2.24 GLIBC_2.25 GLIBC_2.26 GLIBC_2.27 GLIBC_2.28 GLIBC_2.29 GLIBC_2.30 GLIBC_2.31 GLIBC_2.32 GLIBC_2.33 GLIBC_2.34 GLIBC_2.35 GLIBC_PRIVATE GNU C Library (Ubuntu GLIBC 2.35-0ubuntu3.8) stable release version 2.35
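The strings output lists every version node in libc.so.6, not the version that dlsym itself carries. A small probe like the one below could check that directly. This is only a sketch: `find_dlsym_version` is a hypothetical helper, and the candidate list holds just the two versions mentioned in this thread.

```c
#define _GNU_SOURCE /* exposes dlvsym() in <dlfcn.h> */
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: return the first candidate version under which
 * dlsym can be resolved through the given handle, or NULL if none match. */
static const char *find_dlsym_version(void *handle)
{
    static const char *const candidates[] = { "GLIBC_2.2.5", "GLIBC_2.17" };
    size_t count = sizeof candidates / sizeof candidates[0];

    for (size_t i = 0; i < count; i++) {
        if (dlvsym(handle, "dlsym", candidates[i]) != NULL)
            return candidates[i];
    }
    return NULL;
}
```

Calling `find_dlsym_version(RTLD_DEFAULT)` from a small test program and printing the result would show which string the hook needs on a given machine.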