libnxz / power-gzip

POWER NX zlib compliant library
23 stars 18 forks source link

git error using libnxz #181

Open brihh opened 2 years ago

brihh commented 2 years ago

are there any installed tools that can be used to demonstrate libnxz? git uses libz, so i tried it, but i get errors:

[root@p182n11 aa]# export LD_PRELOAD=/usr/lib64/libnxz.so [root@p182n11 aa]# ldd type -p git;time git clone https://github.com/libnxz/power-gzip linux-vdso64.so.1 (0x00007fff8cf80000) /usr/lib64/libnxz.so (0x00007fff8ca20000) libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007fff8c960000) libz.so.1 => /lib64/libz.so.1 (0x00007fff8c920000) libpthread.so.0 => /lib64/glibc-hwcaps/power9/libpthread-2.28.so (0x00007fff8c8d0000) librt.so.1 => /lib64/glibc-hwcaps/power9/librt-2.28.so (0x00007fff8c8a0000) libcrypto.so.1.1 => /lib64/libcrypto.so.1.1 (0x00007fff8c520000) libc.so.6 => /lib64/glibc-hwcaps/power9/libc-2.28.so (0x00007fff8c310000) libdl.so.2 => /lib64/libdl.so.2 (0x00007fff8c2e0000) /lib64/ld64.so.2 (0x00007fff8cfa0000) Cloning into 'power-gzip'... remote: Enumerating objects: 3908, done. remote: Counting objects: 100% (909/909), done. remote: Compressing objects: 100% (335/335), done. BUG: zlib.c:57: total_in mismatch fatal: fetch-pack: invalid index-pack output

real 0m3.941s user 0m0.011s sys 0m0.000s

[root@p182n11 aa]# unset LD_PRELOAD [root@p182n11 aa]# cd /lib64 [root@p182n11 lib64]# ll libz lrwxrwxrwx. 1 root root 19 Sep 24 2021 libzhuyin.so.13 -> libzhuyin.so.13.0.0 -rwxr-xr-x. 1 root root 1266536 Sep 24 2021 libzhuyin.so.13.0.0 lrwxrwxrwx. 1 root root 14 Apr 12 04:48 libz.so -> libz.so.1.2.11 lrwxrwxrwx. 1 root root 14 Apr 12 04:48 libz.so.1 -> libz.so.1.2.11 -rwxr-xr-x. 1 root root 136000 Apr 12 04:48 libz.so.1.2.11 lrwxrwxrwx. 1 root root 16 Jun 5 2020 libzstd.so -> libzstd.so.1.4.4 lrwxrwxrwx. 1 root root 16 Jun 5 2020 libzstd.so.1 -> libzstd.so.1.4.4 -rwxr-xr-x. 1 root root 796872 Jun 5 2020 libzstd.so.1.4.4 [root@p182n11 lib64]# mv libz.so.1.2.11 libz.so.1.2.11__orig [root@p182n11 lib64]# ln -s /usr/lib64/libnxz.so libz.so.1.2.11 [root@p182n11 lib64]# ll libz lrwxrwxrwx. 1 root root 19 Sep 24 2021 libzhuyin.so.13 -> libzhuyin.so.13.0.0 -rwxr-xr-x. 1 root root 1266536 Sep 24 2021 libzhuyin.so.13.0.0 lrwxrwxrwx. 1 root root 14 Apr 12 04:48 libz.so -> libz.so.1.2.11 lrwxrwxrwx. 1 root root 14 Apr 12 04:48 libz.so.1 -> libz.so.1.2.11 lrwxrwxrwx. 1 root root 20 Jun 22 07:02 libz.so.1.2.11 -> /usr/lib64/libnxz.so -rwxr-xr-x. 1 root root 136000 Apr 12 04:48 libz.so.1.2.11__orig lrwxrwxrwx. 1 root root 16 Jun 5 2020 libzstd.so -> libzstd.so.1.4.4 lrwxrwxrwx. 1 root root 16 Jun 5 2020 libzstd.so.1 -> libzstd.so.1.4.4 -rwxr-xr-x. 1 root root 796872 Jun 5 2020 libzstd.so.1.4.4 [root@p182n11 lib64]# cd - /opt/brianh/aa [root@p182n11 aa]# [root@p182n11 aa]# ldd type -p git;time git clone https://github.com/libnxz/power-gzip linux-vdso64.so.1 (0x00007fffa4f40000) libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007fffa4980000) libz.so.1 => /lib64/libz.so.1 (0x00007fffa4920000) libpthread.so.0 => /lib64/glibc-hwcaps/power9/libpthread-2.28.so (0x00007fffa48d0000) librt.so.1 => /lib64/glibc-hwcaps/power9/librt-2.28.so (0x00007fffa48a0000) libcrypto.so.1.1 => /lib64/libcrypto.so.1.1 (0x00007fffa4520000) libc.so.6 => /lib64/glibc-hwcaps/power9/libc-2.28.so (0x00007fffa4310000) libdl.so.2 => /lib64/libdl.so.2 (0x00007fffa42e0000) /lib64/ld64.so.2 (0x00007fffa4f60000) git: symbol lookup error: /lib64/libcrypto.so.1.1: undefined symbol: zError

this is on an E1080 w/ RHEL86 [root@p182n11 aa]# dnf list libnxz* Updating Subscription Management repositories. Red Hat Enterprise Linux 8 for Power, little endian - AppStream (RPMs) 88 kB/s | 4.5 kB 00:00
Red Hat Enterprise Linux 8 for Power, little endian - BaseOS (RPMs) 82 kB/s | 4.1 kB 00:00
Installed Packages libnxz.ppc64le 0.63-2.el8 @epel libnxz-devel.ppc64le 0.63-2.el8 @epel libnxz-static.ppc64le 0.63-2.el8 @epel [root@p182n11 aa]# rpm -qi libnxz Name : libnxz Version : 0.63 Release : 2.el8 Architecture: ppc64le Install Date: Tue 21 Jun 2022 12:00:25 PM EDT Group : Unspecified Size : 311418 License : ASL 2.0 or GPLv2+ Signature : RSA/SHA256, Thu 14 Apr 2022 06:51:58 PM EDT, Key ID 21ea45ab2f86d6a1 Source RPM : libnxz-0.63-2.el8.src.rpm Build Date : Thu 14 Apr 2022 06:39:09 PM EDT Build Host : buildvm-ppc64le-36.iad2.fedoraproject.org Relocations : (not relocatable) Packager : Fedora Project Vendor : Fedora Project URL : https://github.com/libnxz/power-gzip Bug URL : https://bugz.fedoraproject.org/libnxz Summary : Zlib implementation for POWER processors Description : libnxz is a zlib-compatible library that uses the NX GZIP Engine available on POWER9 or newer processors in order to provide a faster zlib/gzip compression without using the general-purpose cores.

tuliom commented 2 years ago

are there any installed tools that can be used to demonstrate libnxz?

@brihh You could try scp, e.g. LD_PRELOAD=/usr/lib64/libnxz.so scp -C /path/to/large/file my.server:/dev/null

Just keep in mind you may not get the throughput from the NX GZIP accelerator because it might be restricted by the network.

brihh commented 2 years ago

i dont see anything in the /tmp/nx.log that indicates that it's working.

[2022/06/22 15:19:28] pid 35005: nx_zlib.c:1222: Warning: libnxz loaded [2022/06/22 15:19:28] pid 35005: Info: software zlib version:1.2.11 nx-zlib log file: /tmp/nx.log nx-zlib config file: ./nx-zlib.conf env variables ============== NX_GZIP_VERBOSE: '2' NX_GZIP_TRACE: '8' nx-zlib config file ======== nx-zlib configuration ====== verbose: 2 dev_num: -1 page_sz: 65536 def_buf_size: 1048576 trace: 8 dht: 0 strategy: 1 mlock_csb: 0 timeout_pgfaults: 300 soft_copy_threshold: 1024 cache_threshold: 8192 [2022/06/22 15:19:28] pid 35006: nx_zlib.c:1222: Warning: libnxz loaded [2022/06/22 15:19:28] pid 35006: Info: software zlib version:1.2.11 [2022/06/22 15:20:48] pid 35005: ### API call statistic: [2022/06/22 15:20:48] pid 35005: ### deflateInit: 0 [2022/06/22 15:20:48] pid 35005: ### deflate: 0 [2022/06/22 15:20:48] pid 35005: ### deflate(sw): 0 [2022/06/22 15:20:48] pid 35005: ### deflate(nx): 0 [2022/06/22 15:20:48] pid 35005: ### deflateBound: 0 [2022/06/22 15:20:48] pid 35005: ### deflateEnd: 0 [2022/06/22 15:20:48] pid 35005: ### compress: 0 [2022/06/22 15:20:48] pid 35005: ### inflateInit: 0 [2022/06/22 15:20:48] pid 35005: ### inflate: 0 [2022/06/22 15:20:48] pid 35005: ### inflate(sw): 0 [2022/06/22 15:20:48] pid 35005: ### inflate(nx): 0 [2022/06/22 15:20:48] pid 35005: ### inflateEnd: 0 [2022/06/22 15:20:48] pid 35005: ### uncompress: 0 [2022/06/22 15:20:48] pid 35005: ### deflate data length: 0 KiB [2022/06/22 15:20:48] pid 35005: ### deflate time: 0.00 secs [2022/06/22 15:20:48] pid 35005: ### deflate rate: nan MiB/s [2022/06/22 15:20:48] pid 35005: ### inflate data length: 0 KiB [2022/06/22 15:20:48] pid 35005: ### inflate time: 0.00 secs [2022/06/22 15:20:48] pid 35005: ### inflate rate: nan MiB/s [2022/06/22 15:20:48] pid 35005: ### nx_devices[0].open_cnt 0

[root@p182n11 Arcturas]# export LD_PRELOAD=/usr/lib64/libnxz.so
[root@p182n11 Arcturas]# ldd type -p scp;time scp -C aa.zip root@p182n11:/tmp linux-vdso64.so.1 (0x00007fffa80d0000) /usr/lib64/libnxz.so (0x00007fffa8030000) libcrypto.so.1.1 => /lib64/libcrypto.so.1.1 (0x00007fffa7ca0000) libdl.so.2 => /lib64/libdl.so.2 (0x00007fffa7c70000) libutil.so.1 => /lib64/libutil.so.1 (0x00007fffa7c40000) libz.so.1 => /lib64/libz.so.1 (0x00007fffa7c00000) libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007fffa7bb0000) libresolv.so.2 => /lib64/libresolv.so.2 (0x00007fffa7b70000) libselinux.so.1 => /lib64/libselinux.so.1 (0x00007fffa7b10000) libc.so.6 => /lib64/glibc-hwcaps/power9/libc-2.28.so (0x00007fffa7900000) libpthread.so.0 => /lib64/glibc-hwcaps/power9/libpthread-2.28.so (0x00007fffa78b0000) /lib64/ld64.so.2 (0x00007fffa80f0000) libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007fffa7800000) aa.zip 100% 2011MB 25.2MB/s 01:19

real 1m20.289s user 1m14.573s sys 0m5.897s

tuliom commented 2 years ago

@brihh The statistics are not working.

I see you're transferring an already compressed file. Adding another layer of compression may not help. I suggest that you:

  1. Transfer an uncompressed file that has at least hundreds of MB in size.
  2. Target /dev/null in order to avoid limitations in the server disk.

If you prefer, it's also possible to replace the target with localhost:/dev/null. In any case, the time spent will be very different with and without NX GZIP.

brihh commented 2 years ago

nope, still not seeing it.

[root@p182n11 aa]# ll e.tar;file e.tar -rw-r--r--. 1 root root 5464627200 Jun 22 16:20 e.tar e.tar: POSIX tar archive (GNU) [root@p182n11 aa]# unset LD_PRELOAD [root@p182n11 aa]# ldd type -p ssh;time scp -C e.tar localhost:/dev/null linux-vdso64.so.1 (0x00007fffb81a0000) libcrypto.so.1.1 => /lib64/libcrypto.so.1.1 (0x00007fffb7d00000) libdl.so.2 => /lib64/libdl.so.2 (0x00007fffb7cd0000) libutil.so.1 => /lib64/libutil.so.1 (0x00007fffb7ca0000) libz.so.1 => /lib64/libz.so.1 (0x00007fffb7c60000) libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007fffb7c10000) libresolv.so.2 => /lib64/libresolv.so.2 (0x00007fffb7bd0000) libselinux.so.1 => /lib64/libselinux.so.1 (0x00007fffb7b70000) libgssapi_krb5.so.2 => /lib64/libgssapi_krb5.so.2 (0x00007fffb7ae0000) libkrb5.so.3 => /lib64/libkrb5.so.3 (0x00007fffb79a0000) libk5crypto.so.3 => /lib64/libk5crypto.so.3 (0x00007fffb7960000) libcom_err.so.2 => /lib64/libcom_err.so.2 (0x00007fffb7930000) libc.so.6 => /lib64/glibc-hwcaps/power9/libc-2.28.so (0x00007fffb7720000) libpthread.so.0 => /lib64/glibc-hwcaps/power9/libpthread-2.28.so (0x00007fffb76d0000) /lib64/ld64.so.2 (0x00007fffb81c0000) libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007fffb7620000) libkrb5support.so.0 => /lib64/libkrb5support.so.0 (0x00007fffb75e0000) libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x00007fffb75b0000) e.tar 100% 5211MB 22.0MB/s 03:57

real 3m57.416s user 3m52.253s sys 0m5.960s [root@p182n11 aa]# export LD_PRELOAD=/usr/lib64/libnxz.so [root@p182n11 aa]# ldd type -p ssh;time scp -C e.tar localhost:/dev/null linux-vdso64.so.1 (0x00007fffa0f90000) /usr/lib64/libnxz.so (0x00007fffa0e20000) libcrypto.so.1.1 => /lib64/libcrypto.so.1.1 (0x00007fffa0a90000) libdl.so.2 => /lib64/libdl.so.2 (0x00007fffa0a60000) libutil.so.1 => /lib64/libutil.so.1 (0x00007fffa0a30000) libz.so.1 => /lib64/libz.so.1 (0x00007fffa09f0000) libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007fffa09a0000) libresolv.so.2 => /lib64/libresolv.so.2 (0x00007fffa0960000) libselinux.so.1 => /lib64/libselinux.so.1 (0x00007fffa0900000) libgssapi_krb5.so.2 => /lib64/libgssapi_krb5.so.2 (0x00007fffa0870000) libkrb5.so.3 => /lib64/libkrb5.so.3 (0x00007fffa0730000) libk5crypto.so.3 => /lib64/libk5crypto.so.3 (0x00007fffa06f0000) libcom_err.so.2 => /lib64/libcom_err.so.2 (0x00007fffa06c0000) libc.so.6 => /lib64/glibc-hwcaps/power9/libc-2.28.so (0x00007fffa04b0000) libpthread.so.0 => /lib64/glibc-hwcaps/power9/libpthread-2.28.so (0x00007fffa0460000) /lib64/ld64.so.2 (0x00007fffa0fb0000) libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007fffa03b0000) libkrb5support.so.0 => /lib64/libkrb5support.so.0 (0x00007fffa0370000) libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x00007fffa0340000) e.tar 100% 5211MB 21.2MB/s 04:06

real 4m6.374s user 3m56.145s sys 0m11.021s

brihh commented 2 years ago

rebooting fixed the scp issue - seems that the udev change wasn't taking effect until after a reboot.

brihh commented 2 years ago

tho i still see the git error, as well as issues with other commands when using the LD_PRELOAD.

mscastanho commented 2 years ago

(I did some early investigation yesterday. I'll write down the steps to debug this in case it's useful for anyone else later).

I believe there are multiple issues here. Trying to debug a git clone call is difficult, as there are several fork+exec calls and the issue happens in one of them. I used the following command to get a better notion of what was going on:

$ GIT_TRACE=2 GIT_TRACE_PACK_ACCESS=true LD_PRELOAD=$HOME/src/power-gzip/lib/.libs/libnxz.so git clone https://github.com/libnxz/power-gzip
15:48:22.706391 git.c:444               trace: built-in: git clone https://github.com/libnxz/power-gzip
Cloning into 'power-gzip'...
15:48:22.739301 run-command.c:664       trace: run_command: git remote-https origin https://github.com/libnxz/power-gzip
15:48:22.741725 git.c:730               trace: exec: git-remote-https origin https://github.com/libnxz/power-gzip
15:48:22.741750 run-command.c:664       trace: run_command: git-remote-https origin https://github.com/libnxz/power-gzip
15:48:23.209600 run-command.c:664       trace: run_command: git index-pack --stdin -v --fix-thin '--keep=fetch-pack 1689386 on laurel' --check-self-contained-and-connected
15:48:23.212112 git.c:444               trace: built-in: git index-pack --stdin -v --fix-thin '--keep=fetch-pack 1689386 on laurel' --check-self-contained-and-connected
remote: Enumerating objects: 3984, done.
remote: Counting objects: 100% (949/949), done.
remote: Compressing objects: 100% (378/378), done.
fatal: Out of memory, malloc failed (tried to allocate 300834186225 bytes)
fatal: index-pack failed

I isolated the git index-pack call by copying a pack-file from a git repo (located at <repo>/.git/objects/pack/) and calling the command on it.

LD_PRELOAD=$HOME/src/power-gzip/lib/.libs/libnxz.so git index-pack -o test.idx pack-9ac9e3b43c602a09ea8a2bcf2137d50487aed745.pack

git loops through all objects in the pack using parse_pack_objects, which parses their metadata in unpack_raw_entry and then decompresses their contents in unpack_entry_data. This is where the zlib calls happen. On my test git fails when trying to unpack the very first object. Some details:

before #1 inflate call:
avail_in: 4082
avail_out: 318 <-- expected size
flush: 0 (Z_NO_FLUSH)

after #1 inflate call:
avail_in: 0
avail_out: 318 <-- nothing generated
used_in: 4080 <-- 2 bytes not cached correspond to zlib header
used_out: 0
total_in: 4082
total_out: 0
return: Z_OK

after #1 inflate call (using zlib):
avail_in: 3868
avail_out: 0
total_in: 214
total_out: 318
return: Z_STREAM_END

Because libnxz tries to cache data if Z_FINISH is not given we delay the processing of the data, even though what we currently have already contains the entire stream. inflate returns with Z_OK and git keeps calling it. zlib on the other hand only consumes the exact amount of data needed to inflate the stream, and immediately spits out the decompressed data.

Simply disabling (commenting out) the code where we try to cache data on inflate is enough to have the same behavior as zlib in the example above.

With this workaround in place, I tried running git clone again:

$ LD_PRELOAD=$HOME/src/power-gzip/lib/.libs/libnxz.so git clone https://github.com/libnxz/power-gzip
Cloning into 'power-gzip'...
remote: Enumerating objects: 3984, done.
remote: Counting objects: 100% (985/985), done.
remote: Compressing objects: 100% (382/382), done.
fatal: pack has bad object at offset 2816068: inflate returned -5
fatal: fetch-pack: invalid index-pack output

So there's still more going on. I found out about GIT_TRACE_PACKFILE envvar and was able to get the pack file that is causing the issue to facilitate debugging.

# Got the file using zlib and tested with libnxz
$ GIT_TRACE_PACKFILE=/tmp/libnxz.pack GIT_TRACE=2 GIT_TRACE_PACK_ACCESS=true git clone https://github.com/libnxz/power-gzip
$ GIT_TRACE=2 GIT_TRACE_PACK_ACCESS=true LD_PRELOAD=$HOME/src/power-gzip/lib/.libs/libnxz.so git index-pack -o test2.idx /tmp/libnxz.pack
08:50:57.905215 git.c:460               trace: built-in: git index-pack -o test2.idx /tmp/libnxz.pack
fatal: pack has bad object at offset 2822583: inflate returned -5

This is a very specific case, in which there's a pack object of size 0 (why??), so inflate is called with avail_out = 0.

before inflate call:
avail_in: 1979
avail_out: 0
total_in: 0
total_out: 0

after inflate call (using libnxz):
avail_in: 1977
avail_out: 0
total_in: 2   <----- only 2 bytes consumed, correspond to zlib header
total_out: 0
return: Z_BUF_ERROR

after inflate call (using zlib):
avail_in: 1971
avail_out: 0
total_in: 8      <---- probably zlib header + empty block + checksum
total_out: 0
return: Z_STREAM_END

Note the "compressed object" has size 0, but the stream containing it does not because there's some overhead from the zlib format. So zlib actually processes 8 bytes, generating 0 decompressed bytes. Since nothing was generated, it doesn't complaing about avail_out being 0. This is not true for libnxz, we're exiting early here: https://github.com/libnxz/power-gzip/blob/e1d0b1c7b0df47acdbcc8e3c5fb0ca4d9f28f018/lib/nx_inflate.c#L1004-L1005

Even if we remove this an another early check, passing the empty stream to NX yields a CC=3. I still need to investigate more to understand what's going on.