libnxz / power-gzip

POWER NX zlib compliant library
23 stars 18 forks source link

Make *Reset functions always restore NX state #188

Closed mscastanho closed 2 years ago

mscastanho commented 2 years ago

Please read commit messages for more details.

This is marked as draft because I haven't found a solution for #187 yet. This PR is affected by that issue in two situations:

  1. libnxz calls sw_deflateInit2_, then zlib calls deflateReset, which ends up calling libnxz's deflateReset.
  2. On deflateReset libnxz calls sw_deflateReset, which calls deflateResetKeep, which ends up using libnxz's deflateResetKeep.

The first one is easier to workaround because there won't be an entry in stream_map, so we now we have to call zlib's deflateReset. I couldn't find a way to detect calls from zlib for case 2 yet.

mscastanho commented 2 years ago

I believe the PR is now ready for review. Putting zlib in a different link namespace fixed issue #187, so I included it here. I also added tests for the reset functions to verify NX state is always being reset.

Note to reviewers: commit Make *Reset functions always restore NX state seems big, but the changes to nx_inflate.c and nx_deflate.c are pretty much the same.

mscastanho commented 2 years ago

Turns out test_stress.mix and some other tests are failing when libnxz is linked against older glibcs. It works fine with glibc 2.34 and later, but fails with glibc 2.32 (AT 14) and 2.31 (Debian 11). Bisect points to the commit "fixing" #187.

mscastanho commented 2 years ago

And I wonder if we need the map or it would work well just keeping the last entry.

@rzinsly I'm afraid we need the map for two reasons:

  1. If we create a single global var to save the last entry, all threads in a binary will point to the same field, so they could overwrite each other.
  2. Maybe we could save this global variable on TLS, but what if the user wants to maintain more than one stream at once in the same thread? Like for deflating and inflating different streams "at the same time" as it gradually receives data. In that case these multiple streams would all overwrite the same variable. Even though I haven't seen such an example yet, I wanted to avoid excluding that use case.
mscastanho commented 2 years ago

Turns out test_stress.mix and some other tests are failing when libnxz is linked against older glibcs. It works fine with glibc 2.34 and later, but fails with glibc 2.32 (AT 14) and 2.31 (Debian 11). Bisect points to the commit "fixing" https://github.com/libnxz/power-gzip/issues/187.

@rzinsly @tuliom I investigated further and this is what I found out: test_stress.sw and any other test that calls sw_deflateInit2_ in a multithreaded scenario is hitting a segfault on glibc's __get_nprocs. I bisected glibc and the commit that fixes it is:

commit b5eeca8cfd9d0fd92b5633a88901d9ff27f2b496
Author: Andreas Schwab <schwab@linux-m68k.org>
Date:   Tue Dec 8 19:17:41 2020 +0100

    Fix parsing of /sys/devices/system/cpu/online (bug 25859)

    The file contains comma-separated ranges, not spaces.

even though none of the hosts I tested the code on have commas and/or spaces in that file. This commit was included in glibc 2.33. The only distro with an older glibc that still passes the tests is the one from SLES15. Using Advance Toolchain 15 is enough to fix the issue as well. See below:

SLES15 (glibc 2.31): OK
RHEL8 (glibc 2.28): FAIL 
RHEL9 (glibc 2.34): OK
Ubuntu 22.04 (glibc 2.35): OK
Ubuntu 20.04 (glibc 2.31): FAIL
Debian 10 (glibc 2.28): FAIL
Debian 11 (glibc 2.31): FAIL
Advance Toolchain 14.0 (glibc 2.32): FAIL
Advance Toolchain 15.0 (glibc 2.34): OK

If I run the same test in containers (tested with RHEL8 and Debian 11) it works 🤷🏼 . Also note this only happens when glibc is in a separate link namespace (dlmopen(LM_ID_NEWLM,...). So I don't think we can ignore this issue, but at the same time I couldn't find a solution for this, or an alternative to solve #187. Ideas?

tuliom commented 2 years ago

I have just updated this PR with a slightly modified implementation. In this new implementation, I give a different interpretation on how deflateResetKeep() and inflateResetKeep() should behave. In his original proposal, @mscastanho interpreted that both these function should behave as their counterparts deflateReset() and inflateReset(). While I agree with his implementation and I believe that's the most logical too, it is not practical to move forward with it because of bug #187 . In this new interpretation deflateResetKeep() and inflateResetKeep() act as simpler reset functions, i.e. they reset only the stream being used without modifying the other one or recovering the NX stream after switching to software. This is important in order to avoid that zlib's deflate()/inflate() ends up calling libnxz's deflateResetKeep()/inflateResetKeep() which would recover the NX stream while zlib still expects its own stream there.

Note: I noticed a segmentation fault on RHEL 7, but I've been able to reproduce this issue with code already in develop.

tuliom commented 2 years ago

The *Reset_skeleton functions are not needed anymore but I'm fine with keeping them. LGTM

Let me remove them. I'm also planning to improve documentation. I think this will be important in order to explain the current decisions in this proposal.

tuliom commented 2 years ago

Done!

mscastanho commented 2 years ago

@tuliom The changes look good, I just didn't understand the reason for this one:

diff --git a/test/test_reset.c b/test/test_reset.c
index 4625f03..57069d0 100644
--- a/test/test_reset.c
+++ b/test/test_reset.c
@@ -25,16 +25,18 @@

 static void validate_stream(z_stream *strm, struct internal_state *sw_state,
                            nx_streamp hw_state) {
+#ifndef KEEP
        /* NX state should always be loaded */
        assert(has_nx_state(strm));

-       /* Pointers to internal structures shouldn't change */
-       nx_streamp s = (nx_streamp) strm->state;
-       assert(s == hw_state);
-       assert(s->sw_stream == sw_state);
-
        /* Should be ready for fallback */
-       assert(s->switchable == 1);
+       assert(hw_state->switchable == 1);
+#endif
+
+       if (hw_state->sw_stream) {
+               /* Pointers to internal structures shouldn't change */
+               assert(hw_state->sw_stream == sw_state);
+       }
 }

 int main()

The way it's currently written we're not checking that the currently loaded NX state is switchable, only the initial one (from before deflateReset), and we're also not checking that it didn't change.

tuliom commented 2 years ago

@mscastanho You're right. I shouldn't have changed the code path for deflateReset() and inflateReset(). Just for *ResetKeep().

While fixing this, I figured out the code setting hw_state and sw_state could be set closer to deflateInit() and inflateInit(). I also modified the code that zeroes strm in order to zero all the structure, instead of a few fields.

These are the new differences:

$ $ git diff  7ac8ea2 -- test/test_reset.c
diff --git a/test/test_reset.c b/test/test_reset.c
index 4625f03..e659696 100644
--- a/test/test_reset.c
+++ b/test/test_reset.c
@@ -25,6 +25,12 @@

 static void validate_stream(z_stream *strm, struct internal_state *sw_state,
                            nx_streamp hw_state) {
+#ifdef KEEP
+       if (hw_state->sw_stream) {
+               /* Pointers to internal structures shouldn't change */
+               assert(hw_state->sw_stream == sw_state);
+       }
+#else
        /* NX state should always be loaded */
        assert(has_nx_state(strm));

@@ -35,6 +41,7 @@ static void validate_stream(z_stream *strm, struct internal_state *sw_state,

        /* Should be ready for fallback */
        assert(s->switchable == 1);
+#endif
 }

 int main()
@@ -55,9 +62,9 @@ int main()
                return TEST_ERROR;
        }

-       strm.zalloc = Z_NULL;
-       strm.zfree = Z_NULL;
-       strm.opaque = (voidpf)0;
+       /* Ensure there is no garbage in the structure, which could cause
+          segmentation fault.  */
+       memset(&strm, 0, sizeof(z_stream));

        /* Reseting an uninitialized stream must fail */
        zcheck(CALL_DEFLATE_RESET(&strm), Z_STREAM_ERROR);
@@ -77,11 +84,12 @@ int main()
                        return TEST_SKIP;
                }

+               nx_streamp hw_state = (nx_streamp) strm.state;
+               struct internal_state *sw_state = hw_state->sw_stream;
+
                /* Reseting between init and deflate should be OK */
                zcheck(CALL_DEFLATE_RESET(&strm), Z_OK);

-               nx_streamp hw_state = (nx_streamp) strm.state;
-               struct internal_state *sw_state = hw_state->sw_stream;
                strm.next_in   = src;
                strm.next_out  = compr;
                strm.avail_in  = rounds[i] == SW ? 100 : src_len;
@@ -102,7 +110,10 @@ int main()
                        return TEST_SKIP;
                }

-               /* Reseting between init and deflate should be OK */
+               hw_state = (nx_streamp) strm.state;
+               sw_state = hw_state->sw_stream;
+
+               /* Reseting between init and inflate should be OK */
                zcheck(CALL_INFLATE_RESET(&strm), Z_OK);

                strm.next_in   = compr;
mscastanho commented 2 years ago

@tuliom @rzinsly Turns out I can't review my own PR 😂 The changes LGTM. So if it's OK with you, I think we can merge this.