KhronosGroup / KTX-Software

KTX (Khronos Texture) Library and Tools

Compressor leaks native memory when using Java interface on Windows/arm64 or Windows/x64 #720

Closed robnugent closed 1 year ago

robnugent commented 1 year ago

Take the following test program and run it on Windows/arm64 or Windows/x64. Within a few minutes the JVM will crash. The HotSpot dump file shows a stack trace such as the following:

Stack: [0x000000ee1c400000,0x000000ee1c500000],  sp=0x000000ee1c4fe910,  free space=1018k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [KERNELBASE.dll+0x6536c]
C  [VCRUNTIME140.dll+0x6480]
C  [ktx.dll+0x161e13]

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
J 964  org.khronos.ktx.KtxTexture2.compressAstcEx(Lorg/khronos/ktx/KtxAstcParams;)I libktx@4.2.0 (0 bytes) @ 0x000002e187c0d9f7 [0x000002e187c0d9a0+0x0000000000000057]
J 1072 c2 uk.co.abraded.toktx.ToKTXBug.convertToASTC(II)I uk.co.abraded.toktx (132 bytes) @ 0x000002e187c2764c [0x000002e187c27280+0x00000000000003cc]
j  uk.co.abraded.toktx.ToKTXBug.run()V+61 uk.co.abraded.toktx
j  java.lang.Thread.run()V+11 java.base@17.0.1
v  ~StubRoutines::call_stub

siginfo: EXCEPTION_UNCAUGHT_CXX_EXCEPTION (0xe06d7363), ExceptionInformation=0x0000000019930520 0x000000ee1c4fea70 0x00007ffdae248648 0x00007ffdae040000 

I'll try to append the whole dump file.

Sometimes, there is a message to stderr saying something like 'malloc failed'.

Note the following:

1) Windows Task Manager shows 'committed' memory spiralling during this test until the point of failure.
2) It fails even if you adjust the test to use just one thread - it just takes longer.
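As an alternative to watching Task Manager, the growth could be logged from inside the test. The following is a minimal sketch, not part of the original program: `CommittedMemoryProbe` is a hypothetical helper name, and it assumes a HotSpot-style JVM where the platform bean implements `com.sun.management.OperatingSystemMXBean`. It also assumes the reported figure is process-wide and therefore reflects allocations made by the native library, which has not been verified here.

```java
import java.lang.management.ManagementFactory;

/** Hypothetical helper, not part of the original test program. */
final class CommittedMemoryProbe {

    /**
     * Returns the committed virtual memory of this JVM process in bytes,
     * or -1 if the platform bean does not expose the figure.
     */
    static long committedBytes() {
        java.lang.management.OperatingSystemMXBean os =
                ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.OperatingSystemMXBean) {
            return ((com.sun.management.OperatingSystemMXBean) os)
                    .getCommittedVirtualMemorySize();
        }
        return -1L; // Not available on this JVM
    }

    /** Formats the figure in MiB so it can be appended to a log line. */
    static String describe() {
        final long bytes = committedBytes();
        return bytes < 0
                ? "committed: n/a"
                : "committed: " + (bytes / (1024L * 1024L)) + " MiB";
    }
}
```

Appending `CommittedMemoryProbe.describe()` to the per-iteration `println` in `run()` would then show whether the figure climbs steadily across iterations.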

package uk.co.abraded.toktx;

import java.text.DecimalFormat;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import org.khronos.ktx.KtxAstcParams;
import org.khronos.ktx.KtxCreateStorage;
import org.khronos.ktx.KtxErrorCode;
import org.khronos.ktx.KtxPackAstcBlockDimension;
import org.khronos.ktx.KtxPackAstcEncoderMode;
import org.khronos.ktx.KtxPackAstcQualityLevel;
import org.khronos.ktx.KtxTexture2;
import org.khronos.ktx.KtxTextureCreateInfo;
import org.khronos.ktx.VkFormat;

/**
 *
 * @author rob
 */
public class ToKTXBug implements Runnable {

    private static boolean nativeLoaded = false;
    private static final int NUM_THREADS = 16;
    private static final long ONE_HOUR_IN_MILLIS = 1 * 60 * 60 * 1000; // 1 Hour
    private static long tc = 0;
    private static final DecimalFormat DF2 = new DecimalFormat("00");

    private final Random r = new Random();

    private synchronized static long getTotalCount() {
        return tc++;
    }

    public static void main(String[] args) {
        System.out.println("Main() >");

        loadNative();

        final List<Thread> l = new ArrayList<>();

        for (int i = 0; i < NUM_THREADS; i++) {
            final ToKTXBug b = new ToKTXBug();
            final Thread t = new Thread(b, "Worker-" + DF2.format(i));
            t.setDaemon(false);
            t.start();
            l.add(t);
        }

        for (Thread t : l) {
            try {
                t.join();
                System.out.println("Joined: " + t.getName());
            } catch (InterruptedException ie) {
                // Restore the interrupt flag rather than swallowing it
                Thread.currentThread().interrupt();
            }
        }

        System.out.println("Main() <");
    }

    public ToKTXBug() {
    }

    @Override
    public void run() {
        System.out.println(Thread.currentThread().getName() + " run() >");

        int c = 0;
        final long startTime = System.currentTimeMillis();

        // Repeatedly create and compress an image.
        while ((System.currentTimeMillis() - startTime) < ONE_HOUR_IN_MILLIS) {
            final int w = 1500 + (r.nextInt() % 500); // Random width in 1001..1999 (% can yield a negative remainder)
            final int h = w; // Height same as width
            final int size = convertToASTC(w, h);
            System.out.println(getTotalCount() + " : " + Thread.currentThread().getName() + " iteration: " + c++ + ", size: " + w + "x" + h + ", compressed data size is " + size);
        }
        System.out.println(Thread.currentThread().getName() + " run() < ");
    }

    private synchronized static void loadNative() {
        if (!nativeLoaded) {
            System.loadLibrary("ktx");
            System.loadLibrary("ktx-jni");
            nativeLoaded = true;
        }
    }

    public int convertToASTC(int w, int h) {
        loadNative();

        // Create Uncompressed texture
        final KtxTextureCreateInfo info = new KtxTextureCreateInfo();
        info.setBaseWidth(w);
        info.setBaseHeight(h);
        info.setVkFormat(VkFormat.VK_FORMAT_R8G8B8_SRGB); // Uncompressed
        final KtxTexture2 t = KtxTexture2.create(info, KtxCreateStorage.ALLOC);

        // Pass the uncompressed data
        int bufferSize = w * h * 3;
        final byte[] rgbBA = new byte[bufferSize];
        t.setImageFromMemory(0, 0, 0, rgbBA);

        // Compress the data
        final KtxAstcParams p = new KtxAstcParams();
        p.setBlockDimension(KtxPackAstcBlockDimension.D8x8);
        p.setMode(KtxPackAstcEncoderMode.LDR);
        p.setQualityLevel(KtxPackAstcQualityLevel.EXHAUSTIVE);
        final int rc = t.compressAstcEx(p);
        if (rc != KtxErrorCode.SUCCESS) {
            throw new RuntimeException("ASTC error " + rc);
        }
        final int retDataLen = (int) t.getDataSize();

        // Free things up - segfault usually occurs inside this destroy() call
        t.destroy();

        return retDataLen;
    }
}

robnugent commented 1 year ago

I believe this is likely the same issue that I reported in issue #690 and that the fix to that issue just changed the symptoms.

robnugent commented 1 year ago

This is using the 4.2.0 release.

robnugent commented 1 year ago

TaskManager

robnugent commented 1 year ago

Warning - the test case can hard-lock your PC to the point where the mouse won't move.

robnugent commented 1 year ago

hs_err_pid9008.log

MarkCallow commented 1 year ago

Thank you for the report and sorry for the issue.

> This is using the 4.2.0 release.

Please try 4.3.0-alpha1. Quite a bit of work has been done on robustness in libktx as part of implementing the new tools.

robnugent commented 1 year ago

@MarkCallow - thanks for the quick response.

Is there somewhere I can download builds of 4.3.0-alpha1 please? I could only locate 4.2.0 builds on the releases page...

If I need to build these myself, it might take me a while to get to this, as I was pretty incompetent at the build process when I tried this previously...

Thanks, Rob

robnugent commented 1 year ago

@MarkCallow Apologies - please ignore the above - I just realized that the 'ASSETS' tab on the 4.3.0-alpha1 build can be expanded and has the binaries.

robnugent commented 1 year ago

@MarkCallow OK - I just tried 4.3.0-alpha1 and the same problem exists on both Windows/arm64 and Windows/x64.

robnugent commented 1 year ago

TaskManager-x64

robnugent commented 1 year ago

Tail of my output on Windows/arm64:

...
9536 : Worker-05 iteration: 548, size: 1760x1760, compressed data size is 774400
9537 : Worker-03 iteration: 598, size: 1743x1743, compressed data size is 760384
9538 : Worker-02 iteration: 598, size: 1886x1886, compressed data size is 891136
9539 : Worker-09 iteration: 604, size: 1032x1032, compressed data size is 266256
9540 : Worker-09 iteration: 605, size: 1339x1339, compressed data size is 451584
9541 : Worker-05 iteration: 549, size: 1134x1134, compressed data size is 322624
B:\svn\abraded\rob\code\trunk\src\modules\nbproject\build-impl.xml:1340: The following error occurred while executing this line:
B:\svn\abraded\rob\code\trunk\src\modules\nbproject\build-impl.xml:1024: Java returned: -1073741819
BUILD FAILED (total time: 1 minute 42 seconds)
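
The exit status -1073741819 reported above is easier to recognise as an unsigned 32-bit value: it is 0xC0000005, the Windows status code for an access violation. A small sketch of the conversion follows; `ExitCodes` is a hypothetical helper name used only for illustration.

```java
/** Hypothetical helper for reading Windows process exit codes. */
final class ExitCodes {

    /**
     * Renders a signed 32-bit exit code as unsigned hex, the form in which
     * Windows status codes are normally documented.
     */
    static String toHex(int exitCode) {
        // %X treats a negative int as its two's-complement bit pattern
        return String.format("0x%08X", exitCode);
    }
}
```

For example, `ExitCodes.toHex(-1073741819)` returns `"0xC0000005"`.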

MarkCallow commented 1 year ago

Thank you for the quick reply. We'll look into it.

MarkCallow commented 1 year ago

@wasimabbas-arm, in the case of success, ktxTexture2_CompressAstcEx does not appear to be freeing input_image, allocated between lines 667 and 676. It is only freed in case of failure. Please take a look.

@robnugent perhaps you can try freeing that, when you have time to set up a build yourself, and see if it fixes the problem.

robnugent commented 1 year ago

@MarkCallow - I just managed to build the code much more easily than I did last time.

Yes - this looks to be the problem. FYI - I just added one line of code (the line marked >>>):

            if (work.error != ASTCENC_SUCCESS) {
                std::cout << "ASTC compressor failed\n" <<
                             astcenc_get_error_string(work.error) << std::endl;

                imageFree(input_image);

                astcenc_context_free(astc_context);
                return KTX_INVALID_OPERATION;
            }
>>>         imageFree(input_image);

My test has been running for about 20 minutes now, without any obvious signs of leaking memory.
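
The shape of the bug, a buffer released on the error path but not on the success path, can be sketched in isolation. The example below is in Java for consistency with the test program. The real code is C++ inside libktx and is not reproduced here, and every name in the sketch (`ReleaseOnEveryPath`, `Buffer`, `compressLeaky`, `compressFixed`) is hypothetical. It uses `try`/`finally` rather than a second explicit free call, which is a different technique from the one-line change above but has the same effect.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a native buffer; none of these names exist in libktx. */
final class ReleaseOnEveryPath {

    /** Number of buffers allocated and not yet released. */
    static final AtomicInteger LIVE = new AtomicInteger();

    static final class Buffer {
        Buffer() { LIVE.incrementAndGet(); }
        void free() { LIVE.decrementAndGet(); }
    }

    /** Leaky shape: the buffer is released only when the work fails. */
    static boolean compressLeaky(boolean fail) {
        final Buffer input = new Buffer();
        if (fail) {
            input.free();
            return false;
        }
        return true; // Success path returns without releasing input
    }

    /** Fixed shape: finally releases the buffer on both paths. */
    static boolean compressFixed(boolean fail) {
        final Buffer input = new Buffer();
        try {
            return !fail;
        } finally {
            input.free();
        }
    }
}
```

With the leaky variant, each successful call leaves `LIVE` one higher, which matches a leak that only shows up when compression succeeds. With the fixed variant, `LIVE` returns to its starting value after either outcome.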

wasimabbas-arm commented 1 year ago

@MarkCallow potential fix https://github.com/KhronosGroup/KTX-Software/pull/721

MarkCallow commented 1 year ago

v4.2.1 and v4.3.0-alpha2 have been released with the fix to free the input image. I'm pretty sure this is the problem, so I'm closing this. If the problem still occurs, please reopen this issue if you can, or open a new bug.

robnugent commented 1 year ago

@MarkCallow - I just tried 4.2.1 and can confirm that this issue is indeed fixed.

Thanks again for the fast response on this.

Rob