KhronosGroup / KTX-Software

KTX (Khronos Texture) Library and Tools
Other
864 stars 228 forks source link

Heap corruption opening some ktx2 files. #643

Closed bluescan closed 1 year ago

bluescan commented 1 year ago

The ktx2 files causing the issue were generated from the latest NVidia Texture Tools Generator and can be found here: https://github.com/bluescan/tacent/tree/master/UnitTests/TestData/Images/KTX2

Specifically R8_A.ktx2 and B8G8R8_RGB.ktx2 (they are both 2D, with mipmaps, and supercompression). I'm not sure how 'well-formed' the files are as NVTT is closed source, but I figure LibKTX shouldn't crash in any case. The other test files in that folder seem to work fine. They were all generated using NV texture-tools exporter (standalone), just with different pixel formats.

As of LibKTX v4.1.0-rc3-8-g691e9ca3-dirty (quite recent), using the KTX_TEXTURE_CREATE_LOAD_IMAGE_DATA_BIT flag causes heap corruption (noticed by VS 2022) when calling ktxTexture_Destroy.

HEAP CORRUPTION DETECTED: after Normal block (#179) at 0x0000019C24BCA070.
CRT detected that the application wrote to memory after end of heap buffer.

Some minimal code to repo:

    ktxTexture* tex = nullptr;
    ktx_error_code_e result = ktxTexture_CreateFromNamedFile("R8_A.ktx2", KTX_TEXTURE_CREATE_LOAD_IMAGE_DATA_BIT, &tex);
    if ((result == KTX_SUCCESS) && tex)
        ktxTexture_Destroy(tex);   //<< Detected here

I just put this that code at the top of toktxApp::main

The corruption seems to be with the pixel data:

    if (This->kvData != NULL)
        free(This->kvData);
    if (This->pData != NULL)
        free(This->pData);      ///<<<HERE
    free(This->_protected);
}

Thx

bluescan commented 1 year ago

If helpful, the A8_A file is set as VK_FORMAT_R8_UNORM (cuz there is no other 8bit single channel Vulkan format), and the B8G8R8 file is set as VK_FORMAT_B8G8R8_SRGB by the NVidia exporter.

MarkCallow commented 1 year ago

Have you run ktx2check on the files to check their validity? If not, please do so.

From what source file is the snippet where the corruption happens?

Does the release version of your app crash or does the crash only happen with the debug CRT?

Please build a debug version of vkloadtests in VS2022 and then run vkloadtests <your file>, which should display your file, and see if the heap corruption is detected.

MarkCallow commented 1 year ago

Never mind about item 1. I've just done it. A8_A.ktx2 is not well formed. B8G8R8_RGB.ktx2 is. Here is what ktx2check says about A8_A.ktx2:

Issues in: A8_A.ktx2
    ERROR: DFD bytesPlane0 must be non-zero for non-supercompressed
    VK_FORMAT_R8_UNORM texture.
    ERROR: DFD says sRGB but vkFormat is not an sRGB format.

Please report this to NVIDIA and ask them to fix their tool.

EDIT: The file is zstd compressed so bytesPlane0 == 0 is valid. Looks like this file is causing a problem for ktx2check as well. However the report of the DFD issue is correct.

bluescan commented 1 year ago

I'll post to the nVidia texture tools dev forum regarding the A8_A.ktx2 -- pretty easy repo steps... and will stop using it as a test image. B8G8R8_RGB still crashes.

As for the source file -- I don't know where the corruption happens, but it is detected in texture.c during cleanup.

void
ktxTexture_destruct(ktxTexture* This)
{
    ktxStream stream = *(ktxTexture_getStream(This));

    if (stream.data.file != NULL)
        stream.destruct(&stream);
    if (This->kvDataHead != NULL)
        ktxHashList_Destruct(&This->kvDataHead);
    if (This->kvData != NULL)
        free(This->kvData);
    if (This->pData != NULL)
        free(This->pData);       // <<<<Here. Line 237
    free(This->_protected);
}

Line 237.

As for release vs debug... I was building a static release lib for use by tacentview and noticed the crash. I then just cloned KTX-Software as-is and ran the VS generator cmake . -A x64 -B builddebug -G "Visual Studio 17 2022" -DCMAKE_BUILD_TYPE=Debug and then added the test-code above and got the same issue. It only detects the overwrite when CMAKE_BUILD_TYPE=Debug, but it's likely still there in Release.

Haven't figured out how to build vkloadtests yet, but can report back once I do.

bluescan commented 1 year ago

FYI https://forums.developer.nvidia.com/t/ill-formed-ktx2-file-generated/231805 regarding the DFD sRGB not matching the VK_FORMAT.

MarkCallow commented 1 year ago

I understand the cause of the incorrect flagging of bytesPlane0. The incorrect DFD (with the SRGB mismatch) causes a detailed analysis to be run in which there is a slight flaw. See #644.

I'll try to reproduce the crash next.

MarkCallow commented 1 year ago

I can't reproduce this with vkloadtests on macOS/Xcode with all memory diagnostics turned on. This may well be because vkloadtests does not use KTX_TEXTURE_CREATE_LOAD_IMAGE_DATA_BIT rather than any difference in debug memory diagnostics. I am interested to hear if you can reproduce it using VS2022.

I'll be away for the next couple of days so there'll be no more responses from me for a while.

bluescan commented 1 year ago

No worries. Messing with this stuff is mostly a weekend project for me. I can say that without the KTX_TEXTURE_CREATE_LOAD_IMAGE_DATA_BIT it doesn't crash on VS... but I need that bit since it's a viewer that can optionally edit the pixel-data. It might also be worth trying to put the minimal repo code (with correct path to the B8G8R8 file) at top of toktxApp::main and see what happens on macOS.

bluescan commented 1 year ago

image But, like you mentioned, it's not using KTX_TEXTURE_CREATE_LOAD_IMAGE_DATA_BIT.

If I change KTX_TEXTURE_CREATE_NO_FLAGS to KTX_TEXTURE_CREATE_LOAD_IMAGE_DATA_BIT in InstancedSampleBase::InstancedSampleBase (line 70 InstancedSampleBase.cpp), I get the heap corruption again.

MarkCallow commented 1 year ago

I made the same change in my vkloadtests. I am not able to reproduce the heap corruption. None is detected by the xcode memory management diagnostics and there is no crash. This is true with either of the files. One thing I did notice is that the pData pointer is not cleared inktxTexture_destruct so if it is called twice free will be called twice for the same block of memory. If that was happening I'd expect a different corruption message from VS2022, e.g. freeing already free memory.

Can you do more debugging?

Edited to add ...

In vkloadtests the ktxTexture is destructed after the ktxTexture_VkUpload and before the texture is displayed on screen. So even with VS2022 detecting corruption the app didn't crash. Also, in case you start setting breakpoints in ktxTexture_destruct, note that vkloadtests when in display file mode calls ktxTexture_CreateFromNamedFile and ktxTexture_destruct twice. The first time is to determine the structure of the texture (2D, array, cubemap, etc). The second time is to load and display the texture in the tool selected by the first step. You and I both modified the second call. The first does not set the LOAD_IMAGE_BIT.

bluescan commented 1 year ago

Yeah... I was wondering about the double-call -- that explains it.

I did a bit more poking around. The good news is it appears to be just overwriting the end of the buffer... and it doesn't seem to be by much. Unfortunately I'm not familiar enough with the codebase to offer the final solution, but here's what I did (it's a bit messy, but it's just debugging)

In texture2.c where it mallocs the mem in ktxTexture2_LoadImageData I added some extra room +8:

        This->pData = malloc(inflatedDataCapacity + 8);
        printf("pData Start: %p\n", This->pData);
        printf("pData   End: %p\n", This->pData+inflatedDataCapacity);
        invalidDestPointer = This->pData + inflatedDataCapacity;

invalidDestPointer is just a global uin8* I added that I could extern elsewhere... it's the address of the first byte after the original malloc size (not to be written to.. but is ok cuz of the +8). Then I set a data/memory breakpoint to see if anyone was writing to that address and got a hit.

The hit was in zstd.c in ZSTD_copy8 (ln 7657). I then added a bit of printf debugging there:

static void ZSTD_copy8(void* dst, const void* src) {
#if !defined(ZSTD_NO_INTRINSICS) && defined(__ARM_NEON)
    vst1_u8((uint8_t*)dst, vld1_u8((const uint8_t*)src));
#else

    if (((unsigned char*)dst + 7) >= invalidDestPointer)
    {
        printf("dst: %p  srcSize: %d\n", dst, 8);
    }

    ZSTD_memcpy(dst, src, 8);
#endif
}

And the output was:

pData Start: 0000023FC4B78070
pData   End: 0000023FC4EFC01C
dst: 0000023FC4EFC015  srcSize: 8

the dst (C015 plus 7 more bytes gets us to C01C, which we're not supposed to write to) -- also no crash by adding the hardcoded extra to the malloc (while using KTX_TEXTURE_CREATE_LOAD_IMAGE_DATA_BIT in the loadtest)

(I'm a little surprised msvc was able to detect such a small overrun.. maybe things just line up nicely for some files/dimensions)

I'm not sure if it's an out-by-1 error somewhere in the uncompressed size computation, or if the zstd stuff needs a bit of buffer at the end cuz it's done in 'chunks'... but anyway, just adding those few extra bytes to the malloc has so far solved it... I just don't think that's the right place to put the final fix ;)

"Programming bugs generally fall into 2 main categories: input validation issues, thread synchronization bugs, and out-by-1 errors."

MarkCallow commented 1 year ago

Thanks for the useful information.

This line in ktxtexture2_LoadImageData may be the key. https://github.com/KhronosGroup/KTX-Software/blob/18342a68d0e8f1e4b0062a671ecbf64108f7f9fb/lib/texture2.c#L2196

inflatedDataSize is used to malloc the output buffer for the zstd decompression.

The calculation is done by summing the uncompressed byte lengths for each level given in the levelIndex of the file. So if you only see the problem with certain files, maybe the uncompressed level sizes are incorrect in those files. I haven't time to check them right now. They're easily calculated from the format and the pixel size.

I'm not ready to rule out a problem in KTX-Software.

By the way, the reason no corruption is detected when LOAD_IMAGE_DATA_BIT is not used is because in that case the data is loaded directly into a staging buffer allocated by Vulkan rather than into the heap.

MarkCallow commented 1 year ago

I found that ktx2check was not comparing uncompressedByteLengths with the value expected from the VkFormat for any supercompressed ktx2 file. I've modified it to do so when there is a valid VkFormat such as is the case with zstd supercompression. The 2 files in question here have valid uncompressedByteLengths for each level.

I'll have to look more closely at that zstd code. Please provide the stack trace down to ZSTD_copy8. It is one of the callers of that that is suspicious.

MarkCallow commented 1 year ago

I modified the zstd inflation code at starting at line 2348 in lib/texture2.c to add some checks of the inflated level sizes returned by zstd and the calculated level sizes used to calculate the destination buffer size.

    ktx_size_t idcStart = inflatedDataCapacity;
    dctx = ZSTD_createDCtx();
    for (int32_t level = This->numLevels - 1; level >= 0; level--) {
        size_t levelByteLength =
            ZSTD_decompressDCtx(dctx, pInflatedData + levelOffset,
                              inflatedDataCapacity,
                              &pDeflatedData[cindex[level].byteOffset],
                              cindex[level].byteLength);
        ktx_size_t calcLevelByteLength = ktxTexture_calcLevelSize((ktxTexture*)This, level, KTX_FORMAT_VERSION_TWO);
        if (ZSTD_isError(levelByteLength)) {
            ZSTD_ErrorCode error = ZSTD_getErrorCode(levelByteLength);
            switch(error) {
              case ZSTD_error_dstSize_tooSmall:
                return KTX_INVALID_VALUE; // inflatedDataCapacity too small.
              case ZSTD_error_memory_allocation:
                return KTX_OUT_OF_MEMORY;
              default:
                return KTX_FILE_DATA_ERROR;
            }
        }
        assert(calcLevelByteLength == levelByteLength);
        nindex[level].byteOffset = levelOffset;
        nindex[level].uncompressedByteLength = nindex[level].byteLength =
                                                            levelByteLength;
        ktx_size_t paddedLevelByteLength
              = _KTX_PADN(uncompressedLevelAlignment, levelByteLength);
        inflatedByteLength += paddedLevelByteLength;
        levelOffset += paddedLevelByteLength;
        inflatedDataCapacity -= paddedLevelByteLength;
        assert(inflatedDataCapacity < idcStart);
    }

I also noticed and fixed in the above an error in the calculation of inflatedByteLength and inflatedDataCapacity. This error resulted in ZSTD_decompressDCtx being told there was more space in the buffer than actually available so could be why zstd overran the buffer. Therefore I am very interested to hear your result with this fix.

I loaded B8G8R8_RGB.ktx2. The levelByteLength values returned by zstd exactly match the expected values. They also did before I added the fix.

bluescan commented 1 year ago

Thanks. I'll give it a go and get back...

bluescan commented 1 year ago

Super! Yeah -- I think that was exactly where the overrun was coming from... cuz I'm having no issues after applying your texture2.c changes above. It looks like the decompressor was using a more efficient block memcpy if there was room available (i.e. if it thought there was room available ;) )

I've tested converting the uncompressed texture data into rgba and saving as tga. All good for all the BC formats, the two files above (A8_A.ktx2 and B8G8R8_RGB.ktx2), plus tested a bunch of floating-point formats.

Thanks for fixing this.

MarkCallow commented 1 year ago

Terrific. I'm delighted it fixed the problem.