LWJGL / lwjgl3

LWJGL is a Java library that enables cross-platform access to popular native APIs useful in the development of graphics (OpenGL, Vulkan, bgfx), audio (OpenAL, Opus), parallel computing (OpenCL, CUDA) and XR (OpenVR, LibOVR, OpenXR) applications.
https://www.lwjgl.org
BSD 3-Clause "New" or "Revised" License

Core Dump with Vma on 3.3.2-SNAPSHOT #821

Closed by Illithidek 1 year ago

Illithidek commented 1 year ago

Version

3.3.2 (nightly)

Platform

Linux x64

JDK

openjdk-16.0.1

Module

VMA

Bug description

Currently with LWJGL 3.3.2-SNAPSHOT, I get a core dump when using VMA (specifically vmaCreateBuffer). Oddly, when the buffer size is big enough the core dump seems to disappear. From what I have seen, the problem occurs for sizes below roughly 35 MB; for 1 byte it always appears. Below is a code snippet:

// We pass this function a successfully created VMA allocator together with the Vulkan device
private fun createBuffer(
    device: VulkanDevice,
    allocationSize: Long,
    usage: Int,
    flags: Int = 0,
    memoryUsage: ResourceMemoryUsage = ResourceMemoryUsage.Auto,
    sharingMode: Int = VK_SHARING_MODE_EXCLUSIVE
  ): VulkanBufferBaseData {
    MemoryStack.stackPush().use { stack ->
      val bufferCreateInfo = VkBufferCreateInfo.calloc()
        .sType(VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO)
        .size(allocationSize)
        .usage(usage)
        .sharingMode(sharingMode)

      val vma = VmaAllocationCreateInfo.calloc(stack)
        .usage(memoryUsage.toVmaMemoryUsage())
        .flags(flags)

      val pBuffer = stack.mallocLong(1)
      val pAllocation = stack.mallocPointer(1)

      val vmaAllocInfo = VmaAllocationInfo.calloc()
      vmaCreateBuffer(
        device.vulkanMemoryAllocator.allocatorPointer,
        bufferCreateInfo,
        vma,
        pBuffer,
        pAllocation,
        vmaAllocInfo
      )
      // Core dump appears for allocationSize values such as 25108864
    }
}

Thanks in advance for any help!

Stacktrace or crash log output

Stack: [0x00007fb035d00000,0x00007fb035e00000],  sp=0x00007fb035dfb308,  free space=1004k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libc.so.6+0x154289]

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  org.lwjgl.util.vma.Vma.nvmaCreateBuffer(JJJJJJ)I+0
j  org.lwjgl.util.vma.Vma.vmaCreateBuffer(JLorg/lwjgl/vulkan/VkBufferCreateInfo;Lorg/lwjgl/util/vma/VmaAllocationCreateInfo;Ljava/nio/LongBuffer;Lorg/lwjgl/PointerBuffer;Lorg/lwjgl/util/vma/VmaAllocationInfo;)I+47

Spasi commented 1 year ago

Hey @Illithidek,

I cannot reproduce this, tried both Linux and Windows. There must be something else causing the crash. Could you prepare an MCVE please?

Btw, don't forget to pass stack to the struct allocation functions, else you end up with real allocations that need to be explicitly freed.

Illithidek commented 1 year ago

@Spasi Thanks for pointing out the freeing issue. I had been changing the code while looking for the error and then copied the "changed" version here ;(. Anyway, this is not the cause.

The strange thing is, everything works for me with lwjgl 3.3.2 except vma. I even ran a test: I added all packages from 3.3.2 as dependencies except vma, which stayed at 3.3.1, and it worked.

Anyway, I will try to provide an MCVE as soon as I can.

Illithidek commented 1 year ago

@Spasi First, here is a sample project that reproduces the core dump. I tried to clean it up as much as I could: MCVE-lwjgl-bug.zip

Also, I don't know if this is relevant, but I noticed a problem related to dependencies. Long story short: I downloaded lwjgl3-demos (https://github.com/LWJGL/lwjgl3-demos) and tried to see what differs from my project.

It turned out that with the same code but different dependencies, the demo with VMA may or may not work :). E.g., take the SimpleTriangle demo from https://github.com/LWJGL/lwjgl3-demos. If I just delete jemalloc from pom.xml, I get a core dump. Here is the pom with jemalloc removed: pom.xml.txt The same applies to the attached MCVE.

So from my analysis, the key is to add a dependency on jemalloc. When it is added, everything works (the demo samples, my MCVE sample and my engine). Maybe this will help track down the problem.

PS. On 3.3.1 everything works for me without adding dependency to jemalloc.

Spasi commented 1 year ago

@Illithidek Thanks for the MCVE!

I reproduced the crash and identified the problem. VMA uses aligned_alloc for all internal allocations, even when it doesn't need any special alignment (i.e. higher than malloc's guaranteed minimum). Turns out that one of those allocations is done with an alignment of 2, which is problematic on Linux because the system allocator implementation (the one used when jemalloc isn't available) uses posix_memalign internally. That function requires that alignment is a multiple of sizeof(void *), so it fails, returns NULL and VMA crashes.

The reason LWJGL uses posix_memalign had to do with the ridiculous GLIBC compatibility situation on Linux. The minimum GLIBC version required was bumped in LWJGL 3.3.0, so it should be fine to switch to aligned_alloc now. Will be fixed in the next snapshot.
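
The alignment constraint described above can be checked in isolation with a small C sketch. This is not LWJGL's or VMA's actual code; the helper names (try_posix_memalign, try_aligned_alloc, report_alignment_behavior) and the 64-byte request size are assumptions made for illustration. POSIX specifies that posix_memalign returns EINVAL when the alignment is not a power-of-two multiple of sizeof(void *), which is the case for an alignment of 2. Whether aligned_alloc accepts an alignment of 2 is left to the implementation, so the sketch only reports the result.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* expose posix_memalign under strict -std= modes */
#endif

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns the posix_memalign error code (0 on success)
 * and frees the block again if the allocation succeeded. */
int try_posix_memalign(size_t alignment, size_t size) {
    void *p = NULL;
    int rc = posix_memalign(&p, alignment, size);
    if (rc == 0) {
        free(p);
    }
    return rc;
}

/* Hypothetical helper: returns 1 if aligned_alloc returned a block, 0 if it
 * returned NULL. The size passed in should be a multiple of the alignment,
 * as older revisions of the C standard require. */
int try_aligned_alloc(size_t alignment, size_t size) {
    void *p = aligned_alloc(alignment, size);
    if (p == NULL) {
        return 0;
    }
    free(p);
    return 1;
}

/* Prints how each function reacts to the alignment of 2 mentioned in the thread. */
void report_alignment_behavior(void) {
    const size_t size = 64;  /* arbitrary size, a multiple of every alignment used here */

    /* Alignment 2 is not a multiple of sizeof(void *), so POSIX requires EINVAL. */
    printf("posix_memalign(alignment=2): rc=%d (EINVAL is %d)\n",
           try_posix_memalign(2, size), EINVAL);

    /* sizeof(void *) is the smallest alignment posix_memalign accepts. */
    printf("posix_memalign(alignment=%zu): rc=%d\n",
           sizeof(void *), try_posix_memalign(sizeof(void *), size));

    /* Implementation-dependent result, so it is only reported here. */
    printf("aligned_alloc(alignment=2): %s\n",
           try_aligned_alloc(2, size) ? "allocated" : "NULL");
}
```

Calling report_alignment_behavior() from a main function on the affected system shows whether the system allocator path would reject the request. A wrapper that turns the EINVAL result into a NULL pointer would produce the kind of failure described in this thread.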