ggerganov / llama.cpp

LLM inference in C/C++
MIT License

Bug: [vulkan] llama.cpp not work on Raspberry Pi 5 #9801

Open FanShupei opened 1 month ago

FanShupei commented 1 month ago

What happened?

I noticed that https://github.com/ggerganov/llama.cpp/issues/5237 is a prior issue for the same bug. However, it was closed without being either confirmed or fixed.

When I run with the validation layer enabled, it reports several errors. I'd like to ask whether these errors are fixable, or whether they indicate that the underlying GPU lacks features that llama.cpp requires (in which case the problem is unfixable).

Reproduction command (any model triggers it):

    VK_INSTANCE_LAYERS=VK_LAYER_KHRONOS_validation gdb build/bin/llama-bench -m models/llama32-1b-instruct-f16.gguf

The full log generated by the validation layer is too long to include here. See https://pastebin.com/nRJgLFY4

Name and Version

version: 3880 (f3fdcfaa)
built with cc (Debian 12.2.0-14) 12.2.0 for aarch64-linux-gnu
Platform: Raspberry Pi 5
OS: Debian 12
vulkan loader version: 1.3.239
vulkan device: V3D 7.1.7 (apiVersion: 1.2.255, 23.2.1)

What operating system are you seeing the problem on?

No response

Relevant log output

No response

FanShupei commented 1 month ago

To summarize the validation layer error log: all errors occur in ggml_vk_create_pipeline_func, and they fall into two categories:

  1. vkCreatePipelineLayout(): sum of storage buffer bindings among all stages (3) exceeds device maxDescriptorSetUpdateAfterBindStorageBuffers limit (0)
  2. Shader uses 33792 bytes of shared memory, more than allowed by physicalDeviceLimits::maxComputeSharedMemorySize (16384)

For error 2, I understand that the shader uses more shared memory than the device allows; we may be able to fix that by rewriting the shader. For error 1, I really don't understand what it means.
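For anyone who wants to reproduce this kind of summary from the full log, here is a minimal sketch that counts validation messages per VUID. It assumes every message starts with a line of the form `VUID-...(SEVERITY / TYPE):`, as in the excerpt in this comment; the function name `summarize` and the abbreviated sample lines are only illustrative and are not part of llama.cpp or the validation layer.

```python
import re
from collections import Counter

# Assumed message header format, taken from the excerpt in this issue:
#   VUID-RuntimeSpirv-Workgroup-06530(ERROR / SPEC): msgNum: ...
VUID_HEADER = re.compile(r"^(VUID-[\w-]+)\((\w+) / (\w+)\)")


def summarize(log_text):
    """Return a Counter mapping each VUID to the number of messages seen."""
    counts = Counter()
    for line in log_text.splitlines():
        match = VUID_HEADER.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


# Abbreviated sample lines (message bodies shortened for illustration).
sample = """\
    Objects: 1
        [0] 0x55563c26f460, type: 3, name: NULL
VUID-VkPipelineLayoutCreateInfo-pSetLayouts-03039(ERROR / SPEC): msgNum: 2004556686 - ...
VUID-RuntimeSpirv-Workgroup-06530(ERROR / SPEC): msgNum: -1405964136 - ...
VUID-RuntimeSpirv-Workgroup-06530(ERROR / SPEC): msgNum: -1405964136 - ...
"""

for vuid, count in summarize(sample).most_common():
    print(f"{count:3d}  {vuid}")
```

Grouping by VUID rather than by message text keeps the categories stable even when handles and counts differ between messages.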

Below is part of the error message:

    Objects: 1
        [0] 0x55563c26f460, type: 3, name: NULL
VUID-VkPipelineLayoutCreateInfo-pSetLayouts-03039(ERROR / SPEC): msgNum: 2004556686 - Validation Error: [ VUID-VkPipelineLayoutCreateInfo-pSetLayouts-03039 ] Object 0: handle = 0x55563c26f460, type = VK_OBJECT_TYPE_DEVICE; | MessageID = 0x777b1b8e | vkCreatePipelineLayout(): sum of storage buffer bindings among all stages (3) exceeds device maxDescriptorSetUpdateAfterBindStorageBuffers limit (0). The Vulkan spec states: The total number of descriptors of the type VK_DESCRIPTOR_TYPE_STORAGE_BUFFER accessible across all shader stages and across all elements of pSetLayouts must be less than or equal to VkPhysicalDeviceDescriptorIndexingProperties::maxDescriptorSetUpdateAfterBindStorageBuffers (https://www.khronos.org/registry/vulkan/specs/1.3-extensions/html/vkspec.html#VUID-VkPipelineLayoutCreateInfo-pSetLayouts-03039)
    Objects: 1
        [0] 0x55563c26f460, type: 3, name: NULL
VUID-VkShaderModuleCreateInfo-pCode-01091(ERROR / SPEC): msgNum: -1480880714 - Validation Error: [ VUID-VkShaderModuleCreateInfo-pCode-01091 ] Object 0: handle = 0x55563c26f460, type = VK_OBJECT_TYPE_DEVICE; | MessageID = 0xa7bb8db6 | vkCreateShaderModule(): The SPIR-V Capability (Int8) was declared, but none of the requirements were met to use it. The Vulkan spec states: If pCode declares any of the capabilities listed in the SPIR-V Environment appendix, one of the corresponding requirements must be satisfied (https://www.khronos.org/registry/vulkan/specs/1.3-extensions/html/vkspec.html#VUID-VkShaderModuleCreateInfo-pCode-01091)
    Objects: 1
        [0] 0x55563c26f460, type: 3, name: NULL
VUID-RuntimeSpirv-Workgroup-06530(ERROR / SPEC): msgNum: -1405964136 - Validation Error: [ VUID-RuntimeSpirv-Workgroup-06530 ] Object 0: handle = 0x13a000000013a, type = VK_OBJECT_TYPE_SHADER_MODULE; | MessageID = 0xac32b098 | Shader uses 33792 bytes of shared memory, more than allowed by physicalDeviceLimits::maxComputeSharedMemorySize (16384) The Vulkan spec states: The sum of size in bytes for variables and padding in the Workgroup storage class in the GLCompute {ExecutionModel} must be less than or equal to maxComputeSharedMemorySize (https://www.khronos.org/registry/vulkan/specs/1.3-extensions/html/vkspec.html#VUID-RuntimeSpirv-Workgroup-06530)
    Objects: 1
        [0] 0x13a000000013a, type: 15, name: NULL
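
To see at a glance which device limits are exceeded and by how much, the numbers can be pulled out of the messages. This is only a sketch: the two regular expressions are written against the exact wording of the messages quoted above and may not match other validation layer versions, and `limit_violations` is a hypothetical helper name.

```python
import re

# Patterns assume the message wording shown in the excerpt above.
SHARED_MEM = re.compile(
    r"Shader uses (\d+) bytes of shared memory, more than allowed by "
    r"physicalDeviceLimits::(\w+) \((\d+)\)"
)
DESCRIPTORS = re.compile(r"\((\d+)\) exceeds device (\w+) limit \((\d+)\)")


def limit_violations(log_text):
    """Return sorted, de-duplicated (limit_name, requested, allowed) tuples."""
    found = set()
    for m in SHARED_MEM.finditer(log_text):
        found.add((m.group(2), int(m.group(1)), int(m.group(3))))
    for m in DESCRIPTORS.finditer(log_text):
        found.add((m.group(2), int(m.group(1)), int(m.group(3))))
    return sorted(found)


# Fragments copied from the two messages quoted in this comment.
excerpt = (
    "vkCreatePipelineLayout(): sum of storage buffer bindings among all stages (3) "
    "exceeds device maxDescriptorSetUpdateAfterBindStorageBuffers limit (0).\n"
    "Shader uses 33792 bytes of shared memory, more than allowed by "
    "physicalDeviceLimits::maxComputeSharedMemorySize (16384)\n"
)

for name, requested, allowed in limit_violations(excerpt):
    print(f"{name}: requested {requested}, allowed {allowed}")
```

For the shared memory case this makes the gap concrete: the shader requests 33792 bytes against a limit of 16384, so it would need to shrink to less than half its current shared memory footprint to fit.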
Abhranta commented 1 month ago

If you get it running, please share the benchmark numbers if you can. I would love to know whether the numbers I am getting are optimal.