LostRuins / koboldcpp

Run GGUF models easily with a KoboldAI UI. One File. Zero Install.
https://github.com/lostruins/koboldcpp
GNU Affero General Public License v3.0

"OSError: exception: access violation writing 0x0000000000000010" of Koboldcpp-rocm on RX6700XT #657

Open youngzyl opened 9 months ago

youngzyl commented 9 months ago

This issue happens on the latest test build release of koboldcpp-rocm

Version: KoboldCPP-v1.56.yr1-ROCm

Computer specs: CPU: R5 5600, RAM: 32 GB, GPU: RX 6700 XT

Loading model: C:\Users\test\LLM\koboldcpp\gguf\models\openchat\openchat-3.5-0106.Q5_K_M.gguf [Threads: 5, BlasThreads: 5, SmartContext: False, ContextShift: True]

The reported GGUF Arch is: llama


Identified as GGUF model: (ver 6) Attempting to Load...

Using automatic RoPE scaling. If the model has customized RoPE settings, they will be used directly instead!
System Info: AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 |
ggml_init_cublas: found 1 ROCm devices:
  Device 0: AMD Radeon RX 6700 XT, compute capability 10.3, VMM: no
Traceback (most recent call last):
  File "koboldcpp.py", line 2946, in <module>
  File "koboldcpp.py", line 2789, in main
  File "koboldcpp.py", line 351, in load_model
OSError: exception: access violation writing 0x0000000000000010
[21796] Failed to execute script 'koboldcpp' due to unhandled exception!

TheLapinMalin commented 9 months ago

Same issue here with a 6750XT.

It loads most models correctly, but fails on the latest quantized Gemma models (e.g. https://huggingface.co/rahuldshetty/gemma-7b-it-gguf-quantized or https://huggingface.co/LoneStriker/gemma-7b-it-GGUF/tree/main).

LostRuins commented 9 months ago

Support for gemma has not been included in 1.58 yet; you may have to try again next release.