likelovewant / ROCmLibs-for-gfx1103-AMD780M-APU

ROCm library files for gfx1103, updated with other AMD GPU architectures, for use on Windows.
GNU General Public License v3.0

(LM Studio) Failed to load LLM engine from path (6750XT/Windows) #11

Open · kaattaalan opened this issue 1 day ago

kaattaalan commented 1 day ago

Tried following the wiki https://github.com/likelovewant/ROCmLibs-for-gfx1103-AMD780M-APU/wiki/Unlock-LM-Studio-on-Any-AMD-GPU-with-ROCm-Guide#using-amd-graphics-cards-with-lm-studio

Copied the files and restarted LM Studio. The GPU (gfx1031) is still showing up as unsupported (screenshot).

When I tried loading the model anyway, I got the following error in LM Studio:

Failed to load LLM engine from path: C:\Users\Username.cache\lm-studio\extensions\backends\win-llama-rocm-0.2.31\llm_engine_rocm.node. The specified module could not be found. \?\C:\Users\Username.cache\lm-studio\extensions\backends\win-llama-rocm-0.2.31\llm_engine_rocm.node
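On Windows, "The specified module could not be found" usually points at a DLL that llm_engine_rocm.node depends on being missing, rather than the .node file itself. A minimal diagnostic sketch in Python, assuming the backend path from the error above (adjust the folder name to your version):

```python
# Sketch only: list what is actually present next to the failing
# llm_engine_rocm.node so missing DLLs are easier to spot.
# The backend path is an assumption based on the error message above.
from pathlib import Path

backend = (Path.home() / ".cache" / "lm-studio" / "extensions"
           / "backends" / "win-llama-rocm-0.2.31")

print("backend folder exists:", backend.is_dir())
for entry in sorted(backend.glob("*.node")) + sorted(backend.glob("*.dll")):
    print(entry.name)
```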

likelovewant commented 15 hours ago

Compatibility Issue with "compatible" Keyword

The compatible keyword might not be working at the moment due to recent updates to LM Studio. This is expected to be resolved in the next LM Studio release.

Workaround for Now:

While waiting for the fix, you can still make it work by following these steps:

  1. Ignore the compatible keyword: the "unsupported" label can be disregarded for now.
  2. Use ROCm 5.7 libraries: make sure your ROCm extension uses the ROCm 5.7 libraries, as LM Studio currently only supports this version.
  3. Replace the llama and GGML DLLs: download llama.dll and ggml.dll from the ollama-for-amd repository and replace the ones in your LM Studio extension (see the sketch after this list).
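As a concrete illustration of step 3, here is a minimal Python sketch. The extract location and the LM Studio backend folder are assumptions (the backend name is copied from the error message above) and need to be adjusted to your setup:

```python
# Sketch only: copy llama.dll and ggml.dll from an extracted ollama-for-amd
# release into the LM Studio ROCm backend folder. Both paths below are
# assumptions; point them at your own extract and extension directories.
import shutil
from pathlib import Path

ollama_rocm = Path(r"C:\Downloads\ollama-windows-amd64\ollama_runners\rocm_v5.7")  # assumed extract location
lmstudio_backend = (Path.home() / ".cache" / "lm-studio" / "extensions"
                    / "backends" / "win-llama-rocm-0.2.31")                        # assumed backend folder

for name in ("llama.dll", "ggml.dll"):
    src = ollama_rocm / name
    if src.is_file():
        shutil.copy2(src, lmstudio_backend / name)  # overwrites the existing DLL
        print(f"copied {name}")
    else:
        print(f"not found in the Ollama extract: {name}")
```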

Choosing the Correct Ollama Version:

kaattaalan commented 3 hours ago

It works !!

Thanks a lot for the reply. I tried your workaround and the models are loading and generating now. (I was getting GGGG output, but fixed it by turning Flash Attention on.)

But for some reason, it only works with version 1.1.5 of the extension. If I update it to 1.1.10 (to see whether the GGGG output is fixed) and do the same steps, it no longer works (screenshot). I installed the extension by following this guide: https://github.com/lmstudio-ai/configs/blob/main/Extension-Pack-Instructions.md

Here's what I did (for reference)

  1. I downloaded v0.3.6 of ollama-for-amd
  2. Copied llama.dll and ggml.dll from ollama-windows-amd64\ollama_runners\rocm_v5.7 into lm-studio\extensions\backends\llama.cpp-win-x86_64-amd-rocm-avx2-1.1.10, replacing the existing llama.dll
  3. Then I downloaded v0.5.7 of ROCmLibs-for-gfx1103-AMD780M-APU (https://github.com/likelovewant/ROCmLibs-for-gfx1103-AMD780M-APU/releases/download/v0.5.7/rocm.gfx1031.for.hip.sdk.5.7.optimized.with.little.wu.logic.and.I8II.support.7z)
  4. I then extracted rocblas.dll and the library folder into lm-studio\extensions\backends\vendor\win-llama-rocm-vendor-v2
  5. Restarted LM Studio and chose the correct runtime (screenshot)
  6. Then I was able to load the model (with Flash Attention on) without errors, even though the GPU still shows up as incompatible.
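For anyone repeating these steps, a small Python sketch to confirm the files ended up where steps 2-4 put them. The folder names are taken from this thread and are assumptions that may differ between LM Studio versions:

```python
# Sketch only: verify the layout described in steps 2-4 above.
# Folder names are assumptions copied from this thread.
from pathlib import Path

backends = Path.home() / ".cache" / "lm-studio" / "extensions" / "backends"  # assumed LM Studio location
runtime = backends / "llama.cpp-win-x86_64-amd-rocm-avx2-1.1.10"
vendor = backends / "vendor" / "win-llama-rocm-vendor-v2"

checks = [
    runtime / "llama.dll",
    runtime / "ggml.dll",
    vendor / "rocblas.dll",
    vendor / "library",  # rocBLAS kernel library folder from the ROCmLibs release
]
for path in checks:
    print("OK      " if path.exists() else "MISSING ", path)
```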