likelovewant / ROCmLibs-for-gfx1103-AMD780M-APU

ROCm library files for gfx1103, updated with other arches of AMD GPUs, for use on Windows.
GNU General Public License v3.0

Does this ROCm library need to be upgraded to Hawk (HPT) for AMD Ryzen™ 7 8845HS? #8

Closed fudingyu closed 3 weeks ago

fudingyu commented 3 weeks ago

According to the AMD Ryzen AI official website, the AMD Ryzen™ 7 8845HS CPU uses Hawk (HPT), instead of the Phoenix (PHX) used by the earlier 7840HS.

So does the current ROCmLib need to be updated to the Hawk version?

https://ryzenai.docs.amd.com/en/latest/inst.html

Phoenix (PHX): AMD Ryzen™ 7940HS, 7840HS, 7640HS, 7840U, 7640U
Hawk (HPT): AMD Ryzen™ 8640U, 8640HS, 8645H, 8840U, 8840HS, 8845H, 8945H
Strix (STX): AMD Ryzen™ AI 9 HX 370, Ryzen™ AI 9 365

likelovewant commented 3 weeks ago

It's not about the CPU generation (Hawk or otherwise). ROCmLibs is tied specifically to your GPU.

If your GPU is an AMD Radeon RX 780M (gfx1103), then you should be able to use this.

This repository was created before Hawk (HPT) chips like the 8840HS were released, so it uses labels from the earlier "Phoenix" generation, but you can still use it.

fudingyu commented 3 weeks ago

But I can't run it on the 8845HS; I get error="llama runner process has terminated: exit status 0xc0000005".

https://github.com/likelovewant/ollama-for-amd/issues/16

likelovewant commented 3 weeks ago

> But I can't run it on the 8845HS; I get error="llama runner process has terminated: exit status 0xc0000005".
>
> likelovewant/ollama-for-amd#16

1. Make sure you have enough dedicated VRAM set in the BIOS, 8 GB or 16 GB.
2. Always make sure you have updated drivers.
3. HIP SDK 6.1.2: it's not a requirement, but it sometimes helps.
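As a quick sanity check for points 1 and 2, a dxdiag report shows both the installed driver version and the dedicated VRAM Windows sees (a minimal sketch using only built-in Windows tools; the report file name is arbitrary):

```
:: Write a plain-text DirectX diagnostic report.
dxdiag /t dxdiag_report.txt

:: Open dxdiag_report.txt and check the "Display Devices" section:
:: "Dedicated Memory" should match the VRAM set in the BIOS, and
:: "Driver Version" / "Driver Date" confirm the installed driver.
```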

@fudingyu

fudingyu commented 3 weeks ago

> But I can't run it on the 8845HS; I get error="llama runner process has terminated: exit status 0xc0000005". likelovewant/ollama-for-amd#16
>
> 1. Make sure you have enough dedicated VRAM set in the BIOS, 8 GB or 16 GB.
> 2. Always make sure you have updated drivers.
> 3. HIP SDK 6.1.2: it's not a requirement, but it sometimes helps.
>
> @fudingyu

Thank you for your reply,

According to the running log, VRAM should be sufficient:

```
msg="inference compute" id=0 library=rocm variant="" compute=gfx1103 driver=6.1 name="AMD Radeon 780M Graphics" total="9.0 GiB" available="8.9 GiB"
memory.available="[8.9 GiB]" memory.gpu_overhead="0 B" memory.required.full="1.2 GiB" memory.required.partial="1.2 GiB" memory.required.kv="96.0 MiB" memory.required.allocations="[1.2 GiB]" memory.weights.total="288.2 MiB" memory.weights.repeating="150.3 MiB" memory.weights.nonrepeating="137.9 MiB" memory.graph.full="298.5 MiB" memory.graph.partial="405.0 MiB"
```

My two computers use the same driver version and run the same installation package downloaded from https://github.com/likelovewant/ollama-for-amd/releases.

likelovewant commented 3 weeks ago

> But I can't run it on the 8845HS; I get error="llama runner process has terminated: exit status 0xc0000005". likelovewant/ollama-for-amd#16
>
> 1. Make sure you have enough dedicated VRAM set in the BIOS, 8 GB or 16 GB.
> 2. Always make sure you have updated drivers.
> 3. HIP SDK 6.1.2: it's not a requirement, but it sometimes helps. @fudingyu
>
> Thank you for your reply,
>
> According to the running log, VRAM should be sufficient:
>
> ```
> msg="inference compute" id=0 library=rocm variant="" compute=gfx1103 driver=6.1 name="AMD Radeon 780M Graphics" total="9.0 GiB" available="8.9 GiB"
> memory.available="[8.9 GiB]" memory.gpu_overhead="0 B" memory.required.full="1.2 GiB" memory.required.partial="1.2 GiB" memory.required.kv="96.0 MiB" memory.required.allocations="[1.2 GiB]" memory.weights.total="288.2 MiB" memory.weights.repeating="150.3 MiB" memory.weights.nonrepeating="137.9 MiB" memory.graph.full="298.5 MiB" memory.graph.partial="405.0 MiB"
> ```
>
> My two computers use the same driver version and run the same installation package downloaded from https://github.com/likelovewant/ollama-for-amd/releases.

Someone encountered similar issues on their local PC due to incorrect settings; unfortunately, I couldn't reproduce the same error.

Here's a breakdown of how VRAM allocation works and why you might be seeing this error (exit status 0xc0000005) even when memory looks available:

Understanding VRAM: on an APU, the GPU uses the dedicated VRAM carve-out set in the BIOS and can also borrow shared system memory.

Your Situation: your log reports total="9.0 GiB" and available="8.9 GiB", so the roughly 1.2 GiB the model requires should fit comfortably.

The "Crash Even with Enough Memory" Issue: exit status 0xc0000005 is a Windows access violation, so it can be triggered by a missing or mismatched library rather than by running out of memory.

Troubleshooting Steps:

  1. Try Different Models: Experiment with models of varying sizes to see if certain ones crash despite seemingly enough memory.
  2. Increase Dedicated VRAM (If Possible): If your BIOS allows it (and if it doesn't, someone shared unofficial workarounds in the issues section), try increasing the dedicated VRAM allocation. This gives the model more direct access to memory.
  3. Close Other Programs: Free up system memory by closing unnecessary applications before running your models.

If the above steps don't help, here are some potential solutions you could try:

  1. Update Drivers: Ensure you have the latest drivers installed for your hardware. Sometimes a simple reboot after the update can also resolve these issues.

  2. Try an Earlier Version: Downgrade to a previous version of Ollama or the specific model you're using. This might help identify if the issue is related to a recent update.

  3. Use the Ollama Release with ROCmLibs Overrides: https://github.com/likelovewant/ROCmLibs-for-gfx1103-AMD780M-APU/releases/tag/v0.6.1.2

    • Download the latest release from https://github.com/ollama/ollama/releases.
    • Replace its ROCm libraries with the ones from this repo, just as for other compatible GPU arches.
    • Completely close Ollama and open your terminal.
    • Set the environment variable: set HSA_OVERRIDE_GFX_VERSION=11.0.3.
    • Run your desired command (e.g., ollama run llama3.1). A full terminal session for this step is sketched after this list.
  4. Test with Alternative LLMs: Try using other LLM clients, such as LM Studio with Vulkan support. If the issue persists across different tools, it might indicate a problem specific to your local environment.
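For reference, step 3 as a single terminal session might look like this (a minimal sketch; the model name is just the example used in this thread):

```
:: Close any running Ollama instance first (check the system tray).
:: Start the server in a shell where the override is set, so it inherits it:
set HSA_OVERRIDE_GFX_VERSION=11.0.3
ollama serve

:: Then, in a second terminal:
ollama run llama3.1
```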

If all else fails, the issue could stem from conflicting software or settings on your system.

I've also seen some gfx803 cases where repeatedly retrying ollama run <model name> eventually loaded the model.

Hope this helps. @fudingyu

fudingyu commented 3 weeks ago

But after copying the files from the rocm_v6.1 directory into 'ollama-windows-amd64_for_amd\lib\ollama', I can run this command alone, and I can successfully send the POST request and get the returned data.

```
ollama_llama_server.exe --model C:\Users\kkk\.ollama\models\blobs\sha256-8de95da68dc485c0889c205384c24642f83ca18d089559c977ffc6a3972a71a8 --ctx-size 8192 --batch-size 512 --embedding --log-disable --n-gpu-layers 25 --parallel 4 --port 61282
```

```
curl.exe -i -X POST http://localhost:61282/completion -H "Content-Type: application/json" -d "{`"prompt`": `"Building a website can be done in 10 simple steps:`",`"n_predict`": 128}"
```

```
HTTP/1.1 200 OK
Access-Control-Allow-Origin:
Content-Length: 1146
Content-Type: application/json; charset=utf-8
Keep-Alive: timeout=5, max=5
Server: llama.cpp

{"content":" \n1. Design Your Website \n2. Create a Logo for Your Website \n3. Create an Easy-to-Use Navigation \n4. Choose a Domain Name \n5. Create and Publish Your Website \n\nDo you agree? \n\nCan you see any benefits of website design? Yes. There are several benefits to website design:\n\n1. SEO (Search Engine Optimization): A website designed with SEO in mind will provide your website with more visibility in search engine results.\n\n2. Increased Traffic: A website that looks and functions well can attract more visitors.\n\n3. Accessibility: A","model":"C:\\Users\\fdy10\\.ollama\\models\\blobs\\sha256-8de95da68dc485c0889c205384c24642f83ca18d089559c977ffc6a3972a71a8","slot_id":0,"stop":true,"stopped_eos":false,"stopped_limit":true,"stopped_word":false,"stopping_word":"","timings":{"predicted_ms":1285.926,"predicted_n":128,"predicted_per_second":99.5391647730896,"predicted_per_token_ms":10.046296875,"prompt_ms":38.028,"prompt_n":13,"prompt_per_second":341.85337120016834,"prompt_per_token_ms":2.9252307692307693},"tokens_cached":140,"tokens_evaluated":13,"tokens_predicted":128,"truncated":false}
```

fudingyu commented 3 weeks ago

Ok, I found the reason. I copied all these files, such as msvcp140.dll, into the 'ollama-windows-amd64_for_amd\lib\ollama\runners\rocm_v6.1\' directory, and now it runs.

It's just strange why this happened.
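In terminal form, that workaround might look like this (a minimal sketch; the directory layout comes from this thread, but which DLLs need copying is an assumption, since the exact file list isn't spelled out above):

```
:: From the root of the extracted ollama-windows-amd64_for_amd package,
:: copy the runtime DLLs (msvcp140.dll etc.) that sit next to the main
:: libraries into the rocm runner directory so the runner can find them:
cd ollama-windows-amd64_for_amd
copy lib\ollama\*.dll lib\ollama\runners\rocm_v6.1\
```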

likelovewant commented 3 weeks ago

It seems that works. Once you have OllamaSetup.exe installed, you may start the server with ./ollama serve in ollama-windows-amd64_for_amd, then open another terminal in the same location and run ollama run llama3.1. You don't need to specify the model location, as Ollama will pick it up from the default location.
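A minimal sketch of that two-terminal flow (llama3.1 is just the example model from this thread):

```
:: Terminal 1: start the server from the extracted package directory.
cd ollama-windows-amd64_for_amd
.\ollama serve

:: Terminal 2: run a model; its blobs are picked up automatically from
:: the default location (%USERPROFILE%\.ollama\models).
.\ollama run llama3.1
```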

likelovewant commented 3 weeks ago

> Ok, I found the reason. I copied all these files, such as msvcp140.dll, into the 'ollama-windows-amd64_for_amd\lib\ollama\runners\rocm_v6.1\' directory, and now it runs.
>
> It's just strange why this happened.

All the other files in the compressed package are needed, as the Ollama programs depend on them.
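To see why those files must stay together, a dependency listing of the runner binary shows the DLLs it links against (a sketch; dumpbin ships with Visual Studio and needs a Developer Command Prompt):

```
:: List the DLLs ollama_llama_server.exe expects to load at startup,
:: msvcp140.dll and friends included:
dumpbin /dependents ollama_llama_server.exe
```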