likelovewant / ROCmLibs-for-gfx1103-AMD780M-APU

ROCm library files for gfx1103, updated with other architectures of AMD GPUs, for use on Windows.
GNU General Public License v3.0

Ryzen 7 4700u - gfx90c:xnack- not supported supported_types="[gfx1030 gfx1100 gfx1101 gfx1102 gfx906 gfx90c]" #4

Open delta-whiplash opened 3 weeks ago

delta-whiplash commented 3 weeks ago

Hello, I tried to run Ollama on Windows with the patched ROCm provided in this repo, but when I start Ollama I get this warning:

time=2024-06-17T14:49:12.220+02:00 level=WARN source=amd_windows.go:95 msg="amdgpu is not supported" gpu=0 gpu_type=gfx90c:xnack- library="C:\\Program Files\\AMD\\ROCm\\5.7\\bin" supported_types="[gfx1030 gfx1100 gfx1101 gfx1102 gfx906 gfx90c]"
time=2024-06-17T14:49:12.220+02:00 level=WARN source=amd_windows.go:97 msg="See https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md for HSA_OVERRIDE_GFX_VERSION usage"
time=2024-06-17T14:49:12.288+02:00 level=INFO source=types.go:71 msg="inference compute" id=0 library=cpu compute="" driver=0.0 name="" total="15.9 GiB" available="11.9 GiB"

I have a laptop with 32 GB of RAM, with 16 GB allocated to the system and 16 GB to the iGPU through this other project.

Can someone help me use my iGPU? I'm trying to run inference on phi3:medium-128k.

likelovewant commented 3 weeks ago

Try downloading the OllamaSetup.exe installer from [here](https://github.com/likelovewant/ollama-for-amd/releases) and retry the steps. Your GPU is detected as gpu_type=gfx90c:xnack- rather than gfx90c; according to the discussion here, gfx90c covers both gfx90c:xnack- and xnack+. For some reason, or because a driver update set your GPU to xnack-, it is not being matched. I have received feedback that the gfx90c ROCm build can run Ollama on the GPU, so I'm not sure why it doesn't work for you. Perhaps you could build a new rocBLAS for gfx90c:xnack- using the build guide available on this repo's wiki: take the gfx90c package, search for gfx90c and replace it with gfx90c:xnack- throughout the gfx90c folder, then build rocBLAS again. Or wait for the next release with gfx90c:xnack- ROCm.
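
If it helps, here is a minimal PowerShell sketch of that search-and-replace step. It assumes the gfx90c library files were extracted to a local folder and that the architecture name appears in text-based Tensile YAML files; the path is only an example, and binary .dat/.co files would need a real rebuild rather than a text edit.

# Illustrative sketch only: swap the arch string inside a copied gfx90c library folder.
# $src is an assumed path; point it at wherever the gfx90c rocBLAS files were extracted.
$src = 'C:\build\rocblas\library-gfx90c'

Get-ChildItem -Path $src -Recurse -File -Include *.yaml |
    ForEach-Object {
        # The negative lookahead avoids appending ":xnack-" twice if the script is re-run.
        (Get-Content $_.FullName -Raw) -replace 'gfx90c(?!:xnack)', 'gfx90c:xnack-' |
            Set-Content $_.FullName
    }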

delta-whiplash commented 3 weeks ago

Thanks for your answer @likelovewant. I'm using the latest version of all the tools we discussed, and I'm already using ollama-for-amd (sorry, I didn't say that). I have never built anything and I'm on Windows 11. How can I build for gfx90c:xnack-?

likelovewant commented 3 weeks ago

The rocBLAS support files for gfx90c, the guide, and the wiki method are available from the top bar of the repo; use those guides to build rocBLAS for gfx90c:xnack-, and then Ollama. Or wait for the next release within a week; the xnack- feature needs more testing since it is fairly old tech.

delta-whiplash commented 3 weeks ago

The next release will support my APU? So I should just wait a few days?

likelovewant commented 3 weeks ago

We can support gfx90c even though it is not officially supported, and that may also be possible for gfx90c:xnack-; it may need some tweaks. I will try to add it in the next release. You can wait for my release or build it yourself, as I have put all the guides on the wiki, so anyone should be able to help themselves.

delta-whiplash commented 3 weeks ago

I'll let you try to build it. Thank you, I hope you find a way. If it's not working I'm going to take a deeper look.

Thanks for your answers and help 😊

likelovewant commented 2 weeks ago

Please download the new release, v0.1.45-alpha, and replace the ROCm files with the gfx90c:xnack- and gfx90c ones, testing them one by one. Because the files were modified or came from a partial build, I'm not sure they work. You can test xnack- with test 1 and test 2, and also without xnack (gfx90c, no xnack-). Let me know your result. It took some time to build: it failed many times and I was unable to complete a successful build, so I manually edited some files and copied files from the half-built output. I have tested other cards with xnack features and they all failed; xnack features may never be supported on Windows. Anyway, test it and let me know the result.

delta-whiplash commented 2 weeks ago

Oh, thank you for your answer. I'm going to try it tomorrow.

network2smb commented 2 weeks ago

Hello, I have a very similar problem. When I start ollama (the patched version from likelovewant/ollama-for-amd) it runs using the CPU only and I see the following in the logs:

time=2024-06-26T16:07:40.209-03:00 level=WARN source=amd_windows.go:96 msg="amdgpu is not supported" gpu=0 gpu_type=gfx1012:xnack- library="C:\\Program Files\\AMD\\ROCm\\5.7\\bin" supported_types="[gfx1010 gfx1011 gfx1012 gfx1030 gfx1031 gfx1032 gfx1033 gfx1034 gfx1035 gfx1036 gfx1100 gfx1101 gfx1102 gfx1103]"

The GPU was supposed to be "gfx1012", but it's reporting with the suffix "xnack-". I read the instructions to build the libs with additional support for other GPUs, but as a complete noob I found them complicated to understand.

I see that you tried to help with the gfx90c, so maybe you could generate one of these half-built ROCm libs for gfx1012:xnack- and I could try it as well. I understand that you may be busy, and I can wait for the next version, of course. I'm running the llama3 model.

Thanks for your attention

likelovewant commented 2 weeks ago

These half-build tweaks may not work, since some information is missing. I have not received any feedback yet.

However, there is another approach you may try. Based on the discussion here, older drivers did not add the xnack- tail. You could try an older driver released before May 2021, or tweak the new drivers by editing the driver setup for navi14 (gfx1012) and removing the "xnack-" keywords for navi14. These files can be found in C:\AMD; search for the keywords navi14 and xnack-, edit them, and install the drivers from there directly. Alternatively, edit the corresponding files under C:\Windows\System32\drivers or the driver folders with other names (riskier, and it may crash your system). Make sure to back up your system and files before you try; when you are done, reboot and rerun the program. Or do nothing: I will add gfx1012:xnack- to the next Ollama release, and you can try the older rocBLAS libs for gfx1012 with the new release within a few hours.
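
For locating the files he mentions before editing anything by hand, here is a small read-only PowerShell sketch. C:\AMD is the usual unpack location for the AMD installer, but the exact file names and extensions vary by driver package, so treat this as a starting point only and back up before changing anything.

# Illustrative only: list driver setup files that mention navi14 or the xnack- suffix.
Get-ChildItem -Path 'C:\AMD' -Recurse -File -Include *.inf, *.txt, *.json -ErrorAction SilentlyContinue |
    Select-String -Pattern 'navi14', 'xnack-' |
    Select-Object Path, LineNumber, Line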

likelovewant commented 1 week ago

I have received feedback from another forum: gfx90c:xnack- is able to run with the new release, using the gfx90c rocBLAS and library rather than the tweaked gfx90c:xnack- rocBLAS. The only extra step is setting HSA_OVERRIDE_GFX_VERSION=9.0.12 in the environment variables, as tested in the example here.

Test by starting the server in the default Ollama program directory (i.e. C:\Users\UserName\AppData\Local\Programs\Ollama):

./ollama serve

Finally, in a separate shell, run a model:

./ollama run llama3

It is able to run on the GPU, but the running status does not show in Windows Task Manager; you can check your GPU usage with the AMD Adrenalin driver software.

The same should apply to gfx1012:xnack-: simply set HSA_OVERRIDE_GFX_VERSION=10.1.2 in the environment variables, replace the gfx1012 rocBLAS and library available on this repo, and test it. @delta-whiplash @network2smb
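
As a concrete example of applying that override, here is a PowerShell sketch assuming the default install path and a session-only variable (adapt it if you prefer a persistent user variable set through setx or System Properties):

# Session-only override; it disappears when this shell is closed.
$env:HSA_OVERRIDE_GFX_VERSION = '9.0.12'   # use 10.1.2 for gfx1012:xnack-

# Start the server from the default Ollama program directory,
# then run a model (e.g. .\ollama.exe run llama3) in a second shell.
Set-Location "$env:LOCALAPPDATA\Programs\Ollama"
.\ollama.exe serve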

delta-whiplash commented 1 week ago

Sorry for my late reply, I was busy.

For test 1 I have these logs:

2024/06/29 15:26:17 routes.go:1060: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE: OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:1 OLLAMA_MAX_QUEUE:512 OLLAMA_MAX_VRAM:0 OLLAMA_MODELS:C:\\Users\\Delta\\.ollama\\models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_RUNNERS_DIR:C:\\Users\\Delta\\AppData\\Local\\Programs\\Ollama\\ollama_runners OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:]"
time=2024-06-29T15:26:17.289+02:00 level=INFO source=images.go:730 msg="total blobs: 0"
time=2024-06-29T15:26:17.289+02:00 level=INFO source=images.go:737 msg="total unused blobs removed: 0"
time=2024-06-29T15:26:17.290+02:00 level=INFO source=routes.go:1106 msg="Listening on 127.0.0.1:11434 (version 0.1.45-alpha-0-g0e42bf5-dirty)"
time=2024-06-29T15:26:17.290+02:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu cpu_avx cpu_avx2 rocm_v5.7]"
time=2024-06-29T15:26:17.322+02:00 level=WARN source=amd_windows.go:96 msg="amdgpu is not supported" gpu=0 gpu_type=gfx90c:xnack- library="C:\\Program Files\\AMD\\ROCm\\5.7\\bin" supported_types="[gfx1030 gfx1100 gfx1101 gfx1102 gfx906 gfx90c]"
time=2024-06-29T15:26:17.322+02:00 level=WARN source=amd_windows.go:98 msg="See https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md for HSA_OVERRIDE_GFX_VERSION usage"
time=2024-06-29T15:26:17.388+02:00 level=INFO source=types.go:98 msg="inference compute" id=0 library=cpu compute="" driver=0.0 name="" total="15.9 GiB" available="9.8 GiB"

For test 2 I have this:


2024/06/29 15:27:46 routes.go:1060: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE: OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:1 OLLAMA_MAX_QUEUE:512 OLLAMA_MAX_VRAM:0 OLLAMA_MODELS:C:\\Users\\Delta\\.ollama\\models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_RUNNERS_DIR:C:\\Users\\Delta\\AppData\\Local\\Programs\\Ollama\\ollama_runners OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:]"
time=2024-06-29T15:27:46.117+02:00 level=INFO source=images.go:730 msg="total blobs: 0"
time=2024-06-29T15:27:46.117+02:00 level=INFO source=images.go:737 msg="total unused blobs removed: 0"
time=2024-06-29T15:27:46.118+02:00 level=INFO source=routes.go:1106 msg="Listening on 127.0.0.1:11434 (version 0.1.45-alpha-0-g0e42bf5-dirty)"
time=2024-06-29T15:27:46.118+02:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu cpu_avx cpu_avx2 rocm_v5.7]"
time=2024-06-29T15:27:46.142+02:00 level=WARN source=amd_windows.go:96 msg="amdgpu is not supported" gpu=0 gpu_type=gfx90c:xnack- library="C:\\Program Files\\AMD\\ROCm\\5.7\\bin" supported_types="[gfx1030 gfx1100 gfx1101 gfx1102 gfx906 gfx90c-xnack- gfx90c]"
time=2024-06-29T15:27:46.142+02:00 level=WARN source=amd_windows.go:98 msg="See https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md for HSA_OVERRIDE_GFX_VERSION usage"
time=2024-06-29T15:27:46.220+02:00 level=INFO source=types.go:98 msg="inference compute" id=0 library=cpu compute="" driver=0.0 name="" total="15.9 GiB" available="9.8 GiB"

And I'm facing the same issue with the gfx90c:

2024/06/29 15:29:15 routes.go:1060: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE: OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:1 OLLAMA_MAX_QUEUE:512 OLLAMA_MAX_VRAM:0 OLLAMA_MODELS:C:\\Users\\Delta\\.ollama\\models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_RUNNERS_DIR:C:\\Users\\Delta\\AppData\\Local\\Programs\\Ollama\\ollama_runners OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:]"
time=2024-06-29T15:29:15.456+02:00 level=INFO source=images.go:730 msg="total blobs: 0"
time=2024-06-29T15:29:15.457+02:00 level=INFO source=images.go:737 msg="total unused blobs removed: 0"
time=2024-06-29T15:29:15.457+02:00 level=INFO source=routes.go:1106 msg="Listening on 127.0.0.1:11434 (version 0.1.45-alpha-0-g0e42bf5-dirty)"
time=2024-06-29T15:29:15.458+02:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu cpu_avx cpu_avx2 rocm_v5.7]"
time=2024-06-29T15:29:15.489+02:00 level=WARN source=amd_windows.go:96 msg="amdgpu is not supported" gpu=0 gpu_type=gfx90c:xnack- library="C:\\Program Files\\AMD\\ROCm\\5.7\\bin" supported_types="[gfx1030 gfx1100 gfx1101 gfx1102 gfx906 gfx90c-xnack- gfx90c]"
time=2024-06-29T15:29:15.489+02:00 level=WARN source=amd_windows.go:98 msg="See https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md for HSA_OVERRIDE_GFX_VERSION usage"
time=2024-06-29T15:29:15.546+02:00 level=INFO source=types.go:98 msg="inference compute" id=0 library=cpu compute="" driver=0.0 name="" total="15.9 GiB" available="9.9 GiB"
delta-whiplash commented 1 week ago

When I try this, it seems to work for the serve command:

PS C:\Users\Delta\AppData\Local\Programs\Ollama> .\ollama.exe serve
2024/06/29 15:39:59 routes.go:1060: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION:9.0.12 OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE: OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:1 OLLAMA_MAX_QUEUE:512 OLLAMA_MAX_VRAM:0 OLLAMA_MODELS:C:\\Users\\Delta\\.ollama\\models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_RUNNERS_DIR:C:\\Users\\Delta\\AppData\\Local\\Programs\\Ollama\\ollama_runners OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:]"
time=2024-06-29T15:39:59.479+02:00 level=INFO source=images.go:730 msg="total blobs: 5"
time=2024-06-29T15:39:59.479+02:00 level=INFO source=images.go:737 msg="total unused blobs removed: 0"
time=2024-06-29T15:39:59.480+02:00 level=INFO source=routes.go:1106 msg="Listening on 127.0.0.1:11434 (version 0.1.45-alpha-0-g0e42bf5-dirty)"
time=2024-06-29T15:39:59.481+02:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu cpu_avx cpu_avx2 rocm_v5.7]"
time=2024-06-29T15:39:59.514+02:00 level=INFO source=amd_windows.go:64 msg="skipping rocm gfx compatibility check" HSA_OVERRIDE_GFX_VERSION=9.0.12
time=2024-06-29T15:39:59.878+02:00 level=INFO source=types.go:98 msg="inference compute" id=0 library=rocm compute=gfx90c:xnack- driver=0.0 name="AMD Radeon(TM) Graphics" total="21.8 GiB" available="21.6 GiB"
delta-whiplash commented 1 week ago

Never mind, it's not working. When I try to run a model I get this error:

time=2024-06-29T15:47:55.170+02:00 level=ERROR source=sched.go:388 msg="error loading llama server" error="llama runner process has terminated: exit status 0xc0000005 " [GIN] 2024/06/29 - 15:47:55 | 500 | 839.6922ms | 127.0.0.1 | POST "/api/chat"

I have this issue with v0.1.45-alpha and I also have it on v0.1.46-alpha.

likelovewant commented 1 week ago

Try a different model. That error means the model was not loaded correctly, or the memory ran out or the response timed out.

delta-whiplash commented 1 week ago

I did, on phi3 and llama3: same issue.

delta-whiplash commented 1 week ago

Furthermore, my memory reporting is wrong: I have 16 GB for Windows and 16 GB for the VRAM.

likelovewant commented 1 week ago

Can you send me the full log here @delta-whiplash? I think it might be correct, since the VRAM includes the 16 GB of dedicated VRAM plus additional shared iGPU memory. You can see this info in Windows Task Manager.

likelovewant commented 1 week ago

Updated to v0.1.48-alpha, which fixes some bugs...

likelovewant commented 1 week ago

Update: start the server in C:\Users\usrname\AppData\Local\Programs\Ollama\ollama_runners\rocm_v5.7 instead.

likelovewant commented 1 week ago

For gfx90c xnack-, please use the newly built rocblas for gfx90c xnack-.7z; for gfx1012 xnack-, please use the newly built rocblas for gfx1012:xnack- with building guide.7z. @delta-whiplash @network2smb

delta-whiplash commented 1 week ago

I tried it with a fresh install; your fix seems to work for the serve start:

2024/07/01 14:11:14 routes.go:1064: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION:9.0.12 OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE: OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:1 OLLAMA_MAX_QUEUE:512 OLLAMA_MAX_VRAM:0 OLLAMA_MODELS:C:\\Users\\Delta\\.ollama\\models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_RUNNERS_DIR:C:\\Users\\Delta\\AppData\\Local\\Programs\\Ollama\\ollama_runners OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:]"
time=2024-07-01T14:11:14.771+02:00 level=INFO source=images.go:730 msg="total blobs: 0"
time=2024-07-01T14:11:14.771+02:00 level=INFO source=images.go:737 msg="total unused blobs removed: 0"
time=2024-07-01T14:11:14.772+02:00 level=INFO source=routes.go:1111 msg="Listening on 127.0.0.1:11434 (version 0.1.48-alpha-3-g18f2ec5)"
time=2024-07-01T14:11:14.772+02:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu cpu_avx cpu_avx2 rocm_v5.7]"
time=2024-07-01T14:11:14.797+02:00 level=INFO source=amd_windows.go:64 msg="skipping rocm gfx compatibility check" HSA_OVERRIDE_GFX_VERSION=9.0.12
time=2024-07-01T14:11:15.155+02:00 level=INFO source=types.go:98 msg="inference compute" id=0 library=rocm compute=gfx90c:xnack- driver=0.0 name="AMD Radeon(TM) Graphics" total="21.8 GiB" available="21.6 GiB"

When I try to run llama3 or phi3 I have this issue:

PS C:\Users\Delta> ollama run phi3
pulling manifest
pulling b26e6713dc74... 100% β–•β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– 2.4 GB
pulling fa8235e5b48f... 100% β–•β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– 1.1 KB
pulling 542b217f179c... 100% β–•β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–  148 B
pulling 8dde1baf1db0... 100% β–•β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–   78 B
pulling f91db7a2deb9... 100% β–•β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–  485 B
verifying sha256 digest
writing manifest
removing any unused layers
success
Error: llama runner process has terminated: exit status 0xc0000409 error:Could not initialize Tensile library
PS C:\Users\Delta> ollama run llama3
pulling manifest
pulling 6a0746a1ec1a... 100% β–•β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ– 4.7 GB
pulling 4fa551d4f938... 100% β–•β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–  12 KB
pulling 8ab4849b038c... 100% β–•β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–  254 B
pulling 577073ffcc6c... 100% β–•β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–  110 B
pulling 3f8eb4da87fa... 100% β–•β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–  485 B
verifying sha256 digest
writing manifest
removing any unused layers
success
Error: llama runner process has terminated: exit status 0xc0000409 error:Could not initialize Tensile library
PS C:\Users\Delta>

The full log is this one: server.log

delta-whiplash commented 1 week ago

[image] [image]

These are my laptop specs.

likelovewant commented 1 week ago

Try removing the HSA_OVERRIDE_GFX_VERSION=9.0.12 setting from the environment variables, replace the rocblas.dll and library in both the HIP SDK and ollama/.../rocm, and use the version 2 build, rocblas for gfx90c xnack- V2.7z. The server log reports that the issue came from rocblas.dll, a mistake made during the build; it is fixed in gfx90c xnack- V2.
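
A rough PowerShell sketch of that replacement follows. The paths for the extracted V2 archive, the HIP SDK 5.7 bin folder and the Ollama rocm runner folder are all assumptions; check your actual layout, back up the originals first, and use an elevated shell for the Program Files copy.

# Illustrative only: copy the rebuilt rocBLAS files into both locations Ollama can load from.
$fix     = 'C:\Temp\rocblas-gfx90c-xnack-V2'                    # extracted V2 archive (assumed path)
$targets = @('C:\Program Files\AMD\ROCm\5.7\bin',
             "$env:LOCALAPPDATA\Programs\Ollama\ollama_runners\rocm_v5.7")

foreach ($dest in $targets) {
    Copy-Item -Path "$fix\rocblas.dll" -Destination $dest -Force
    # "rocblas\library" as the target subfolder is an assumed layout; adjust to match your install.
    Copy-Item -Path "$fix\library\*"   -Destination "$dest\rocblas\library" -Recurse -Force
}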

delta-whiplash commented 1 week ago

@likelovewant

2024/07/01 17:35:03 routes.go:1064: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE: OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:1 OLLAMA_MAX_QUEUE:512 OLLAMA_MAX_VRAM:0 OLLAMA_MODELS:C:\\Users\\Delta\\.ollama\\models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_RUNNERS_DIR:C:\\Users\\Delta\\AppData\\Local\\Programs\\Ollama\\ollama_runners OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:]"
time=2024-07-01T17:35:03.173+02:00 level=INFO source=images.go:730 msg="total blobs: 0"
time=2024-07-01T17:35:03.174+02:00 level=INFO source=images.go:737 msg="total unused blobs removed: 0"
time=2024-07-01T17:35:03.174+02:00 level=INFO source=routes.go:1111 msg="Listening on 127.0.0.1:11434 (version 0.1.48-alpha-3-g18f2ec5)"
time=2024-07-01T17:35:03.174+02:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu_avx2 rocm_v5.7 cpu cpu_avx]"
time=2024-07-01T17:35:03.206+02:00 level=WARN source=amd_windows.go:98 msg="See https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md for HSA_OVERRIDE_GFX_VERSION usage"
time=2024-07-01T17:35:03.269+02:00 level=INFO source=types.go:98 msg="inference compute" id=0 library=cpu compute="" driver=0.0 name="" total="15.9 GiB" available="13.7 GiB"

That's what I get when I remove HSA_OVERRIDE_GFX_VERSION=9.0.12.

likelovewant commented 1 week ago

Does it stop there, or is it able to load the models? @delta-whiplash

likelovewant commented 1 week ago

Update to the v1.48-alpha2 tag and start Ollama the normal way, e.g. ollama run llama3 instead of ./ollama serve. That should fix the previously unsupported state. Ignore the following warning:

WARN source=amd_windows.go:98 msg="See https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md for HSA_OVERRIDE_GFX_VERSION usage"

It comes from Ollama's iGPU-unsupported debug information. This repo removes the iGPU unsupported limits; the line is left in to serve as optional support info only.

network2smb commented 1 week ago

@likelovewant, I tried what you recommended (set the env var HSA_OVERRIDE_GFX_VERSION=10.1.2) and I can confirm that it works. Used ollama with mistral and llama3.

2024/07/02 12:04:19 routes.go:1060: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION:10.1.2 OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE: OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:1 OLLAMA_MAX_QUEUE:512 OLLAMA_MAX_VRAM:0 OLLAMA_MODELS:C:\\Users\\network2smb\\.ollama\\models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_RUNNERS_DIR:C:\\Users\\network2smb\\AppData\\Local\\Programs\\Ollama\\ollama_runners OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:]"
time=2024-07-02T12:04:19.511-03:00 level=INFO source=images.go:730 msg="total blobs: 7"
time=2024-07-02T12:04:19.515-03:00 level=INFO source=images.go:737 msg="total unused blobs removed: 0"
time=2024-07-02T12:04:19.520-03:00 level=INFO source=routes.go:1106 msg="Listening on 127.0.0.1:11434 (version 0.1.45-alpha-0-g0e42bf5-dirty)"
time=2024-07-02T12:04:19.523-03:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu_avx2 rocm_v5.7 cpu cpu_avx]"
time=2024-07-02T12:04:19.608-03:00 level=INFO source=amd_windows.go:64 msg="skipping rocm gfx compatibility check" HSA_OVERRIDE_GFX_VERSION=10.1.2
time=2024-07-02T12:04:21.586-03:00 level=INFO source=types.go:98 msg="inference compute" id=0 library=rocm compute=gfx1012:xnack- driver=0.0 name="AMD Radeon Pro W5500" total="8.0 GiB" available="7.9 GiB"

Additionally, I had the following error when running ollama run: Error: error reading llm response: read tcp 127.0.0.1:56072->127.0.0.1:56053: wsarecv: An existing connection was forcibly closed by the remote host.

The curious thing is that it seems to happen only the first and second time I run a new model. It happened twice after downloading llama3 and again with mistral. After that, the error disappeared. Anyway, it is now running on the GPU.

Thanks a lot for your help. And thanks @delta-whiplash for opening this thread. I hope you succeed and solve your issue as well

likelovewant commented 1 week ago

That error shows that Ollama is already running while you try to start another serve; check the newly updated wiki. gfx1012:xnack- is supported in the new releases v0.1.48-alpha and alpha2. The HSA_OVERRIDE_GFX_VERSION=10.1.2 setting is no longer needed with the new rocBLAS for gfx1012:xnack- in the new release. @network2smb
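
If the override was set persistently earlier, here is a small PowerShell sketch for clearing it once the new release is installed (assuming it was stored at user scope; adjust the scope if you set it elsewhere):

# Clear the override in the current session.
Remove-Item Env:\HSA_OVERRIDE_GFX_VERSION -ErrorAction SilentlyContinue

# Remove a persisted user-scope value, if one was created earlier via setx or System Properties.
[Environment]::SetEnvironmentVariable('HSA_OVERRIDE_GFX_VERSION', $null, 'User')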