delta-whiplash opened 3 weeks ago
Try downloading the OllamaSetup.exe installer from [here](https://github.com/likelovewant/ollama-for-amd/releases) and retry the steps. Your GPU was detected as `gpu_type=gfx90c:xnack-` rather than plain `gfx90c`. According to the discussion linked here, gfx90c covers both `gfx90c:xnack-` and `xnack+`; for some reason, possibly a GPU driver update, your GPU is reporting `xnack-`. I've received feedback that the gfx90c ROCm build can run Ollama on the GPU, so I'm not sure why it doesn't work for you. You could build a new rocBLAS for `gfx90c:xnack-` using the build guide on this repo's wiki: take the gfx90c package, search-and-replace `gfx90c` with `gfx90c:xnack-` throughout the gfx90c folder, and rebuild rocBLAS. Or wait for the next release, which will include a `gfx90c:xnack-` ROCm build.
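The search-and-replace step above can be sketched as follows. This is a minimal, hedged sketch assuming GNU tools (e.g. Git Bash on Windows); the `library/gfx90c` directory and the sample YAML file are stand-ins, not the real rocBLAS Tensile layout, so adjust the path to your actual build tree from the wiki guide.

```shell
# Stand-in for the gfx90c Tensile logic folder (hypothetical layout):
mkdir -p library/gfx90c
printf 'ArchitectureName: gfx90c\n' > library/gfx90c/example.yaml

# Replace every occurrence of "gfx90c" with "gfx90c:xnack-" in all files
# under the folder, as the comment above describes:
find library/gfx90c -type f -exec sed -i 's/gfx90c/gfx90c:xnack-/g' {} +

cat library/gfx90c/example.yaml   # now reads: ArchitectureName: gfx90c:xnack-
```

After the replace, rocBLAS would be rebuilt per the wiki guide so the kernels are registered under the new architecture name.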
Thanks for your answer @likelovewant. I'm using the latest version of all the tools we discussed, and I'm already using ollama-for-amd (sorry, I didn't mention it). I've never built anything before and I'm on Windows 11; how can I build for gfx90c:xnack-?
The rocBLAS support files for gfx90c, along with the build guide, are in the wiki linked in the repo's top bar. Use that guide to build rocBLAS for gfx90c:xnack-, and also Ollama itself. Or wait for the next release within a week; the xnack- feature needs more testing since it's fairly old tech.
The next release will support my APU? So I should just wait a few days?
We can make gfx90c work even though it's not officially supported, and that may also be possible for gfx90c:xnack-; it may just need some tweaks. I'll try to add it in the next release. You can wait for my release or build it yourself; I've put all the guides on the wiki, so anyone should be able to help themselves.
I'll let you try to build it. Thank you, I hope you find a way; if it's not working I'll take a deeper look.
Thanks for your answers and help 🙏
Please download the new release, v0.1.45-alpha, and replace the rocm files with the gfx90c:xnack- and gfx90c builds, testing them one by one. The files were modified by hand or came from a partial build, so I'm not sure they work. Test xnack- with test 1 and test 2, and also without xnack (plain gfx90c, no xnack-). Let me know your results. It took some time to build; it failed many times and I never got a fully successful build, so I manually edited some files and copied files from the half-finished build. I've tested all the other cards with xnack features and they all failed; the xnack features may simply never be supported on Windows. Anyway, test it and let me know the result.
Oh, thank you for your answer. I'll try it tomorrow.
Hello, I have a very similar problem. When I start ollama (the patched version from likelovewant/ollama-for-amd) it runs using the CPU only and I see the following in the logs:
time=2024-06-26T16:07:40.209-03:00 level=WARN source=amd_windows.go:96 msg="amdgpu is not supported" gpu=0 gpu_type=gfx1012:xnack- library="C:\\Program Files\\AMD\\ROCm\\5.7\\bin" supported_types="[gfx1010 gfx1011 gfx1012 gfx1030 gfx1031 gfx1032 gfx1033 gfx1034 gfx1035 gfx1036 gfx1100 gfx1101 gfx1102 gfx1103]"
The GPU was supposed to be "gfx1012", but it's reporting with this "xnack-" suffix. I read the instructions for building the libs with support for additional GPUs, but as a complete beginner I found them hard to understand.
I see that you tried to help with gfx90c, so maybe you could generate one of these half-built ROCm libs for gfx1012:xnack- and I could try it as well. I understand you may be busy, and I can of course wait for the next version. I'm running the llama3 model.
Thanks for your attention
These half-build tweaks may not work, since some information is missing, and I haven't received any feedback yet.
However, there is another approach you can try. Based on the discussion here, older drivers did not add the xnack- tail, so you could try a driver released before May 2021, or tweak a newer driver: edit the driver setup files for navi14 (gfx1012) to remove the "xnack-" keyword. These files can be found in C:\AMD; search for the keywords navi14 and xnack-, edit them, and install the driver from there directly. Alternatively, edit the files under C:\Windows\System32\drivers or the driver folders with other names (riskier; this may crash your system, so back up your system and files before you try). After that, reboot and rerun the program. Or do nothing: I will add gfx1012:xnack- in the next Ollama release, and you can try the older rocBLAS for gfx1012 with the new release within a few hours.
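The "search and strip the xnack- suffix" step could be sketched like this. It's a hedged demo using a local stand-in directory (the real files live under C:\AMD per the comment above; the file name and contents here are hypothetical), so it is runnable without touching any driver files.

```shell
# Stand-in for the extracted driver setup folder (real location: C:\AMD):
mkdir -p amd_demo
printf 'TargetId = gfx1012:xnack-\nAsic = navi14\n' > amd_demo/navi14_setup.txt

# Find files that mention "xnack-", then strip the ":xnack-" suffix in place:
grep -rl 'xnack-' amd_demo | xargs sed -i 's/:xnack-//g'

cat amd_demo/navi14_setup.txt
```

On the real driver package you would back everything up first, run the equivalent edit on the navi14 files, and then reinstall the driver from that folder.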
I've received feedback from another forum: gfx90c:xnack- is able to run with the new release, using the plain gfx90c rocBLAS and library rather than the tweaked gfx90c:xnack- rocBLAS. The only extra step is to set HSA_OVERRIDE_GFX_VERSION=9.0.12 as an environment variable, as tested in this example. Start the server from the default Ollama program directory (i.e. C:\Users\UserName\AppData\Local\Programs\Ollama):
./ollama serve
Finally, in a separate shell, run a model:
./ollama run llama3
It runs on the GPU, but the running process doesn't show up in Windows Task Manager; you can check your GPU usage in AMD's Adrenalin driver software.
The same should apply to gfx1012:xnack-: simply set HSA_OVERRIDE_GFX_VERSION=10.1.2 as an environment variable, replace the gfx1012 rocBLAS and library with the ones available on this repo, and test it.
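The override steps above can be sketched as follows (bash syntax, e.g. from Git Bash; in PowerShell the equivalent assignment would be `$env:HSA_OVERRIDE_GFX_VERSION = "9.0.12"`). The ollama commands are left as comments since they depend on having the release installed.

```shell
# Set the override for the current shell session. 9.0.12 maps gfx90c:xnack-
# onto the plain gfx90c ROCm libraries; gfx1012:xnack- would use 10.1.2.
export HSA_OVERRIDE_GFX_VERSION=9.0.12
echo "HSA_OVERRIDE_GFX_VERSION=$HSA_OVERRIDE_GFX_VERSION"

# Then, from the Ollama install directory:
#   ./ollama serve
# and in a separate shell:
#   ./ollama run llama3
```

Setting the variable in the Windows system environment settings instead would make it apply to every session, not just the current shell.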
@delta-whiplash @network2smb
Sorry for my late reply, I was busy.
For test 1 I get these logs:
2024/06/29 15:26:17 routes.go:1060: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE: OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:1 OLLAMA_MAX_QUEUE:512 OLLAMA_MAX_VRAM:0 OLLAMA_MODELS:C:\\Users\\Delta\\.ollama\\models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_RUNNERS_DIR:C:\\Users\\Delta\\AppData\\Local\\Programs\\Ollama\\ollama_runners OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:]"
time=2024-06-29T15:26:17.289+02:00 level=INFO source=images.go:730 msg="total blobs: 0"
time=2024-06-29T15:26:17.289+02:00 level=INFO source=images.go:737 msg="total unused blobs removed: 0"
time=2024-06-29T15:26:17.290+02:00 level=INFO source=routes.go:1106 msg="Listening on 127.0.0.1:11434 (version 0.1.45-alpha-0-g0e42bf5-dirty)"
time=2024-06-29T15:26:17.290+02:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu cpu_avx cpu_avx2 rocm_v5.7]"
time=2024-06-29T15:26:17.322+02:00 level=WARN source=amd_windows.go:96 msg="amdgpu is not supported" gpu=0 gpu_type=gfx90c:xnack- library="C:\\Program Files\\AMD\\ROCm\\5.7\\bin" supported_types="[gfx1030 gfx1100 gfx1101 gfx1102 gfx906 gfx90c]"
time=2024-06-29T15:26:17.322+02:00 level=WARN source=amd_windows.go:98 msg="See https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md for HSA_OVERRIDE_GFX_VERSION usage"
time=2024-06-29T15:26:17.388+02:00 level=INFO source=types.go:98 msg="inference compute" id=0 library=cpu compute="" driver=0.0 name="" total="15.9 GiB" available="9.8 GiB"
For test 2 I get this:
2024/06/29 15:27:46 routes.go:1060: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE: OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:1 OLLAMA_MAX_QUEUE:512 OLLAMA_MAX_VRAM:0 OLLAMA_MODELS:C:\\Users\\Delta\\.ollama\\models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_RUNNERS_DIR:C:\\Users\\Delta\\AppData\\Local\\Programs\\Ollama\\ollama_runners OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:]"
time=2024-06-29T15:27:46.117+02:00 level=INFO source=images.go:730 msg="total blobs: 0"
time=2024-06-29T15:27:46.117+02:00 level=INFO source=images.go:737 msg="total unused blobs removed: 0"
time=2024-06-29T15:27:46.118+02:00 level=INFO source=routes.go:1106 msg="Listening on 127.0.0.1:11434 (version 0.1.45-alpha-0-g0e42bf5-dirty)"
time=2024-06-29T15:27:46.118+02:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu cpu_avx cpu_avx2 rocm_v5.7]"
time=2024-06-29T15:27:46.142+02:00 level=WARN source=amd_windows.go:96 msg="amdgpu is not supported" gpu=0 gpu_type=gfx90c:xnack- library="C:\\Program Files\\AMD\\ROCm\\5.7\\bin" supported_types="[gfx1030 gfx1100 gfx1101 gfx1102 gfx906 gfx90c-xnack- gfx90c]"
time=2024-06-29T15:27:46.142+02:00 level=WARN source=amd_windows.go:98 msg="See https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md for HSA_OVERRIDE_GFX_VERSION usage"
time=2024-06-29T15:27:46.220+02:00 level=INFO source=types.go:98 msg="inference compute" id=0 library=cpu compute="" driver=0.0 name="" total="15.9 GiB" available="9.8 GiB"
And I'm facing the same issue with plain gfx90c:
2024/06/29 15:29:15 routes.go:1060: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE: OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:1 OLLAMA_MAX_QUEUE:512 OLLAMA_MAX_VRAM:0 OLLAMA_MODELS:C:\\Users\\Delta\\.ollama\\models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_RUNNERS_DIR:C:\\Users\\Delta\\AppData\\Local\\Programs\\Ollama\\ollama_runners OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:]"
time=2024-06-29T15:29:15.456+02:00 level=INFO source=images.go:730 msg="total blobs: 0"
time=2024-06-29T15:29:15.457+02:00 level=INFO source=images.go:737 msg="total unused blobs removed: 0"
time=2024-06-29T15:29:15.457+02:00 level=INFO source=routes.go:1106 msg="Listening on 127.0.0.1:11434 (version 0.1.45-alpha-0-g0e42bf5-dirty)"
time=2024-06-29T15:29:15.458+02:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu cpu_avx cpu_avx2 rocm_v5.7]"
time=2024-06-29T15:29:15.489+02:00 level=WARN source=amd_windows.go:96 msg="amdgpu is not supported" gpu=0 gpu_type=gfx90c:xnack- library="C:\\Program Files\\AMD\\ROCm\\5.7\\bin" supported_types="[gfx1030 gfx1100 gfx1101 gfx1102 gfx906 gfx90c-xnack- gfx90c]"
time=2024-06-29T15:29:15.489+02:00 level=WARN source=amd_windows.go:98 msg="See https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md for HSA_OVERRIDE_GFX_VERSION usage"
time=2024-06-29T15:29:15.546+02:00 level=INFO source=types.go:98 msg="inference compute" id=0 library=cpu compute="" driver=0.0 name="" total="15.9 GiB" available="9.9 GiB"
When I try this, it seems to work at the serve stage:
PS C:\Users\Delta\AppData\Local\Programs\Ollama> .\ollama.exe serve
2024/06/29 15:39:59 routes.go:1060: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION:9.0.12 OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE: OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:1 OLLAMA_MAX_QUEUE:512 OLLAMA_MAX_VRAM:0 OLLAMA_MODELS:C:\\Users\\Delta\\.ollama\\models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_RUNNERS_DIR:C:\\Users\\Delta\\AppData\\Local\\Programs\\Ollama\\ollama_runners OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:]"
time=2024-06-29T15:39:59.479+02:00 level=INFO source=images.go:730 msg="total blobs: 5"
time=2024-06-29T15:39:59.479+02:00 level=INFO source=images.go:737 msg="total unused blobs removed: 0"
time=2024-06-29T15:39:59.480+02:00 level=INFO source=routes.go:1106 msg="Listening on 127.0.0.1:11434 (version 0.1.45-alpha-0-g0e42bf5-dirty)"
time=2024-06-29T15:39:59.481+02:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu cpu_avx cpu_avx2 rocm_v5.7]"
time=2024-06-29T15:39:59.514+02:00 level=INFO source=amd_windows.go:64 msg="skipping rocm gfx compatibility check" HSA_OVERRIDE_GFX_VERSION=9.0.12
time=2024-06-29T15:39:59.878+02:00 level=INFO source=types.go:98 msg="inference compute" id=0 library=rocm compute=gfx90c:xnack- driver=0.0 name="AMD Radeon(TM) Graphics" total="21.8 GiB" available="21.6 GiB"
Never mind, it's not working. When I try to run a model I get this error:
time=2024-06-29T15:47:55.170+02:00 level=ERROR source=sched.go:388 msg="error loading llama server" error="llama runner process has terminated: exit status 0xc0000005 " [GIN] 2024/06/29 - 15:47:55 | 500 | 839.6922ms | 127.0.0.1 | POST "/api/chat"
I have this issue with v0.1.45-alpha, and I also have it on v0.1.46-alpha.
Try a different model. That error means the model wasn't loaded correctly, or there wasn't enough memory, or the response timed out.
Fixed some bugs in the new release.
I tried phi3 and llama3, same issue.
Furthermore, my memory reporting is wrong: I have 16 GB for Windows and 16 GB for the VRAM.
Can you send me the full log here @delta-whiplash? I think the reporting might be correct, since the VRAM total includes the 16 GB of dedicated VRAM plus the shared iGPU memory. You can see this info in Windows Task Manager.
Update to v0.1.48-alpha; it fixes some bugs.
Update: start the server in C:\Users\usrname\AppData\Local\Programs\Ollama\ollama_runners\rocm_v5.7.
For gfx90c xnack-, please use the new build: rocblas for gfx90c xnack-.7z. For gfx1012 xnack-, please use the new build: rocblas for gfx1012:xnack- with building guide.7z. @delta-whiplash @network2smb
I tried it with a fresh install. Your fix seems to work for the serve start:
2024/07/01 14:11:14 routes.go:1064: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION:9.0.12 OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE: OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:1 OLLAMA_MAX_QUEUE:512 OLLAMA_MAX_VRAM:0 OLLAMA_MODELS:C:\\Users\\Delta\\.ollama\\models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_RUNNERS_DIR:C:\\Users\\Delta\\AppData\\Local\\Programs\\Ollama\\ollama_runners OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:]"
time=2024-07-01T14:11:14.771+02:00 level=INFO source=images.go:730 msg="total blobs: 0"
time=2024-07-01T14:11:14.771+02:00 level=INFO source=images.go:737 msg="total unused blobs removed: 0"
time=2024-07-01T14:11:14.772+02:00 level=INFO source=routes.go:1111 msg="Listening on 127.0.0.1:11434 (version 0.1.48-alpha-3-g18f2ec5)"
time=2024-07-01T14:11:14.772+02:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu cpu_avx cpu_avx2 rocm_v5.7]"
time=2024-07-01T14:11:14.797+02:00 level=INFO source=amd_windows.go:64 msg="skipping rocm gfx compatibility check" HSA_OVERRIDE_GFX_VERSION=9.0.12
time=2024-07-01T14:11:15.155+02:00 level=INFO source=types.go:98 msg="inference compute" id=0 library=rocm compute=gfx90c:xnack- driver=0.0 name="AMD Radeon(TM) Graphics" total="21.8 GiB" available="21.6 GiB"
When I try to run llama3 or phi3 I get this issue:
PS C:\Users\Delta> ollama run phi3
pulling manifest
pulling b26e6713dc74... 100% ██████████████████████████████████████████████████████████ 2.4 GB
pulling fa8235e5b48f... 100% ██████████████████████████████████████████████████████████ 1.1 KB
pulling 542b217f179c... 100% ██████████████████████████████████████████████████████████ 148 B
pulling 8dde1baf1db0... 100% ██████████████████████████████████████████████████████████ 78 B
pulling f91db7a2deb9... 100% ██████████████████████████████████████████████████████████ 485 B
verifying sha256 digest
writing manifest
removing any unused layers
success
Error: llama runner process has terminated: exit status 0xc0000409 error:Could not initialize Tensile library
PS C:\Users\Delta> ollama run llama3
pulling manifest
pulling 6a0746a1ec1a... 100% ██████████████████████████████████████████████████████████ 4.7 GB
pulling 4fa551d4f938... 100% ██████████████████████████████████████████████████████████ 12 KB
pulling 8ab4849b038c... 100% ██████████████████████████████████████████████████████████ 254 B
pulling 577073ffcc6c... 100% ██████████████████████████████████████████████████████████ 110 B
pulling 3f8eb4da87fa... 100% ██████████████████████████████████████████████████████████ 485 B
verifying sha256 digest
writing manifest
removing any unused layers
success
Error: llama runner process has terminated: exit status 0xc0000409 error:Could not initialize Tensile library
PS C:\Users\Delta>
The full logs are in the attached server.log.
These are my laptop specs:
Try removing the HSA_OVERRIDE_GFX_VERSION=9.0.12 environment variable, and also replace rocblas.dll and the library folder in both the HIP SDK and ollama/.../rocm, using the version 2 build: rocblas for gfx90c xnack- V2.7z. The server log shows the issue comes from rocblas.dll, a mistake made during the build; it's fixed in gfx90c xnack- V2.
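The replacement step above means the same rocblas.dll and library folder go into both locations. A hedged sketch, using local stand-in directories so it is runnable anywhere (the real targets are assumed to be C:\Program Files\AMD\ROCm\5.7\bin for the HIP SDK and %LOCALAPPDATA%\Programs\Ollama\ollama_runners\rocm_v5.7 for Ollama; the file names here are illustrative):

```shell
# Stand-ins: "src" is the unpacked V2 archive, the other two are the targets.
mkdir -p src/library hip_sdk_bin ollama_rocm
printf 'new dll' > src/rocblas.dll
printf 'kernels' > src/library/TensileLibrary.dat

# Copy the dll and replace the library folder in BOTH locations:
for DST in hip_sdk_bin ollama_rocm; do
  cp src/rocblas.dll "$DST/"
  rm -rf "$DST/library"
  cp -r src/library "$DST/library"
done
```

Backing up the original rocblas.dll and library folder before overwriting them would make it easy to roll back if the new build misbehaves.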
@likelovewant
2024/07/01 17:35:03 routes.go:1064: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE: OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:1 OLLAMA_MAX_QUEUE:512 OLLAMA_MAX_VRAM:0 OLLAMA_MODELS:C:\\Users\\Delta\\.ollama\\models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_RUNNERS_DIR:C:\\Users\\Delta\\AppData\\Local\\Programs\\Ollama\\ollama_runners OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:]"
time=2024-07-01T17:35:03.173+02:00 level=INFO source=images.go:730 msg="total blobs: 0"
time=2024-07-01T17:35:03.174+02:00 level=INFO source=images.go:737 msg="total unused blobs removed: 0"
time=2024-07-01T17:35:03.174+02:00 level=INFO source=routes.go:1111 msg="Listening on 127.0.0.1:11434 (version 0.1.48-alpha-3-g18f2ec5)"
time=2024-07-01T17:35:03.174+02:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu_avx2 rocm_v5.7 cpu cpu_avx]"
time=2024-07-01T17:35:03.206+02:00 level=WARN source=amd_windows.go:98 msg="See https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md for HSA_OVERRIDE_GFX_VERSION usage"
time=2024-07-01T17:35:03.269+02:00 level=INFO source=types.go:98 msg="inference compute" id=0 library=cpu compute="" driver=0.0 name="" total="15.9 GiB" available="13.7 GiB"
That's what I get when I remove HSA_OVERRIDE_GFX_VERSION=9.0.12.
Does it stop there, or is it able to load the models? @delta-whiplash
Update to the v0.1.48-alpha2 tag and start Ollama the normal way, e.g. `ollama run llama3`, instead of `./ollama serve`. That should fix the previously unsupported state. You can ignore the
WARN source=amd_windows.go:98 msg="See https://github.com/ollama/ollama/blob/main/docs/troubleshooting.md for HSA_OVERRIDE_GFX_VERSION usage"
line. It comes from upstream Ollama's iGPU-unsupported debug path; this repo removes the iGPU support limits, so that line only serves as optional support information.
Have received feedback from another forum: gfx90c:xnack- is able to run with the new release using the stock gfx90c rocblas and library, rather than a rocblas tweaked for gfx90c:xnack-. The only extra step is setting HSA_OVERRIDE_GFX_VERSION=9.0.12 as an environment variable. As tested there, start the server from the Ollama default program directory (i.e. C:\Users\UserName\AppData\Local\Programs\Ollama):
./ollama serve
Finally, in a separate shell, run a model:
./ollama run llama3
It's able to run on the GPU, but the running process may not show up in Windows Task Manager; you can check your GPU usage in AMD's Adrenalin driver software.
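The steps above can be sketched as a single session (a hedged sketch: the path assumes the default install location with `UserName` as a placeholder, and bash-style syntax as used from Git Bash; adapt `export` to `$env:` for PowerShell):

```shell
# Make ROCm treat gfx90c:xnack- as gfx90c (applies to this shell session only)
export HSA_OVERRIDE_GFX_VERSION=9.0.12

# Start the server from the default Ollama program directory
cd "/c/Users/UserName/AppData/Local/Programs/Ollama"
./ollama serve

# Then, in a second shell, run a model:
# ./ollama run llama3
```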
The same should apply to gfx1012:xnack-: simply set HSA_OVERRIDE_GFX_VERSION=10.1.2 as an environment variable, replace the gfx1012 rocblas and library with the ones available in this repo, and test it. @delta-whiplash @network2smb
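If you'd rather not set the variable in every shell, it can be made persistent on Windows with the built-in `setx` command (a minimal sketch; note `setx` only affects newly opened shells, not the current one):

```shell
setx HSA_OVERRIDE_GFX_VERSION 10.1.2
```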
@likelovewant, I tried what you recommended (set the env var HSA_OVERRIDE_GFX_VERSION=10.1.2) and I can confirm that it works. Used ollama with mistral and llama3.
2024/07/02 12:04:19 routes.go:1060: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION:10.1.2 OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE: OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:1 OLLAMA_MAX_QUEUE:512 OLLAMA_MAX_VRAM:0 OLLAMA_MODELS:C:\\Users\\network2smb\\.ollama\\models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://*] OLLAMA_RUNNERS_DIR:C:\\Users\\network2smb\\AppData\\Local\\Programs\\Ollama\\ollama_runners OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:]"
time=2024-07-02T12:04:19.511-03:00 level=INFO source=images.go:730 msg="total blobs: 7"
time=2024-07-02T12:04:19.515-03:00 level=INFO source=images.go:737 msg="total unused blobs removed: 0"
time=2024-07-02T12:04:19.520-03:00 level=INFO source=routes.go:1106 msg="Listening on 127.0.0.1:11434 (version 0.1.45-alpha-0-g0e42bf5-dirty)"
time=2024-07-02T12:04:19.523-03:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu_avx2 rocm_v5.7 cpu cpu_avx]"
time=2024-07-02T12:04:19.608-03:00 level=INFO source=amd_windows.go:64 msg="skipping rocm gfx compatibility check" HSA_OVERRIDE_GFX_VERSION=10.1.2
time=2024-07-02T12:04:21.586-03:00 level=INFO source=types.go:98 msg="inference compute" id=0 library=rocm compute=gfx1012:xnack- driver=0.0 name="AMD Radeon Pro W5500" total="8.0 GiB" available="7.9 GiB"
Additionally, I had the following error when running ollama run:
Error: error reading llm response: read tcp 127.0.0.1:56072->127.0.0.1:56053: wsarecv: An existing connection was forcibly closed by the remote host.
The curious thing is that it seems to happen only the first and second time I run a new model. It happened twice after downloading llama3, and again with mistral. After that, the error disappeared. Anyway, it is now running on the GPU.
Thanks a lot for your help. And thanks @delta-whiplash for opening this thread. I hope you succeed and solve your issue as well
That error shows Ollama was already running while you tried to start another serve; check the newly updated wiki. gfx1012:xnack- is supported in the new releases v0.1.48-alpha and alpha2, and setting HSA_OVERRIDE_GFX_VERSION=10.1.2 is no longer needed with the new rocblas for gfx1012:xnack- in those releases. @network2smb
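Before starting a second `ollama serve`, you can check whether something is already listening on the default port. A minimal sketch, assuming the default 127.0.0.1:11434 address and a bash shell (it uses bash's built-in `/dev/tcp` redirection, so no extra tools are needed):

```shell
PORT=11434
# Try to open a TCP connection to the port; success means a server is already up
if (exec 3<>"/dev/tcp/127.0.0.1/$PORT") 2>/dev/null; then
  echo "port $PORT in use: an Ollama server (or something else) is already running"
else
  echo "port $PORT free: safe to start ollama serve"
fi
```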
Hello, I tried to run Ollama on Windows with the patched ROCm provided in this repo, but when I try to start Ollama I get this warning:
I have a laptop with 32 GB of RAM, with 16 GB allocated to the system and 16 GB to the iGPU through this other project.
Can someone help me use my iGPU? I am trying to run inference on phi3:medium-128k.