Closed Joly0 closed 4 weeks ago
How long have you waited? The loading bar indicates that a request is still in progress. Ollama doesn't spin up any fans or stress any PC components. Have you tried waiting for a bit? After some time, the response should be displayed.
I have waited a few minutes, but nothing happens, just the loading bar. I know Ollama doesn't spin up any fans, but I can see my GPU's memory usage increasing when I use Ollama directly or open-webui connected to Ollama. That's not the case with your app: nothing happens, and monitoring my GPU shows no activity at all. How long should I wait, and why would it take so long? If I use open-webui, I get an answer after 30 seconds max. The model needs to be loaded into VRAM and the answer generated, but that doesn't take minutes, at least not with any other tool.
Ok. Are you certain that the address is reachable? Can you copy the address you see in the change host dialog, paste it into your browser and wait until it loads? If everything is reachable, the page should return "Ollama is running"; in that case it's an issue with the app. Can you confirm?
Yes, the address is correct. I can even see the models I have downloaded, including new ones I just pulled through Ollama. So the app is correctly fetching the models from the Ollama API; just the chat is not working.
The issue couldn't be reproduced on my end. The chat is working fine, so it has to be an issue with your setup. Are you using any proxy or tunneling software?
Nope, no proxy or tunneling software. The Ollama server is running on a server in my home network, the same network my smartphone is connected to.
If you have anything I could test or try out, please tell me.
I can only tell you to wait. In my tests, the chat worked just fine. Currently, the app doesn't support streaming (that will change soon). The request takes a while (with a bad internet connection and/or a slow server, even longer).
Ok, but neither my internet nor my server is slow. But fine, I will wait then. Can you tell me how long it usually takes for you to generate an answer?
Depends on what the model wants to do. You could try asking something like "name your favorite word and just that". When you haven't used the model yet, that alone takes around five seconds. After that, it should be quicker.
Ok, so I opened the app, selected llama3 as the model, entered your example prompt and pressed the send button. I have waited for 5 minutes now and haven't gotten anything. There is no activity in the app other than the loading bar, and no activity in either Ollama or the host that is running it.
I do notice something weird in Ollama, though. Every time I send a request to Ollama from your app, I get this in the logs:
[GIN] 2024/05/30 - 18:39:34 | 200 | 1.645748ms | 192.168.178.48 | GET "/api/tags"
[GIN] 2024/05/30 - 18:39:43 | 404 | 21.071506ms | 192.168.178.48 | POST "/api/chat"
[GIN] 2024/05/30 - 20:16:58 | 200 | 2.006174ms | 172.17.0.1 | GET "/api/tags"
[GIN] 2024/05/30 - 21:01:15 | 404 | 453.108µs | 192.168.178.48 | POST "/api/chat"
where 192.168.178.48 is the IP of the smartphone I am using. As you can see, the call to /api/tags works with a 200 return code, while the call to /api/chat returns a 404 Not Found for whatever reason.
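The two log lines above can be reproduced outside the app with a small script. This is a minimal sketch, assuming a hypothetical server address (replace OLLAMA_URL with your own host) and using only Ollama's documented /api/chat request shape; it returns the HTTP status code so a 404 like the one in the logs is visible directly.

```python
import json
import urllib.request
from urllib.error import HTTPError

# Hypothetical host address -- replace with your own Ollama server.
OLLAMA_URL = "http://192.168.178.2:11434"

def build_chat_body(model: str, prompt: str) -> bytes:
    """JSON body for POST /api/chat; the model field keeps its full tag."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }).encode("utf-8")

def post_chat(model: str, prompt: str) -> int:
    """Send the chat request and return the HTTP status code."""
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/chat",
        data=build_chat_body(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status  # 200 on success
    except HTTPError as err:
        return err.code  # e.g. 404 when the model reference is unknown
```

Calling post_chat("llama3:8b", "hi") against a reachable server would show whether the 404 depends on the model reference the client sends.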
And this is what I get when prompting from the open-webui container:
[GIN] 2024/05/30 - 21:12:05 | 200 | 59.063176712s | 172.17.0.1 | POST "/api/chat"
with the same model
Could you run ollama list and send me the output? I might have a theory.
And you're trying to run which model?
I have tried llama3:8b and zephyr so far; I could try others as well.
I know what the issue is. Past me thought the part after the colon was unnecessary, so I removed it. Could you check whether the following build fixes the issue: https://drive.google.com/file/d/1A22NPsxNFvFxsPYOGlPk-9M2wDfJYXvl/view (or just try a model tagged "latest")
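The bug described above can be sketched in a few lines. This is a hypothetical reconstruction (the app's actual language and function names are unknown): stripping everything after the colon turns "llama3:8b" into "llama3", which Ollama resolves to "llama3:latest" and answers with 404 when only the 8b tag has been pulled.

```python
def resolve_model_name(name: str) -> str:
    """Return the model reference to send to /api/chat.

    Buggy version (dropped the tag, causing the 404):
        return name.split(":", 1)[0]

    Fixed: keep the full tag; a bare name is what Ollama
    itself treats as "<name>:latest".
    """
    return name if ":" in name else f"{name}:latest"
```

With the fix, resolve_model_name("llama3:8b") sends the exact tag the user pulled, matching what ollama list reports.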
Yep, I see instant activity on my Ollama server, the GPU is being loaded, and after about 15 seconds I get the answer in your app.
Thank you for trying it. It's a simple fix, and I'll include it in the next release (in a few days).
Hey, I installed the app yesterday and added my Ollama host, which is running on my local server, so I entered ip:port. The app then proceeded: I was able to see the models I have available, select them, and start to chat. But when I chat, nothing happens. I see a loading circle, but nothing ever comes up. In the Ollama container I can see the API request, but nothing more; the GPU doesn't start and no reply comes back.