devpramod opened 2 weeks ago
@devpramod can you add a meaningful title to the issue? Do you have host_ip set?
@dcmiddle Yes, all other services work fine with host_ip. For Ollama, both the backend LLM service (i.e. Ollama itself) and the LLM microservice don't work with host_ip; they need to be set to localhost.
Priority
P2-High
OS type
Ubuntu
Hardware type
AI-PC
Installation method
Deploy method
Running nodes
Single Node
What's the version?
25174c0
Description
In GenAIComps, Ollama is tested with localhost:
curl http://localhost:11434/api/generate -d '{ "model": "llama3", "prompt":"Why is the sky blue?" }'
curl http://127.0.0.1:9000/v1/chat/completions -X POST -d '{"model": "llama3", "query":"What is Deep Learning?","max_new_tokens":32,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' -H 'Content-Type: application/json'
Both commands work fine. But in GenAIExamples, host_ip is used instead of localhost, which causes an error (Connection refused).
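A likely cause (my assumption, not yet verified against the aipc compose setup): Ollama binds to 127.0.0.1:11434 by default, so it only accepts connections addressed to loopback. A quick check on the host:

# Assumption: Ollama is running natively on the host with default settings.
# Show which address the Ollama server is listening on:
ss -tlnp | grep 11434
# 127.0.0.1:11434 in the output means loopback only; requests to
# ${host_ip}:11434 on other interfaces are then refused.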
Reproduce steps
cd GenAIExamples/ChatQnA/docker/aipc
docker compose up -d
run ollama
curl http://${host_ip}:9000/v1/chat/completions \
  -X POST \
  -d '{"query":"What is Deep Learning?","max_new_tokens":17,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \
  -H 'Content-Type: application/json'
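As a possible workaround (untested with this compose file; OLLAMA_HOST is Ollama's documented bind-address variable), restart the server listening on all interfaces before bringing up the stack:

# Restart Ollama bound to all interfaces (default is 127.0.0.1:11434):
export OLLAMA_HOST=0.0.0.0
ollama serve
# Sanity check that it is now reachable via the host IP:
curl http://${host_ip}:11434/api/tags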
Raw log
curl: (7) Failed to connect to x.x.x.x port 11434 after 0 ms: Connection refused
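Note: if Ollama was installed with the Linux install script it runs as a systemd service, so the bind address has to be overridden in the unit instead (this follows Ollama's FAQ; the service name assumes the stock ollama.service):

sudo systemctl edit ollama
# Add to the override file:
#   [Service]
#   Environment="OLLAMA_HOST=0.0.0.0"
sudo systemctl daemon-reload
sudo systemctl restart ollama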