Open kiran-chinthala opened 5 months ago
So you did ollama serve and then tried to run python3 devika.py or reverse?
Yes, I did the same: first `ollama serve`, and then I started the Devika server.
Does "ollama run llama2" work on your machine?
Yes, it does. Below are the command and its output:
$ ollama run llama2
tell me a joke
Sure, here's one:
Why don't scientists trust atoms? Because they make up everything!
I hope that brought a smile to your face!
Send a message (/? for help)
24.04.02 18:34:11: root: INFO : Initializing Devika...
24.04.02 18:34:11: root: INFO : Initializing Prerequisites Jobs...
24.04.02 18:34:18: root: INFO : Loading sentence-transformer BERT models...
24.04.02 18:34:23: root: INFO : BERT model loaded successfully.
24.04.02 18:34:26: root: WARNING: Ollama not available
24.04.02 18:34:26: root: WARNING: run ollama server to use ollama models otherwise use other models
24.04.02 18:34:28: root: INFO : Devika is up and running!
docker exec -it devika-devika-backend-engine-1 bash
nonroot@60c02bb85332:~$ curl -f http://ollama:11434
Ollama is running
There was a change to src/llm/ollama_client.py in commit 7cd567b. Could it be that change? I am unsure whether it was working previously, though.
I think that Ollama with Docker will only work if you run it using the docker-compose file.
You have to update the Ollama URL if it's not the default one.
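One quick way to check whether the configured URL is actually the problem is to probe it directly from the same host or container that runs Devika. This is a stdlib-only sketch; the URLs in the comments are examples (http://127.0.0.1:11434 for a native install, http://ollama:11434 for the docker-compose service name), not Devika's actual config values.

```python
from urllib.request import urlopen
from urllib.error import URLError

def ollama_reachable(base_url: str, timeout: float = 2.0) -> bool:
    """True if something answers HTTP 200 at base_url.

    Ollama's root path replies with the plain-text "Ollama is running",
    which is exactly what the curl check in this thread relies on.
    """
    try:
        with urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (URLError, OSError):
        return False

# Example probes (adjust to your deployment):
# ollama_reachable("http://127.0.0.1:11434")  # native install
# ollama_reachable("http://ollama:11434")     # docker-compose service name
```

If this returns False from the Devika side while `curl` works elsewhere, the URL Devika is using is the thing to fix.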
python3 devika.py
24.04.01 18:39:41: root: INFO : Initializing Devika...
24.04.01 18:39:41: root: INFO : Initializing Prerequisites Jobs...
24.04.01 18:39:41: root: INFO : Loading sentence-transformer BERT models...
24.04.01 18:39:41: root: INFO : BERT model loaded successfully.
24.04.01 18:39:41: root: WARNING: Ollama not available
24.04.01 18:39:41: root: WARNING: run ollama server to use ollama models otherwise use other models
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
	- Avoid using tokenizers before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
	- Avoid using tokenizers before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
24.04.01 18:39:41: root: INFO : Devika is up and running!
24.04.01 18:39:41: root: INFO : /api/data GET
24.04.01 18:39:41: root: DEBUG : /api/data GET - Response: {"models":{"CLAUDE":[["Claude 3 Opus","claude-3-opus-20240229"],["Claude 3 Sonnet","claude-3-sonnet-20240229"],["Claude 3 Haiku","claude-3-haiku-20240307"]],"GOOGLE":[["Gemini 1.0 Pro","gemini-pro"]],"GROQ":[["GROQ Mixtral","mixtral-8x7b-32768"],["GROQ LLAMA2 70B","llama2-70b-4096"],["GROQ GEMMA 7B IT","gemma-7b-it"]],"MISTRAL":[["Mistral 7b","open-mistral-7b"],["Mistral 8x7b","open-mixtral-8x7b"],["Mistral Medium","mistral-medium-latest"],["Mistral Small","mistral-small-latest"],["Mistral Large","mistral-large-latest"]],"OLLAMA":[["mistral","mistral:latest"],["llama2","llama2:latest"]],"OPENAI":[["GPT-4 Turbo","gpt-4-0125-preview"],["GPT-3.5","gpt-3.5-turbo-0125"]]},"projects":[],"search_engines":["Bing","Google","DuckDuckGo"]}
24.04.01 18:39:41: root: INFO : /api/create-project POST
24.04.01 18:39:41: root: DEBUG : /api/create-project POST - Response: {"message":"Project created"}
24.04.01 18:39:41: root: INFO : /api/get-agent-state POST
24.04.01 18:39:41: root: DEBUG : /api/get-agent-state POST - Response: {"state":null}
24.04.01 18:39:49: root: INFO : /api/calculate-tokens POST
24.04.01 18:39:49: root: DEBUG : /api/calculate-tokens POST - Response: {"token_usage":1}
24.04.01 18:39:49: root: INFO : /api/calculate-tokens POST
24.04.01 18:39:49: root: DEBUG : /api/calculate-tokens POST - Response: {"token_usage":1}
24.04.01 18:39:50: root: INFO : /api/calculate-tokens POST
24.04.01 18:39:50: root: DEBUG : /api/calculate-tokens POST - Response: {"token_usage":1}
24.04.01 18:39:50: root: INFO : /api/calculate-tokens POST
24.04.01 18:39:50: root: DEBUG : /api/calculate-tokens POST - Response: {"token_usage":2}
24.04.01 18:39:50: root: INFO : /api/calculate-tokens POST
24.04.01 18:39:50: root: DEBUG : /api/calculate-tokens POST - Response: {"token_usage":2}
24.04.01 18:39:50: root: INFO : /api/calculate-tokens POST
24.04.01 18:39:50: root: DEBUG : /api/calculate-tokens POST - Response: {"token_usage":3}
Exception in thread Thread-3 ():
Traceback (most recent call last):
  File "/Users/kiran/miniconda3/lib/python3.11/threading.py", line 1038, in _bootstrap_inner
    self.run()
  File "/Users/kiran/miniconda3/lib/python3.11/threading.py", line 975, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/kiran/Documents/GitHub/devika/devika.py", line 94, in
    thread = Thread(target=lambda: agent.execute(message, project_name, search_engine))
  File "/Users/kiran/Documents/GitHub/devika/src/agents/agent.py", line 263, in execute
    plan = self.planner.execute(prompt, project_name_from_user)
  File "/Users/kiran/Documents/GitHub/devika/src/agents/planner/planner.py", line 70, in execute
    response = self.llm.inference(prompt, project_name)
  File "/Users/kiran/Documents/GitHub/devika/src/llm/llm.py", line 98, in inference
    response = model.inference(self.model_id, prompt).strip()
  File "/Users/kiran/Documents/GitHub/devika/src/llm/ollama_client.py", line 20, in inference
    response = self.client.generate(
AttributeError: 'NoneType' object has no attribute 'generate'
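The AttributeError in that traceback is consistent with the earlier "Ollama not available" warning: the startup probe failed, the underlying client was never constructed and stayed None, and inference then called .generate on None. A minimal sketch of that failure mode, using hypothetical names rather than Devika's exact code, with a guard that would surface the real cause:

```python
class OllamaClientSketch:
    """Hedged sketch of the failure mode (hypothetical names, not Devika's code).

    If the server probe at startup fails, `client` stays None, and a later
    `.generate(...)` call raises the AttributeError seen in the traceback.
    """

    def __init__(self, client=None):
        # The real client would only be constructed when the probe succeeds.
        self.client = client

    def inference(self, model_id: str, prompt: str) -> str:
        if self.client is None:
            # Clearer than: AttributeError: 'NoneType' object has no attribute 'generate'
            raise RuntimeError(
                "Ollama client not initialized; is the server running at the configured URL?"
            )
        return self.client.generate(model=model_id, prompt=prompt)["response"]
```

With a guard like this, the thread would fail with an actionable message instead of an opaque NoneType error.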
I have updated /Users/kiran/Documents/GitHub/devika/src/llm/llm.py:
"OLLAMA": [
    ("mistral", "mistral"),
    ("llama2", "llama2"),
]
}
$ ollama serve
time=2024-04-01T18:35:10.822+02:00 level=INFO source=images.go:860 msg="total blobs: 33"
time=2024-04-01T18:35:10.852+02:00 level=INFO source=images.go:867 msg="total unused blobs removed: 0"
time=2024-04-01T18:35:10.854+02:00 level=INFO source=routes.go:995 msg="Listening on 127.0.0.1:11434 (version 0.1.23)"
time=2024-04-01T18:35:10.854+02:00 level=INFO source=payload_common.go:106 msg="Extracting dynamic libraries..."
time=2024-04-01T18:35:10.870+02:00 level=INFO source=payload_common.go:145 msg="Dynamic LLM libraries [metal]"
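Note the bind address in that serve log: "Listening on 127.0.0.1:11434". A loopback-only bind is reachable from the same machine but not from other hosts or, in the usual bridge-network setup, from inside containers; Ollama's documented OLLAMA_HOST environment variable (e.g. OLLAMA_HOST=0.0.0.0) changes the bind address. A small stdlib check makes the distinction concrete:

```python
import ipaddress

def reachable_beyond_localhost(listen_addr: str) -> bool:
    """True if the bind address from an `ollama serve` log line could be
    reached from another host or container, i.e. it is not loopback-only."""
    host = listen_addr.rsplit(":", 1)[0]  # strip the port
    return not ipaddress.ip_address(host).is_loopback
```

So "127.0.0.1:11434" is host-local only, while "0.0.0.0:11434" would also accept connections from a docker network.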
Your log contains "OLLAMA":[["mistral","mistral:latest"],["llama2","llama2:latest"]], which means Devika did connect to Ollama and fetch the models. So what's the problem?
Ollama and Devika are on the same network, and can be accessed from the Devika container using curl
Couple of Questions:
Ollama and Devika are on the same network, and can be accessed from the Devika container using curl
I agree with your point: if they are on the same network, it finds the server. But in case I want to try it from the code, not the Docker engine, how can I combine these two entities: the Devika code plus a separately installed Ollama server? Please advise on this scenario. Thanks.
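For the "from code, not Docker" setup, the two pieces only need to agree on one URL: run `ollama serve` natively (it defaults to 127.0.0.1:11434, as the serve log in this thread shows) and point the Devika process at that address. A hedged sketch of such an endpoint-resolution helper; OLLAMA_ENDPOINT is a hypothetical variable name here, the real key may live in Devika's config file:

```python
import os

# Default of a natively installed `ollama serve` (see the serve log above).
DEFAULT_OLLAMA_ENDPOINT = "http://127.0.0.1:11434"

def resolve_ollama_endpoint() -> str:
    """Return the Ollama URL: an override if set, else the native default.

    OLLAMA_ENDPOINT is an assumed name for illustration only; check Devika's
    own configuration for the actual setting.
    """
    return os.environ.get("OLLAMA_ENDPOINT", DEFAULT_OLLAMA_ENDPOINT)
```

With a fallback like this, the same code works both against docker-compose (override set to http://ollama:11434) and against a separately installed local server (no override needed).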
Has anyone found a solution yet?
Couple of Questions:
- I don't want to use Docker to configure the application; I am trying to start the backend server from the code. What is the solution for it to recognise the local Ollama server?
- I have manually added these entries in the llm.py file; they are not fetched from the Ollama server.
Share the list of models from the Ollama terminal, because if any Ollama models are present the library fetches them automatically. Also, are you using the latest version of Ollama?
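That "fetches them automatically" step presumably maps the server's model list to the (display name, model id) pairs visible in the /api/data log ("OLLAMA":[["mistral","mistral:latest"],["llama2","llama2:latest"]]). A sketch of that mapping, assuming the {"models": [{"name": ...}, ...]} response shape the ollama Python client returned at the time; the function name is illustrative, not Devika's:

```python
def to_model_pairs(list_response: dict) -> list:
    """Map an Ollama list() response to (display name, model id) pairs.

    e.g. {"models": [{"name": "llama2:latest"}]} -> [("llama2", "llama2:latest")]
    """
    return [(m["name"].split(":")[0], m["name"]) for m in list_response["models"]]
```

If client.list() returns an empty "models" list (no models pulled yet), the OLLAMA section of /api/data would likewise be empty.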
Does this issue still persist? If so, can you run this code:

import ollama

client = ollama.Client()
print(client.list()["models"])

and in the terminal:

ollama      # check if it's installed on your system
ollama list # list the locally available models