stitionai / devika

Devika is an Agentic AI Software Engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective. Devika aims to be a competitive open-source alternative to Devin by Cognition AI.
MIT License

Ollama is not recognised by Devika on my local machine #299

Open kiran-chinthala opened 5 months ago

kiran-chinthala commented 5 months ago

python3 devika.py
24.04.01 18:39:41: root: INFO : Initializing Devika...
24.04.01 18:39:41: root: INFO : Initializing Prerequisites Jobs...
24.04.01 18:39:41: root: INFO : Loading sentence-transformer BERT models...
24.04.01 18:39:41: root: INFO : BERT model loaded successfully.
24.04.01 18:39:41: root: WARNING: Ollama not available
24.04.01 18:39:41: root: WARNING: run ollama server to use ollama models otherwise use other models
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks... To disable this warning, you can either:
  • Avoid using tokenizers before the fork if possible
  • Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks... To disable this warning, you can either:
  • Avoid using tokenizers before the fork if possible
  • Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
24.04.01 18:39:41: root: INFO : Devika is up and running!
24.04.01 18:39:41: root: INFO : /api/data GET
24.04.01 18:39:41: root: DEBUG : /api/data GET - Response: {"models":{"CLAUDE":[["Claude 3 Opus","claude-3-opus-20240229"],["Claude 3 Sonnet","claude-3-sonnet-20240229"],["Claude 3 Haiku","claude-3-haiku-20240307"]],"GOOGLE":[["Gemini 1.0 Pro","gemini-pro"]],"GROQ":[["GROQ Mixtral","mixtral-8x7b-32768"],["GROQ LLAMA2 70B","llama2-70b-4096"],["GROQ GEMMA 7B IT","gemma-7b-it"]],"MISTRAL":[["Mistral 7b","open-mistral-7b"],["Mistral 8x7b","open-mixtral-8x7b"],["Mistral Medium","mistral-medium-latest"],["Mistral Small","mistral-small-latest"],["Mistral Large","mistral-large-latest"]],"OLLAMA":[["mistral","mistral:latest"],["llama2","llama2:latest"]],"OPENAI":[["GPT-4 Turbo","gpt-4-0125-preview"],["GPT-3.5","gpt-3.5-turbo-0125"]]},"projects":[],"search_engines":["Bing","Google","DuckDuckGo"]}

24.04.01 18:39:41: root: INFO : /api/create-project POST
24.04.01 18:39:41: root: DEBUG : /api/create-project POST - Response: {"message":"Project created"}

24.04.01 18:39:41: root: INFO : /api/get-agent-state POST
24.04.01 18:39:41: root: DEBUG : /api/get-agent-state POST - Response: {"state":null}

24.04.01 18:39:49: root: INFO : /api/calculate-tokens POST
24.04.01 18:39:49: root: DEBUG : /api/calculate-tokens POST - Response: {"token_usage":1}

24.04.01 18:39:49: root: INFO : /api/calculate-tokens POST
24.04.01 18:39:49: root: DEBUG : /api/calculate-tokens POST - Response: {"token_usage":1}

24.04.01 18:39:50: root: INFO : /api/calculate-tokens POST
24.04.01 18:39:50: root: DEBUG : /api/calculate-tokens POST - Response: {"token_usage":1}

24.04.01 18:39:50: root: INFO : /api/calculate-tokens POST
24.04.01 18:39:50: root: DEBUG : /api/calculate-tokens POST - Response: {"token_usage":2}

24.04.01 18:39:50: root: INFO : /api/calculate-tokens POST
24.04.01 18:39:50: root: DEBUG : /api/calculate-tokens POST - Response: {"token_usage":2}

24.04.01 18:39:50: root: INFO : /api/calculate-tokens POST
24.04.01 18:39:50: root: DEBUG : /api/calculate-tokens POST - Response: {"token_usage":3}

Exception in thread Thread-3 ():
Traceback (most recent call last):
  File "/Users/kiran/miniconda3/lib/python3.11/threading.py", line 1038, in _bootstrap_inner
    self.run()
  File "/Users/kiran/miniconda3/lib/python3.11/threading.py", line 975, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/kiran/Documents/GitHub/devika/devika.py", line 94, in
    thread = Thread(target=lambda: agent.execute(message, project_name, search_engine))
  File "/Users/kiran/Documents/GitHub/devika/src/agents/agent.py", line 263, in execute
    plan = self.planner.execute(prompt, project_name_from_user)
  File "/Users/kiran/Documents/GitHub/devika/src/agents/planner/planner.py", line 70, in execute
    response = self.llm.inference(prompt, project_name)
  File "/Users/kiran/Documents/GitHub/devika/src/llm/llm.py", line 98, in inference
    response = model.inference(self.model_id, prompt).strip()
  File "/Users/kiran/Documents/GitHub/devika/src/llm/ollama_client.py", line 20, in inference
    response = self.client.generate(
AttributeError: 'NoneType' object has no attribute 'generate'

I have updated /Users/kiran/Documents/GitHub/devika/src/llm/llm.py with the following entries:

        "OLLAMA": [
            ("mistral", "mistral"),
            ("llama2", "llama2"),
        ]
    }

$ ollama serve
time=2024-04-01T18:35:10.822+02:00 level=INFO source=images.go:860 msg="total blobs: 33"
time=2024-04-01T18:35:10.852+02:00 level=INFO source=images.go:867 msg="total unused blobs removed: 0"
time=2024-04-01T18:35:10.854+02:00 level=INFO source=routes.go:995 msg="Listening on 127.0.0.1:11434 (version 0.1.23)"
time=2024-04-01T18:35:10.854+02:00 level=INFO source=payload_common.go:106 msg="Extracting dynamic libraries..."
time=2024-04-01T18:35:10.870+02:00 level=INFO source=payload_common.go:145 msg="Dynamic LLM libraries [metal]"
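
A note on the traceback above: it shows that self.client inside src/llm/ollama_client.py is None when generate() is called, which lines up with the earlier "Ollama not available" warning — the client object only gets created if the Ollama server is reachable when Devika starts. The sketch below is purely an illustration of that pattern (the class and attribute names are assumptions based on the traceback, not Devika's actual code), using the ollama Python package:

import ollama

class OllamaWrapper:
    """Hypothetical stand-in for Devika's Ollama client wrapper."""

    def __init__(self, host: str = "http://127.0.0.1:11434"):
        try:
            self.client = ollama.Client(host=host)
            self.client.list()  # probe the server once at startup
        except Exception:
            # Server not reachable at startup: the client stays None, and any
            # later generate() call fails with
            # AttributeError: 'NoneType' object has no attribute 'generate'
            self.client = None

    def inference(self, model_id: str, prompt: str) -> str:
        if self.client is None:
            raise RuntimeError("Ollama server was not reachable when the process started")
        response = self.client.generate(model=model_id, prompt=prompt)
        return response["response"].strip()

If that is indeed the pattern, the server has to be reachable at the URL Devika is configured to use before the backend is started, or the backend has to be restarted after ollama serve is up.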

obliviousz commented 5 months ago

So you ran ollama serve first and then tried to run python3 devika.py, or the reverse?

kiran-chinthala commented 5 months ago

So you ran ollama serve first and then tried to run python3 devika.py, or the reverse?

Yes, I did exactly that: first ollama serve, and then I started the Devika server.

obliviousz commented 5 months ago

Does

ollama run llama2

work on your machine?

kiran-chinthala commented 5 months ago

Does

ollama run llama2

work on your machine?

Yes, it does. Below is the command and its output:

$ ollama run llama2

tell me a joke
Sure, here's one:

Why don't scientists trust atoms? Because they make up everything!

I hope that brought a smile to your face!

Send a message (/? for help)

ShiFangJuMie commented 5 months ago

24.04.02 18:34:11: root: INFO : Initializing Devika...
24.04.02 18:34:11: root: INFO : Initializing Prerequisites Jobs...
24.04.02 18:34:18: root: INFO : Loading sentence-transformer BERT models...
24.04.02 18:34:23: root: INFO : BERT model loaded successfully.
24.04.02 18:34:26: root: WARNING: Ollama not available
24.04.02 18:34:26: root: WARNING: run ollama server to use ollama models otherwise use other models
24.04.02 18:34:28: root: INFO : Devika is up and running!

docker exec -it devika-devika-backend-engine-1 bash
nonroot@60c02bb85332:~$ curl -f http://ollama:11434
Ollama is running

alexdodson commented 5 months ago

There was a change to src/llm/ollama_client.py in commit 7cd567b. Could it be that change? I am unsure if it was working previously, though.

samuelbirocchi commented 5 months ago

I think that ollama with docker will only work if you run it using the docker-compose file
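
One way to see which endpoint the backend can actually reach, in the spirit of the curl test in the earlier comment: inside the compose network the Ollama service is addressed by its service name (http://ollama:11434 in the curl output above), while a backend started natively would normally talk to http://127.0.0.1:11434. A small, standard-library-only check (the two URLs are the ones seen in this thread; adjust them to your setup):

import urllib.error
import urllib.request

# Try both candidate endpoints mentioned in this thread: the compose service
# name and the default address of a natively installed Ollama server.
for url in ("http://ollama:11434", "http://127.0.0.1:11434"):
    try:
        with urllib.request.urlopen(url, timeout=3) as resp:
            print(url, "->", resp.read().decode().strip())  # expect "Ollama is running"
    except (urllib.error.URLError, OSError) as exc:
        print(url, "-> not reachable:", exc)

Run it from wherever the Devika backend runs (inside the container for the docker-compose setup, in your shell for a source checkout); whichever URL answers is the one Devika's Ollama endpoint needs to point at.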

ARajgor commented 5 months ago

You have to update the Ollama URL if it's not the default one.

As your log contains OLLAMA":[["mistral","mistral:latest"],["llama2","llama2:latest"]], Devika did connect to Ollama and fetch the models. So what's the problem?
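
To make "update the Ollama URL" concrete at the library level: the ollama Python client accepts an explicit host, and Devika's backend needs to be pointed at that same address through its own configuration (check the sample config in your checkout for the Ollama endpoint entry). The snippet below is only an illustration; DEVIKA_OLLAMA_URL is a hypothetical environment variable used for this example, not something Devika actually reads:

import os
import ollama

# Hypothetical override for illustration only: shows how a non-default base URL
# is passed to the ollama client.
base_url = os.environ.get("DEVIKA_OLLAMA_URL", "http://127.0.0.1:11434")
client = ollama.Client(host=base_url)
print(client.list()["models"])  # raises a connection error if the URL is wrong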

ARajgor commented 5 months ago

https://github.com/stitionai/devika/issues/300

ShiFangJuMie commented 5 months ago

I think that ollama with docker will only work if you run it using the docker-compose file

Ollama and Devika are on the same network, and can be accessed from the Devika container using curl

kiran-chinthala commented 5 months ago

You have to update the Ollama URL if it's not the default one.

As your log contains OLLAMA":[["mistral","mistral:latest"],["llama2","llama2:latest"]], Devika did connect to Ollama and fetch the models. So what's the problem?

Couple of Questions:

  1. I don't want to use Docker to configure the application; I am trying to start the backend server from source. What is the solution to make it recognise my local Ollama server?
  2. I have manually added these entries in the llm.py file; they are not fetched from the Ollama server.

kiran-chinthala commented 5 months ago

Ollama and Devika are on the same network, and can be accessed from the Devika container using curl

I agree with your point: if they are on the same network, Devika identifies it. But in my case I want to run from source, not the Docker engine. How can I combine these two entities: the Devika code plus a separately installed Ollama server? Please advise on this scenario. Thanks.
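
For the source-checkout scenario, a hedged suggestion rather than an official Devika workflow: start ollama serve first, confirm it answers on the default port, and only then launch python3 devika.py from the same environment, so the backend's startup probe can find the server. A small pre-flight sketch:

import subprocess
import urllib.request

# Pre-flight check before launching the backend from source. The URL is the
# Ollama default; adjust it if your server listens somewhere else.
try:
    with urllib.request.urlopen("http://127.0.0.1:11434", timeout=3) as resp:
        print(resp.read().decode().strip())  # expect "Ollama is running"
except Exception as exc:
    raise SystemExit(f"Start `ollama serve` first; the server is not reachable: {exc}")

# Launch Devika's backend from the repository root once Ollama is confirmed up.
subprocess.run(["python3", "devika.py"], check=True)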

ChanghongYangR commented 5 months ago

Has anyone found a solution yet?

ARajgor commented 5 months ago

Couple of Questions:

  1. I don't want to use Docker to configure the application; I am trying to start the backend server from source. What is the solution to make it recognise my local Ollama server?
  2. I have manually added these entries in the llm.py file; they are not fetched from the Ollama server.

Share the list of models from the Ollama terminal, because if any Ollama models are present, the library fetches them automatically. Also, are you using the latest version of Ollama?

ARajgor commented 4 months ago

Does this issue still persist? If so, can you run this code:

import ollama

client = ollama.Client()          # defaults to the local Ollama server, http://127.0.0.1:11434
print(client.list()["models"])    # should print the models available locally

and in the terminal:

ollama # check if it's installed on your system
ollama list