sentient-engineering / sentient

the framework/SDK that lets you build browser-controlling agents in 3 lines of code. join the chat @ https://discord.gg/umgnyQU2K8
MIT License

Ollama with llama3.1 not working #13


gavinblair commented 1 week ago

Here is the output I get, running with Ollama locally (just the example from the README)

Starting orchestrator
Browser started and ready
Executing command play shape of you on youtube
==================================================
Current State: agentq_base
Agent: sentient
Current Thought:
Plan: none
Completed Tasks: none
==================================================
Error executing the command play shape of you on youtube: RetryError[<Future at 0x10fd8d090 state=finished raised ValidationError>]
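
For reference, the snippet being run is the example from the README, roughly reconstructed below (the exact sentient.invoke signature is an approximation of the README, not copied verbatim):

# approximate reconstruction of the README example - treat the API surface as an assumption
import asyncio

from sentient import sentient

# a single call that drives the local browser toward the stated goal
result = asyncio.run(sentient.invoke(goal="play shape of you on youtube"))
print(result)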
nischalj10 commented 1 week ago

hey @gavinblair - this primarily stems from the fact that the model was not able to generate a valid output. can you tell me which quantised version of llama 3.1 you are using?
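
for context, the error surfaces roughly like this: the agent validates the model's reply against a pydantic schema inside a tenacity retry loop, and once every attempt fails validation, tenacity wraps the last failure in a RetryError. a minimal sketch (illustrative names only, not sentient's actual internals):

# illustrative sketch - schema and function names are assumptions, not sentient's code
from pydantic import BaseModel
from tenacity import retry, stop_after_attempt

class AgentOutput(BaseModel):  # hypothetical structured-output schema
    thought: str
    plan: str

@retry(stop=stop_after_attempt(3))  # tenacity wraps the final failure in a RetryError
def call_llm() -> AgentOutput:
    raw = '{"thought": "..."}'  # model reply missing required fields
    return AgentOutput.model_validate_json(raw)  # pydantic v2 -> ValidationError

try:
    call_llm()
except Exception as e:
    print(e)  # RetryError[<Future at 0x... state=finished raised ValidationError>]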

gavinblair commented 1 week ago

8B. I'm using Q4_0. I'll try with Q5_K_M once I figure out how to use a different base url.

nischalj10 commented 6 days ago

maybe try 8b-instruct-q4_0; folks in the community have been able to make it work with llama 3.1 8b models
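
(assuming the standard Ollama CLI, that tag can be pulled with: ollama pull llama3.1:8b-instruct-q4_0)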

s-github-2 commented 6 days ago

I hit the same RetryError[<Future at 0x182e2357a60 state=finished raised ValidationError>] issue with Ollama that was filed here. The model I was using was llama3:8b. Copied below is the partial output from the ollama serve command run in a terminal:

llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0: general.architecture str = llama
llama_model_loader: - kv   1: general.name str = Meta-Llama-3-8B-Instruct
llama_model_loader: - kv   2: llama.block_count u32 = 32
llama_model_loader: - kv   3: llama.context_length u32 = 8192
llama_model_loader: - kv   4: llama.embedding_length u32 = 4096
llama_model_loader: - kv   5: llama.feed_forward_length u32 = 14336
llama_model_loader: - kv   6: llama.attention.head_count u32 = 32
llama_model_loader: - kv   7: llama.attention.head_count_kv u32 = 8
llama_model_loader: - kv   8: llama.rope.freq_base f32 = 500000.000000
llama_model_loader: - kv   9: llama.attention.layer_norm_rms_epsilon f32 = 0.000010
llama_model_loader: - kv  10: general.file_type u32 = 2
llama_model_loader: - kv  11: llama.vocab_size u32 = 128256
llama_model_loader: - kv  12: llama.rope.dimension_count u32 = 128
llama_model_loader: - kv  13: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv  14: tokenizer.ggml.pre str = llama-bpe
llama_model_loader: - kv  15: tokenizer.ggml.tokens arr[str,128256] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv  16: tokenizer.ggml.token_type arr[i32,128256] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv  17: tokenizer.ggml.merges arr[str,280147] = ["Ġ Ġ", "Ġ ĠĠĠ", "ĠĠ ĠĠ", "...
llama_model_loader: - kv  18: tokenizer.ggml.bos_token_id u32 = 128000
llama_model_loader: - kv  19: tokenizer.ggml.eos_token_id u32 = 128009
llama_model_loader: - kv  20: tokenizer.chat_template str = {% set loop_messages = messages %}{% ...
llama_model_loader: - kv  21: general.quantization_version u32 = 2
llama_model_loader: - type  f32:   65 tensors
llama_model_loader: - type q4_0:  225 tensors
llama_model_loader: - type q6_K:    1 tensors

s-github-2 commented 6 days ago

I tried llama3.1:8b-instruct-q4_0 and it gave me the same error:

Starting orchestrator
Browser started and ready
Executing command play shape of you on youtube

==================================================

Current State: agentq_base
Agent: sentient
Current Thought:
Plan: none
Completed Tasks: none

==================================================

Error executing the command play shape of you on youtube: RetryError[<Future at 0x21faa7ade40 state=finished raised ValidationError>]

x676f64 commented 6 days ago

> I'm using Q4_0. I'll try with Q5_K_M once I figure out how to use a different base url.

I tried with q5_k_m and got the same result. I got the same result with q4 as well.

TofailHiary commented 2 days ago

I'm encountering the same issue on Windows 10 with llama3.1:latest, and I’ve tried other models but faced the same problem. I believe the issue might be related to this code snippet:

class OllamaProvider(LLMProvider):
    def get_client_config(self) -> Dict[str, str]:
        return {
            "api_key": "ollama",
            "base_url": "http://localhost:11434/v1/",
        }

    def get_provider_name(self) -> str:
        return "ollama"

As far as I understand, Ollama doesn’t require an API key, and the base URL when installed locally should be http://localhost:11434.
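
A quick way to sanity-check that endpoint directly is something like the sketch below, using the openai client (the model tag is just an example; the client library requires some api_key string even though Ollama ignores its value):

from openai import OpenAI

# point the client at Ollama's OpenAI-compatible endpoint under /v1;
# the api_key value is ignored by Ollama but must be a non-empty string
client = OpenAI(base_url="http://localhost:11434/v1/", api_key="ollama")

resp = client.chat.completions.create(
    model="llama3.1:8b-instruct-q4_0",  # example tag - use whatever is pulled locally
    messages=[{"role": "user", "content": "Say hello."}],
)
print(resp.choices[0].message.content)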

Additionally, I encountered an authentication error with the Groq API, which I resolved by modifying the provider.py file as follows:

class GroqProvider(LLMProvider):
    def get_client_config(self) -> Dict[str, str]:
        return {
            "api_key": os.environ.get("GROQ_API_KEY"),
            "base_url": "https://api.groq.com/openai/v1/",
        }

    def get_provider_name(self) -> str:
        return "groq"
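
With that change, the key is read from the environment, so set GROQ_API_KEY (e.g. export GROQ_API_KEY=... on Unix, or an environment variable on Windows) before launching the agent.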

I hope this gets resolved soon. If I find a solution, I’ll let you know.