gabilanbrc opened 1 month ago
Also, after switching to gemma2:2b I get faster results, but the error remains:
```
2024-09-03 17:53:22 2024-09-03 20:53:22,200 - wren-ai-service - ERROR - Error in GenerationPostProcessor: 'results' (post_processors.py:48)
2024-09-03 17:53:22 Traceback (most recent call last):
2024-09-03 17:53:22   File "/src/pipelines/ask/components/post_processors.py", line 29, in run
2024-09-03 17:53:22     cleaned_generation_result = orjson.loads(
2024-09-03 17:53:22                                 ^^^^^^^^^^^^^
2024-09-03 17:53:22 KeyError: 'results'
2024-09-03 17:53:22 2024-09-03 20:53:22,200 - wren-ai-service - ERROR - ask pipeline - NO_RELEVANT_SQL: Which is the oldest team? (ask.py:233)
```
@gabilanbrc
thanks for reaching out! This is actually a known issue: the root cause is that some parts of the AI pipeline are too complex for weaker LLMs. The good news is we have improved the AI pipeline recently, and the next release will alleviate this issue. Stay tuned!
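For context: the error means the model's reply, once parsed as JSON, does not contain the "results" key the pipeline expects. Roughly speaking (this is just a sketch of the failure mode, not the actual post_processors.py code), the parsing looks like this, and weaker models often fail the shape check:

```python
import orjson

def extract_results(raw_reply: str) -> list:
    """Sketch: the pipeline expects the LLM to answer with a JSON
    object shaped like {"results": [...]}."""
    try:
        parsed = orjson.loads(raw_reply)
    except orjson.JSONDecodeError:
        # Smaller models often wrap the JSON in prose or a markdown fence,
        # so parsing fails outright.
        return []
    if not isinstance(parsed, dict):
        return []
    # If the model returns valid JSON but a different shape,
    # parsed["results"] raises exactly the KeyError in the log above.
    return parsed.get("results", [])
```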
@gabilanbrc could you install the latest version and try again?
Hi!
I did, and now the error is different:
```
2024-09-05 13:58:06 ********************************************************************************
2024-09-05 13:58:06 > construct_retrieval_results [src.pipelines.ask.retrieval.construct_retrieval_results()] encountered an error<
2024-09-05 13:58:06 > Node inputs:
2024-09-05 13:58:06 {'construct_db_schemas': "<Task finished name='Task-275' "
2024-09-05 13:58:06                          'coro=<AsyncGraphAda...',
2024-09-05 13:58:06  'dbschema_retrieval': "<Task finished name='Task-274' coro=<AsyncGraphAda...",
2024-09-05 13:58:06  'filter_columns_in_tables': "<Task finished name='Task-277' "
2024-09-05 13:58:06                              'coro=<AsyncGraphAda...'}
2024-09-05 13:58:06 ********************************************************************************
2024-09-05 13:58:06 Traceback (most recent call last):
2024-09-05 13:58:06   File "/app/.venv/lib/python3.12/site-packages/hamilton/async_driver.py", line 122, in new_fn
2024-09-05 13:58:06     await fn(**fn_kwargs) if asyncio.iscoroutinefunction(fn) else fn(**fn_kwargs)
2024-09-05 13:58:06                                                                   ^^^^^^^^^^^^^^^
2024-09-05 13:58:06   File "/src/utils.py", line 95, in wrapper_timer
2024-09-05 13:58:06     return func(*args, **kwargs)
2024-09-05 13:58:06            ^^^^^^^^^^^^^^^^^^^^^
2024-09-05 13:58:06   File "/app/.venv/lib/python3.12/site-packages/langfuse/decorators/langfuse_decorator.py", line 225, in sync_wrapper
2024-09-05 13:58:06     self._handle_exception(observation, e)
2024-09-05 13:58:06   File "/app/.venv/lib/python3.12/site-packages/langfuse/decorators/langfuse_decorator.py", line 428, in _handle_exception
2024-09-05 13:58:06     raise e
2024-09-05 13:58:06   File "/app/.venv/lib/python3.12/site-packages/langfuse/decorators/langfuse_decorator.py", line 223, in sync_wrapper
2024-09-05 13:58:06     result = func(*args, **kwargs)
2024-09-05 13:58:06              ^^^^^^^^^^^^^^^^^^^^^
2024-09-05 13:58:06   File "/src/pipelines/ask/retrieval.py", line 243, in construct_retrieval_results
2024-09-05 13:58:06     columns_and_tables_needed = orjson.loads(filter_columns_in_tables["replies"][0])[
2024-09-05 13:58:06                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-09-05 13:58:06 KeyError: 'results'
2024-09-05 13:58:06 -------------------------------------------------------------------
2024-09-05 13:58:06 Oh no an error! Need help with Hamilton?
2024-09-05 13:58:06 Join our slack and ask for help!
2024-09-05 13:58:06 https://join.slack.com/t/hamilton-opensource/shared_invite/zt-1bjs72asx-wcUTgH7q7QX1igiQ5bbdcg
2024-09-05 13:58:06 -------------------------------------------------------------------
2024-09-05 13:58:06
2024-09-05 13:58:06 2024-09-05 16:58:06,703 - wren-ai-service - ERROR - ask pipeline - OTHERS: 'results' (ask.py:307)
2024-09-05 13:58:06 Traceback (most recent call last):
2024-09-05 13:58:06   File "/src/web/v1/services/ask.py", line 145, in ask
2024-09-05 13:58:06     retrieval_result = await self._pipelines["retrieval"].run(
2024-09-05 13:58:06                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-09-05 13:58:06   File "/src/utils.py", line 119, in wrapper_timer
2024-09-05 13:58:06     return await process(func, *args, **kwargs)
2024-09-05 13:58:06            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-09-05 13:58:06   File "/src/utils.py", line 103, in process
2024-09-05 13:58:06     return await func(*args, **kwargs)
2024-09-05 13:58:06            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-09-05 13:58:06   File "/app/.venv/lib/python3.12/site-packages/langfuse/decorators/langfuse_decorator.py", line 188, in async_wrapper
2024-09-05 13:58:06     self._handle_exception(observation, e)
2024-09-05 13:58:06   File "/app/.venv/lib/python3.12/site-packages/langfuse/decorators/langfuse_decorator.py", line 428, in _handle_exception
2024-09-05 13:58:06     raise e
2024-09-05 13:58:06   File "/app/.venv/lib/python3.12/site-packages/langfuse/decorators/langfuse_decorator.py", line 186, in async_wrapper
2024-09-05 13:58:06     result = await func(*args, **kwargs)
2024-09-05 13:58:06              ^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-09-05 13:58:06   File "/src/pipelines/ask/retrieval.py", line 341, in run
2024-09-05 13:58:06     return await self._pipe.execute(
2024-09-05 13:58:06            ^^^^^^^^^^^^^^^^^^^^^^^^^
2024-09-05 13:58:06   File "/app/.venv/lib/python3.12/site-packages/hamilton/async_driver.py", line 368, in execute
2024-09-05 13:58:06     raise e
2024-09-05 13:58:06   File "/app/.venv/lib/python3.12/site-packages/hamilton/async_driver.py", line 359, in execute
2024-09-05 13:58:06     outputs = await self.raw_execute(final_vars, overrides, display_graph, inputs=inputs)
2024-09-05 13:58:06               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-09-05 13:58:06   File "/app/.venv/lib/python3.12/site-packages/hamilton/async_driver.py", line 320, in raw_execute
2024-09-05 13:58:06     raise e
2024-09-05 13:58:06   File "/app/.venv/lib/python3.12/site-packages/hamilton/async_driver.py", line 315, in raw_execute
2024-09-05 13:58:06     results = await await_dict_of_tasks(task_dict)
2024-09-05 13:58:06               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-09-05 13:58:06   File "/app/.venv/lib/python3.12/site-packages/hamilton/async_driver.py", line 23, in await_dict_of_tasks
2024-09-05 13:58:06     coroutines_gathered = await asyncio.gather(*coroutines)
2024-09-05 13:58:06                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-09-05 13:58:06   File "/app/.venv/lib/python3.12/site-packages/hamilton/async_driver.py", line 36, in process_value
2024-09-05 13:58:06     return await val
2024-09-05 13:58:06            ^^^^^^^^^
2024-09-05 13:58:06   File "/app/.venv/lib/python3.12/site-packages/hamilton/async_driver.py", line 122, in new_fn
2024-09-05 13:58:06     await fn(**fn_kwargs) if asyncio.iscoroutinefunction(fn) else fn(**fn_kwargs)
2024-09-05 13:58:06                                                                   ^^^^^^^^^^^^^^^
2024-09-05 13:58:06   File "/src/utils.py", line 95, in wrapper_timer
2024-09-05 13:58:06     return func(*args, **kwargs)
2024-09-05 13:58:06            ^^^^^^^^^^^^^^^^^^^^^
2024-09-05 13:58:06   File "/app/.venv/lib/python3.12/site-packages/langfuse/decorators/langfuse_decorator.py", line 225, in sync_wrapper
2024-09-05 13:58:06     self._handle_exception(observation, e)
2024-09-05 13:58:06   File "/app/.venv/lib/python3.12/site-packages/langfuse/decorators/langfuse_decorator.py", line 428, in _handle_exception
2024-09-05 13:58:06     raise e
2024-09-05 13:58:06   File "/app/.venv/lib/python3.12/site-packages/langfuse/decorators/langfuse_decorator.py", line 223, in sync_wrapper
2024-09-05 13:58:06     result = func(*args, **kwargs)
2024-09-05 13:58:06              ^^^^^^^^^^^^^^^^^^^^^
2024-09-05 13:58:06   File "/src/pipelines/ask/retrieval.py", line 243, in construct_retrieval_results
2024-09-05 13:58:06     columns_and_tables_needed = orjson.loads(filter_columns_in_tables["replies"][0])[
2024-09-05 13:58:06                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-09-05 13:58:06 KeyError: 'results'
2024-09-05 13:58:07 INFO: 172.19.0.6:50196 - "GET /v1/asks/e3c5cc21-beee-43af-8300-c202c8686656/result HTTP/1.1" 200 OK
```
The config I used was:
```
LLM_PROVIDER=ollama_llm # openai_llm, azure_openai_llm, ollama_llm
GENERATION_MODEL=llama3.1:8b # gemma2:2b, llama3.1:8b
GENERATION_MODEL_KWARGS={"temperature": 0, "n": 1, "max_tokens": 4096, "response_format": {"type": "json_object"}}
COLUMN_INDEXING_BATCH_SIZE=50
TABLE_RETRIEVAL_SIZE=10
TABLE_COLUMN_RETRIEVAL_SIZE=1000
LLM_OLLAMA_URL=http://host.docker.internal:11434
EMBEDDER_PROVIDER=ollama_embedder # openai_embedder, azure_openai_embedder, ollama_embedder
EMBEDDING_MODEL=mxbai-embed-large
EMBEDDING_MODEL_DIMENSION=1024
```
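(As a quick sanity check, independent of Wren, one can query Ollama directly and see whether the model honors JSON output at all. A rough sketch, assuming Ollama is reachable on localhost:11434; the prompt is a made-up example, not anything the pipeline actually sends:)

```python
import json
import requests

# Hypothetical smoke test: ask the model for the JSON shape the pipeline
# expects. "format": "json" tells Ollama to constrain output to valid JSON.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.1:8b",
        "prompt": 'Answer only with a JSON object of the form {"results": [...]}. '
                  "List three planet names.",
        "format": "json",
        "stream": False,
    },
    timeout=120,
)
reply = resp.json()["response"]
print(reply)
print("has 'results' key:", "results" in json.loads(reply))
```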
For initial testing I can try to use the free Gemini API. Can someone give me an example configuration for that?
I'm having the same error...
@gabilanbrc @nyeeldzn could you try the latest version and check if the error happens again?
@gabilanbrc for setting up Gemini, here are the instructions: https://docs.getwren.ai/oss/installation/custom_llm#running-wren-ai-with-your-custom-llm-or-document-store. Note that we currently only support text generation models.
Relevant instructions for setting up Gemini behind an OpenAI-compatible endpoint: https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-vertex-using-openai-library#generativeaionvertexai_gemini_chat_completions_non_streaming-python_vertex_ai_sdk
and the `.env.ai` should look like this:

```
## LLM
# openai_llm, azure_openai_llm, ollama_llm
LLM_PROVIDER=openai_llm
LLM_TIMEOUT=120
GENERATION_MODEL=google/gemini-1.5-flash-002
GENERATION_MODEL_KWARGS={"temperature": 0, "n": 1, "max_tokens": 4096, "response_format": {"type": "json_object"}}
# openai or openai-api-compatible
LLM_OPENAI_API_KEY={API_KEY}
LLM_OPENAI_API_BASE=https://{LOCATION}-aiplatform.googleapis.com/v1beta1/projects/{PROJECT_ID}/locations/{LOCATION}/endpoints/openapi
# azure_openai
LLM_AZURE_OPENAI_API_KEY=
LLM_AZURE_OPENAI_API_BASE=
LLM_AZURE_OPENAI_VERSION=
# ollama
LLM_OLLAMA_URL=http://host.docker.internal:11434/
## EMBEDDER
# openai_embedder, azure_openai_embedder, ollama_embedder
EMBEDDER_PROVIDER=openai_embedder
EMBEDDER_TIMEOUT=120
# supported embedding models providers by qdrant: https://qdrant.tech/documentation/embeddings/
EMBEDDING_MODEL=text-embedding-3-large
EMBEDDING_MODEL_DIMENSION=3072
# openai or openai-api-compatible
EMBEDDER_OPENAI_API_KEY=<API_KEY>
EMBEDDER_OPENAI_API_BASE=https://api.openai.com/v1
# azure_openai
EMBEDDER_AZURE_OPENAI_API_KEY=
EMBEDDER_AZURE_OPENAI_API_BASE=
EMBEDDER_AZURE_OPENAI_VERSION=
# ollama
EMBEDDER_OLLAMA_URL=http://host.docker.internal:11434/
## DOCUMENT_STORE
DOCUMENT_STORE_PROVIDER=qdrant
QDRANT_HOST=qdrant
## Langfuse: https://langfuse.com/
# empty means disabled
LANGFUSE_ENABLE=
LANGFUSE_SECRET_KEY=
LANGFUSE_PUBLIC_KEY=
LANGFUSE_HOST=https://cloud.langfuse.com
```
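Before pointing Wren at it, it's worth verifying that the Vertex AI endpoint answers OpenAI-style chat completions at all. A minimal sketch following the Google doc linked above (PROJECT_ID and LOCATION are placeholders you must fill in; it assumes the google-auth and openai packages are installed):

```python
import google.auth
import google.auth.transport.requests
import openai

# Placeholders -- substitute your own project and region.
PROJECT_ID = "your-project-id"
LOCATION = "us-central1"

# Vertex AI's OpenAI-compatible endpoint authenticates with a
# short-lived access token rather than a static API key.
credentials, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
credentials.refresh(google.auth.transport.requests.Request())

client = openai.OpenAI(
    base_url=f"https://{LOCATION}-aiplatform.googleapis.com/v1beta1/"
             f"projects/{PROJECT_ID}/locations/{LOCATION}/endpoints/openapi",
    api_key=credentials.token,
)
resp = client.chat.completions.create(
    model="google/gemini-1.5-flash-002",
    messages=[{"role": "user", "content": "Reply with one word: ping"}],
)
print(resp.choices[0].message.content)
```

One caveat: the access token from google-auth expires after roughly an hour, so whatever value you put in `LLM_OPENAI_API_KEY` will need periodic refreshing.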
Hi team, I have just installed Wren with Ollama with this config:
```
LLM_PROVIDER=ollama_llm
GENERATION_MODEL=mistral-nemo:latest
EMBEDDER_PROVIDER=ollama_embedder
EMBEDDING_MODEL=mxbai-embed-large
EMBEDDING_MODEL_DIMENSION=1024
```
None of the three E-Commerce (or NBA) example questions worked for me; after more than one minute, each failed with this error:

```
2024-09-03 17:02:29 2024-09-03 20:02:29,912 - wren-ai-service - ERROR - ask pipeline - NO_RELEVANT_SQL: What are the top 3 value for orders placed by customers in each city? (ask.py:233)
```
I have managed to make it work only with a simple question like "How many orders are there?", and even that took more than a minute to reply with 181.
Should I change EMBEDDING_MODEL_DIMENSION, or something else? Thanks in advance.
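(For reference, EMBEDDING_MODEL_DIMENSION should simply match the vector size the embedder emits, and mxbai-embed-large produces 1024-dimensional vectors, so the setting above looks correct. A quick way to confirm, sketched under the assumption that Ollama is reachable on localhost:11434:)

```python
import requests

# Ask Ollama to embed a throwaway string and report the vector length;
# this should print 1024 for mxbai-embed-large.
resp = requests.post(
    "http://localhost:11434/api/embeddings",
    json={"model": "mxbai-embed-large", "prompt": "dimension check"},
    timeout=60,
)
print(len(resp.json()["embedding"]))
```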