[BUG]: agent and llama3

nicho2 commented 5 months ago

How are you running AnythingLLM?

Docker (local)

What happened?

i want use sql agent and llama3 8b instruct q8 and the llm model don't send back a good JSON.

POST /api/chat HTTP/1.1
host: 172.17.0.1:11434
connection: keep-alive
Content-Type: application/json
Accept: application/json
User-Agent: ollama-js/0.5.0 (x64 linux Node.js/v18.19.1)
accept-language: *
sec-fetch-mode: cors
accept-encoding: gzip, deflate
content-length: 10878

{"model":"llama3:8b-instruct-q8_0","messages":[{"content":"You are a program which picks the most optimal function and parameters to call.\n      DO NOT HAVE TO PICK A FUNCTION IF IT WILL NOT HELP ANSWER OR FULFILL THE USER'S QUERY.\n      When a function is selection, respond in JSON with no additional text.\n      When there is no relevant function to call - return with a regular chat text response.\n      Your task is to pick a **single** function that we will use to call, if any seem useful or relevant for the user query.\n\n      All JSON responses should have two keys.\n      'name': this is the name of the function name to call. eg: 'web-scraper', 'rag-memory', etc..\n      'arguments': this is an object with the function properties to invoke the function.\n      DO NOT INCLUDE ANY OTHER KEYS IN JSON RESPONSES.\n\n      Here are the available tools you can use an examples of a query and response so you can understand how each one works.\n      -----------\nFunction name: rag-memory\nFunction Description: Search against local documents for context that is relevant to the query or store a snippet of text into memory for retrieval later. Storing information should only be done when the user specifically requests for information to be remembered or saved to long-term memory. You should use this tool before search the internet for information. Do not use this tool unless you are explicity told to 'remember' or 'store' information.\nFunction parameters in JSON format:\n{\n    \"action\": {\n        \"type\": \"string\",\n        \"enum\": [\n            \"search\",\n            \"store\"\n        ],\n        \"description\": \"The action we want to take to search for existing similar context or storage of new context.\"\n    },\n    \"content\": {\n        \"type\": \"string\",\n        \"description\": \"The plain text to search our local documents with or to store in our vector database.\"\n    }\n}\nQuery: \"What is AnythingLLM?\"\nJSON: {\"action\":\"search\",\"content\":\"What is AnythingLLM?\"}\nQuery: \"What do you know about Plato's motives?\"\nJSON: {\"action\":\"search\",\"content\":\"What are the facts about Plato's motives?\"}\nQuery: \"Remember that you are a robot\"\nJSON: {\"action\":\"store\",\"content\":\"I am a robot, the user told me that i am.\"}\nQuery: \"Save that to memory please.\"\nJSON: {\"action\":\"store\",\"content\":\"<insert summary of conversation until now>\"}\n-----------\n-----------\nFunction name: document-summarizer\nFunction Description: Can get the list of files available to search with descriptions and can select a single file to open and summarize.\nFunction parameters in JSON format:\n{\n    \"action\": {\n        \"type\": \"string\",\n        \"enum\": [\n            \"list\",\n            \"summarize\"\n        ],\n        \"description\": \"The action to take. 'list' will return all files available with their filename and descriptions. 'summarize' will open and summarize the file by the a document name.\"\n    },\n    \"document_filename\": {\n        \"type\": \"string\",\n        \"x-nullable\": true,\n        \"description\": \"The file name of the document you want to get the full content of.\"\n    }\n}\nQuery: \"Summarize example.txt\"\nJSON: {\"action\":\"summarize\",\"document_filename\":\"example.txt\"}\nQuery: \"What files can you see?\"\nJSON: {\"action\":\"list\",\"document_filename\":null}\nQuery: \"Tell me about readme.md\"\nJSON: {\"action\":\"summarize\",\"document_filename\":\"readme.md\"}\n-----------\n-----------\nFunction name: web-scraping\nFunction Description: Scrapes the content of a webpage or online resource from a provided URL.\nFunction parameters in JSON format:\n{\n    \"url\": {\n        \"type\": \"string\",\n        \"format\": \"uri\",\n        \"description\": \"A complete web address URL including protocol. Assumes https if not provided.\"\n    }\n}\nQuery: \"What is useanything.com about?\"\nJSON: {\"url\":\"https://useanything.com\"}\nQuery: \"Scrape https://example.com\"\nJSON: {\"url\":\"https://example.com\"}\n-----------\n-----------\nFunction name: web-browsing\nFunction Description: Searches for a given query using a search engine to get better results for the user query.\nFunction parameters in JSON format:\n{\n    \"query\": {\n        \"type\": \"string\",\n        \"description\": \"A search query.\"\n    }\n}\nQuery: \"Who won the world series today?\"\nJSON: {\"query\":\"Winner of today's world series\"}\nQuery: \"What is AnythingLLM?\"\nJSON: {\"query\":\"AnythingLLM\"}\nQuery: \"Current AAPL stock price\"\nJSON: {\"query\":\"AAPL stock price today\"}\n-----------\n-----------\nFunction name: sql-list-databases\nFunction Description: List all available databases via `list_databases` you currently have access to. Returns a unique string identifier `database_id` that can be used for future calls.\nFunction parameters in JSON format:\n{}\nQuery: \"What databases can you access?\"\nJSON: {}\nQuery: \"What databases can you tell me about?\"\nJSON: {}\nQuery: \"Is there a database named erp-logs you can access?\"\nJSON: {}\n-----------\n-----------\nFunction name: sql-list-tables\nFunction Description: List all available tables in a database via its `database_id`.\nFunction parameters in JSON format:\n{\n    \"database_id\": {\n        \"type\": \"string\",\n        \"description\": \"The database identifier for which we will list all tables for. This is a required parameter\"\n    }\n}\nQuery: \"What tables are there in the `access-logs` database?\"\nJSON: {\"database_id\":\"access-logs\"}\nQuery: \"What information can you access in the customer_accts postgres db?\"\nJSON: {\"database_id\":\"customer_accts\"}\nQuery: \"Can you tell me what is in the primary-logs db?\"\nJSON: {\"database_id\":\"primary-logs\"}\n-----------\n-----------\nFunction name: sql-get-table-schema\nFunction Description: Gets the table schema in SQL for a given `table` and `database_id`\nFunction parameters in JSON format:\n{\n    \"database_id\": {\n        \"type\": \"string\",\n        \"description\": \"The database identifier for which we will connect to to query the table schema. This is a required field.\"\n    },\n    \"table_name\": {\n        \"type\": \"string\",\n        \"description\": \"The database identifier for the table name we want the schema for. This is a required field.\"\n    }\n}\nQuery: \"What does the customers table in access-logs look like?\"\nJSON: {\"database_id\":\"access-logs\",\"table_name\":\"customers\"}\nQuery: \"Get me the full name of a company in records-main, the table should be call comps\"\nJSON: {\"database_id\":\"records-main\",\"table_name\":\"comps\"}\n-----------\n-----------\nFunction name: sql-query\nFunction Description: Run a read-only SQL query on a `database_id` which will return up rows of data related to the query. The query must only be SELECT statements which do not modify the table data. There should be a reasonable LIMIT on the return quantity to prevent long-running or queries which crash the db.\nFunction parameters in JSON format:\n{\n    \"database_id\": {\n        \"type\": \"string\",\n        \"description\": \"The database identifier for which we will connect to to query the table schema. This is required to run the SQL query.\"\n    },\n    \"sql_query\": {\n        \"type\": \"string\",\n        \"description\": \"The raw SQL query to run. Should be a query which does not modify the table and will return results.\"\n    }\n}\nQuery: \"How many customers are in dvd-rentals?\"\nJSON: {\"database_id\":\"dvd-rentals\",\"sql_query\":\"SELECT * FROM customers\"}\nQuery: \"Can you tell me the total volume of sales last month?\"\nJSON: {\"database_id\":\"sales-db\",\"sql_query\":\"SELECT SUM(sale_amount) AS total_sales FROM sales WHERE sale_date >= DATEADD(month, -1, DATEFROMPARTS(YEAR(GETDATE()), MONTH(GETDATE()), 1)) AND sale_date < DATEFROMPARTS(YEAR(GETDATE()), MONTH(GETDATE()), 1)\"}\nQuery: \"Do we have anyone in the staff table for our production db named 'sam'? \"\nJSON: {\"database_id\":\"production\",\"sql_query\":\"SElECT * FROM staff WHERE first_name='sam%' OR last_name='sam%'\"}\n-----------\n-----------\nFunction name: create-chart\nFunction Description: Generates the JSON data required to generate a RechartJS chart to the user based on their prompt and available data.\nFunction parameters in JSON format:\n{\n    \"type\": {\n        \"type\": \"string\",\n        \"enum\": [\n            \"area\",\n            \"bar\",\n            \"line\",\n            \"composed\",\n            \"scatter\",\n            \"pie\",\n            \"radar\",\n            \"radialBar\",\n            \"treemap\",\n            \"funnel\"\n        ],\n        \"description\": \"The type of chart to be generated.\"\n    },\n    \"title\": {\n        \"type\": \"string\",\n        \"description\": \"Title of the chart. There MUST always be a title. Do not leave it blank.\"\n    },\n    \"dataset\": {\n        \"type\": \"string\",\n        \"description\": \"Valid JSON in which each element is an object for Recharts API for the 'type' of chart defined WITHOUT new line characters. Strictly using this FORMAT and naming:\\n{ \\\"name\\\": \\\"a\\\", \\\"value\\\": 12 }].\\nMake sure field \\\"name\\\" always stays named \\\"name\\\". Instead of naming value field value in JSON, name it based on user metric and make it the same across every item.\\nMake sure the format use double quotes and property names are string literals. Provide JSON data only.\"\n    }\n}\n-----------\n-----------\nFunction name: save-file-to-browser\nFunction Description: Save content to a file when the user explicity asks for a download of the file.\nFunction parameters in JSON format:\n{\n    \"file_content\": {\n        \"type\": \"string\",\n        \"description\": \"The content of the file that will be saved.\"\n    },\n    \"filename\": {\n        \"type\": \"string\",\n        \"description\": \"filename to save the file as with extension. Extension should be plaintext file extension.\"\n    }\n}\nQuery: \"Save me that to a file named 'output'\"\nJSON: {\"file_content\":\"<content of the file we will write previous conversation>\",\"filename\":\"output.txt\"}\nQuery: \"Save me that to my desktop\"\nJSON: {\"file_content\":\"<content of the file we will write previous conversation>\",\"filename\":\"<relevant filename>.txt\"}\nQuery: \"Save me that to a file\"\nJSON: {\"file_content\":\"<content of the file we will write from previous conversation>\",\"filename\":\"<descriptive filename>.txt\"}\n-----------\n\n\n      Now pick a function if there is an appropriate one to use given the last user message and the given conversation so far.","role":"system"},{"content":"@agent donnes moi la liste des tables de la base odbc","role":"user"}],"options":{"temperature":0},"stream":false}HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Date: Thu, 06 Jun 2024 08:53:31 GMT
Content-Length: 726

{"model":"llama3:8b-instruct-q8_0","created_at":"2024-06-06T08:53:31.170548254Z","message":{"role":"assistant","content":"I can help you with that!\n\nBased on your request, I will call the `sql-list-tables` function with the database ID set to \"odbc\".\n\nHere is the JSON payload:\n```\n{\n  \"database_id\": \"odbc\"\n}\n```\nPlease note that this function assumes that the ODBC database is accessible and configured correctly.\n\nIf the function is successful, it will return a list of tables available in the \"odbc\" database."},"done_reason":"stop","done":true,"total_duration":10229377151,"load_duration":6760490788,"prompt_eval_count":1275,"prompt_eval_duration":808422000,"eval_count":94,"eval_duration":2609574000}

the answer content only one key and not the key "name"

{  "database_id": "odbc"}

In the system prompt, there is never a complete example.

Can we have an access to modify the system prompt ?

Are there known steps to reproduce?

No response

nicho2 commented 5 months ago

docker log:

[AgentHandler] Start c52c6ce7-cb59-4c44-8caf-4e3566ce20f3::ollama:llama3:8b-instruct-q8_0
[TELEMETRY SENT] {
  event: 'agent_chat_started',
  distinctId: '98297666-e886-4011-9303-5720b1f81b92',
  properties: { runtime: 'docker' }
}
[AgentHandler] Attached websocket plugin to Agent cluster
[AgentHandler] Attached chat-history plugin to Agent cluster
[AgentHandler] Attaching user and default agent to Agent cluster.
[AgentHandler] Attached rag-memory plugin to Agent cluster
[AgentHandler] Attached document-summarizer plugin to Agent cluster
[AgentHandler] Attached web-scraping plugin to Agent cluster
[AgentHandler] Attached web-browsing plugin to Agent cluster
[AgentHandler] Attached sql-agent:sql-list-databases plugin to Agent cluster
[AgentHandler] Attached sql-agent:sql-list-tables plugin to Agent cluster
[AgentHandler] Attached sql-agent:sql-get-table-schema plugin to Agent cluster
[AgentHandler] Attached sql-agent:sql-query plugin to Agent cluster
[AgentHandler] Attached create-chart plugin to Agent cluster
[AgentHandler] Attached save-file-to-browser plugin to Agent cluster
[AgentLLM - llama3:8b-instruct-q8_0] Invalid function tool call: Missing name or arguments in function call..
[AgentLLM - llama3:8b-instruct-q8_0] Will assume chat completion without tool call inputs.
[TELEMETRY SENT] {
  event: 'agent_chat_sent',
  distinctId: '98297666-e886-4011-9303-5720b1f81b92',
  properties: { runtime: 'docker' }
}
[AgentHandler] End c52c6ce7-cb59-4c44-8caf-4e3566ce20f3::ollama:llama3:8b-instruct-q8_0

nicho2 commented 5 months ago

Also sometimes (with a differents user prompt) , i can have something as this:

{"model":"llama3:8b-instruct-q8_0","created_at":"2024-06-06T11:22:33.802299259Z","message":{"role":"assistant","content":"{\n \"name\": \"sql-list-tables\",\n \"database_id\": \"odbc\"\n}"},"done_reason":"stop","done":true,"total_duration":8570483975,"load_duration":7048436750,"prompt_eval_count":1310,"prompt_eval_duration":836772000,"eval_count":22,"eval_duration":585033000}

or this:

{"model":"llama3:8b-instruct-q8_0","created_at":"2024-06-06T09:37:00.558348692Z","message":{"role":"assistant","content":"I can help you with that!\n\nHere is the JSON response:\n\n{\n \"name\": \"sql-list-tables\",\n \"database_id\": \"odbc\"\n}\n\nThis function will list all available tables in the ODBC database. Let me know if you'd like to proceed with this query!"},"done_reason":"stop","done":true,"total_duration":12991065971,"load_duration":10319062185,"prompt_eval_count":1302,"prompt_eval_duration":837425000,"eval_count":65,"eval_duration":1785383000}

is it possible to parse the message to extract the json if the json is not alone in the message ?

timothycarambat commented 5 months ago

is it possible to parse the message to extract the json if the json is not alone in the message ?

We do this actually already because the JSON output from OSS LLMs is so unreliable. function safeJsonParse(jsonString, fallback = null) {

Output of that text from Replit using that function.

Oddly this line in the logs:

[AgentLLM - llama3:8b-instruct-q8_0] Invalid function tool call: Missing name or arguments in function call..

Is what is going awry. So it seems like even when we do the parse, no tool call is found. The reason is that the real format from the response needs to be:

{ "name": "sql-list-tables", "arguments":{ "database_id": "odbc"}}
// You are getting this, which fails to parse.
{ "name": "sql-list-tables",  "database_id": "odbc"}}

arguments is missing from output. Closing for the moment, but we can continue to discuss on this thread.

KeithHanson commented 3 months ago

Just experienced this after watching your youtube link and using this too :) Going to experiment with other models.

Mintplex-Labs / anything-llm