[BUG]: Agent system messages isn't very clear.

phazei commented 2 weeks ago

How are you running AnythingLLM?

Windows Installer (local)

What happened?

So, I've been trying to get agents to work with a local model. I've tried a few: Llama 3.2 Q8, Llama 3.2 Q6 Abliterated, Qwen2.5 32B Q4_K_M. If these models aren't going to figure it out, then any local model on a 24gb card isn't going to work.

I'm using LM Studio to run the models, and I was looking at the logs.

On a brand new chat I said @agent please list documents

This is all the logs from that session:

2024-10-06 00:25:09  [INFO]
Received POST request to /v1/chat/completions with body: {
  "model": "qwen2.5-32b-instruct-abliterated-pass2",
  "temperature": 0,
  "messages": [
    {
      "content": "You are a program which picks the most optimal function and parameters to call.\n      DO NOT HAVE TO PICK A FUNCTION IF IT WILL NOT HELP ANSWER OR FULFILL THE USER'S QUERY.\n      When a function is selection, respond in JSON with no additional text.\n      When there is no relevant function to call - return with a regular chat text response.\n      Your task is to pick a **single** function that we will use to call, if any seem useful or relevant for the user query.\n\n      All JSON responses should have two keys.\n      'name': this is the name of the function name to call. eg: 'web-scraper', 'rag-memory', etc..\n      'arguments': this is an object with the function properties to invoke the function.\n      DO NOT INCLUDE ANY OTHER KEYS IN JSON RESPONSES.\n\n      Here are the available tools you can use an examples of a query and response so you can understand how each one works.\n      -----------\nFunction name: rag-memory\nFunction Description: Search against local documents for context that is relevant to the query or store a snippet of text into memory for retrieval later. Storing information should only be done when the user specifically requests for information to be remembered or saved to long-term memory. You should use this tool before search the internet for information. Do not use this tool unless you are explicity told to 'remember' or 'store' information.\nFunction parameters in JSON format:\n{\n    \"action\": {\n        \"type\": \"string\",\n        \"enum\": [\n            \"search\",\n            \"store\"\n        ],\n        \"description\": \"The action we want to take to search for existing similar context or storage of new context.\"\n    },\n    \"content\": {\n        \"type\": \"string\",\n        \"description\": \"The plain text to search our local documents with or to store in our vector database.\"\n    }\n}\nQuery: \"What is AnythingLLM?\"\nJSON: {\"action\":\"search\",\"content\":\"What is AnythingLLM?\"}\nQuery: \"What do you know about Plato's motives?\"\nJSON: {\"action\":\"search\",\"content\":\"What are the facts about Plato's motives?\"}\nQuery: \"Remember that you are a robot\"\nJSON: {\"action\":\"store\",\"content\":\"I am a robot, the user told me that i am.\"}\nQuery: \"Save that to memory please.\"\nJSON: {\"action\":\"store\",\"content\":\"<insert summary of conversation until now>\"}\n-----------\n-----------\nFunction name: document-summarizer\nFunction Description: Can get the list of files available to search with descriptions and can select a single file to open and summarize.\nFunction parameters in JSON format:\n{\n    \"action\": {\n        \"type\": \"string\",\n        \"enum\": [\n            \"list\",\n            \"summarize\"\n        ],\n        \"description\": \"The action to take. 'list' will return all files available with their filename and descriptions. 'summarize' will open and summarize the file by the a document name.\"\n    },\n    \"document_filename\": {\n        \"type\": \"string\",\n        \"x-nullable\": true,\n        \"description\": \"The file name of the document you want to get the full content of.\"\n    }\n}\nQuery: \"Summarize example.txt\"\nJSON: {\"action\":\"summarize\",\"document_filename\":\"example.txt\"}\nQuery: \"What files can you see?\"\nJSON: {\"action\":\"list\",\"document_filename\":null}\nQuery: \"Tell me about readme.md\"\nJSON: {\"action\":\"summarize\",\"document_filename\":\"readme.md\"}\n-----------\n-----------\nFunction name: web-scraping\nFunction Description: Scrapes the content of a webpage or online resource from a provided URL.\nFunction parameters in JSON format:\n{\n    \"url\": {\n        \"type\": \"string\",\n        \"format\": \"uri\",\n        \"description\": \"A complete web address URL including protocol. Assumes https if not provided.\"\n    }\n}\nQuery: \"What is anythingllm.com about?\"\nJSON: {\"url\":\"https://anythingllm.com\"}\nQuery: \"Scrape https://example.com\"\nJSON: {\"url\":\"https://example.com\"}\n-----------\n-----------\nFunction name: web-browsing\nFunction Description: Searches for a given query using a search engine to get better results for the user query.\nFunction parameters in JSON format:\n{\n    \"query\": {\n        \"type\": \"string\",\n        \"description\": \"A search query.\"\n    }\n}\nQuery: \"Who won the world series today?\"\nJSON: {\"query\":\"Winner of today's world series\"}\nQuery: \"What is AnythingLLM?\"\nJSON: {\"query\":\"AnythingLLM\"}\nQuery: \"Current AAPL stock price\"\nJSON: {\"query\":\"AAPL stock price today\"}\n-----------\n\n\n      Now pick a function if there is an appropriate one to use given the last user message and the given conversation so far.",
      "role": "system"
    },
    {
      "content": "@agent please list documents",
      "role": "user"
    }
  ]
}
2024-10-06 00:25:09  [INFO] [LM STUDIO SERVER] Running chat completion on conversation with 2 messages.
2024-10-06 00:25:10  [INFO] [LM STUDIO SERVER] Accumulating tokens ... (stream = false)
2024-10-06 00:25:10  [INFO]
[LM STUDIO SERVER] [llama-3.2-3b-instruct] Generated prediction: {
  "id": "chatcmpl-5r5c0mt7twxl2gxh7dv7r",
  "object": "chat.completion",
  "created": 1728199509,
  "model": "llama-3.2-3b-instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "{\"action\":\"list\",\"document_filename\":null}"
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 1014,
    "completion_tokens": 10,
    "total_tokens": 1024
  },
  "system_fingerprint": "llama-3.2-3b-instruct"
}
2024-10-06 00:25:10  [INFO] [LM STUDIO SERVER] Client disconnected. Stopping generation..
2024-10-06 00:25:10  [INFO]
Received POST request to /v1/chat/completions with body: {
  "model": "qwen2.5-32b-instruct-abliterated-pass2",
  "messages": [
    {
      "content": "You are a helpful ai assistant who can assist the user and use tools available to help answer the users prompts and questions. Tools will be handled by another assistant and you will simply receive their responses to help answer the user prompt - always try to answer the user's prompt the best you can with the context available to you and your general knowledge.",
      "role": "system"
    },
    {
      "content": "@agent please list documents",
      "role": "user"
    }
  ]
}
2024-10-06 00:25:10  [INFO] [LM STUDIO SERVER] Running chat completion on conversation with 2 messages.
2024-10-06 00:25:10  [INFO] [LM STUDIO SERVER] Accumulating tokens ... (stream = false)
2024-10-06 00:25:11  [INFO]
[LM STUDIO SERVER] [llama-3.2-3b-instruct] Generated prediction: {
  "id": "chatcmpl-bp3dfvnepj6jsve66k8m",
  "object": "chat.completion",
  "created": 1728199510,
  "model": "llama-3.2-3b-instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "I don't have direct access to your files or documents, but I can guide you on how to find them.\n\nCould you please specify which type of documents you are looking for (e.g., Word documents, PDFs, Excel spreadsheets)? Also, where did you save the documents (on a computer, in cloud storage, etc.)?\n\nIf you're using Google Drive, OneDrive, or Dropbox, I can provide instructions on how to find and list your documents."
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 88,
    "completion_tokens": 95,
    "total_tokens": 183
  },
  "system_fingerprint": "llama-3.2-3b-instruct"
}
2024-10-06 00:25:11  [INFO] [LM STUDIO SERVER] Client disconnected. Stopping generation..

Looking at the instructions and the model is responding exactly as it's told, exactly as all the examples given. But all the examples given are wrong. The system message gives this example:

Query: "What files can you see?"
JSON: {"action":"list","document_filename":null}

And the model returned this:

"content": "{\"action\":\"list\",\"document_filename\":null}"

I consulted Claude about it and it said the correct response was actually:

{
  "name": "document-summarizer",
  "arguments": {
    "action": "list",
    "document_filename": null
  }
}

And when I explicitly told the model to use that and it did work. I asked Claude how it knew to do that, and it pointed out the "All JSON responses should have two keys" part, but there's not a single example that actually is provided with that, and it wasn't even clear to me. So I think the agents could work much better on smaller models if that was clear.

If the examples were:

Query: "What files can you see?"
JSON: {"name":"document-summarizer","arguments":{"action":"list","document_filename":null}}

then it would probably work.

Are there known steps to reproduce?

No response

EDIT: note: In all the transactions the logs show both qwen & llama, all these examples came from llama, but it was the same with qwen. Anything just didn't update the model names when I loaded or unloaded in LM Studio.

phazei commented 2 weeks ago

When I prompt it with this: @agent could you please list the documents? Remember to wrap the json in {name:“”, arguments:“”}

It sends the proper response:

2024-10-06 00:49:48  [INFO]
Received POST request to /v1/chat/completions with body: {
  "model": "qwen2.5-32b-instruct-abliterated-pass2",
  "temperature": 0,
  "messages": [
    {
      "content": "You are a program which picks the most optimal function and parameters to call.\n      DO NOT HAVE TO PICK A FUNCTION IF IT WILL NOT HELP ANSWER OR FULFILL THE USER'S QUERY.\n      When a function is selection, respond in JSON with no additional text.\n      When there is no relevant function to call - return with a regular chat text response.\n      Your task is to pick a **single** function that we will use to call, if any seem useful or relevant for the user query.\n\n      All JSON responses should have two keys.\n      'name': this is the name of the function name to call. eg: 'web-scraper', 'rag-memory', etc..\n      'arguments': this is an object with the function properties to invoke the function.\n      DO NOT INCLUDE ANY OTHER KEYS IN JSON RESPONSES.\n\n      Here are the available tools you can use an examples of a query and response so you can understand how each one works.\n      -----------\nFunction name: rag-memory\nFunction Description: Search against local documents for context that is relevant to the query or store a snippet of text into memory for retrieval later. Storing information should only be done when the user specifically requests for information to be remembered or saved to long-term memory. You should use this tool before search the internet for information. Do not use this tool unless you are explicity told to 'remember' or 'store' information.\nFunction parameters in JSON format:\n{\n    \"action\": {\n        \"type\": \"string\",\n        \"enum\": [\n            \"search\",\n            \"store\"\n        ],\n        \"description\": \"The action we want to take to search for existing similar context or storage of new context.\"\n    },\n    \"content\": {\n        \"type\": \"string\",\n        \"description\": \"The plain text to search our local documents with or to store in our vector database.\"\n    }\n}\nQuery: \"What is AnythingLLM?\"\nJSON: {\"action\":\"search\",\"content\":\"What is AnythingLLM?\"}\nQuery: \"What do you know about Plato's motives?\"\nJSON: {\"action\":\"search\",\"content\":\"What are the facts about Plato's motives?\"}\nQuery: \"Remember that you are a robot\"\nJSON: {\"action\":\"store\",\"content\":\"I am a robot, the user told me that i am.\"}\nQuery: \"Save that to memory please.\"\nJSON: {\"action\":\"store\",\"content\":\"<insert summary of conversation until now>\"}\n-----------\n-----------\nFunction name: document-summarizer\nFunction Description: Can get the list of files available to search with descriptions and can select a single file to open and summarize.\nFunction parameters in JSON format:\n{\n    \"action\": {\n        \"type\": \"string\",\n        \"enum\": [\n            \"list\",\n            \"summarize\"\n        ],\n        \"description\": \"The action to take. 'list' will return all files available with their filename and descriptions. 'summarize' will open and summarize the file by the a document name.\"\n    },\n    \"document_filename\": {\n        \"type\": \"string\",\n        \"x-nullable\": true,\n        \"description\": \"The file name of the document you want to get the full content of.\"\n    }\n}\nQuery: \"Summarize example.txt\"\nJSON: {\"action\":\"summarize\",\"document_filename\":\"example.txt\"}\nQuery: \"What files can you see?\"\nJSON: {\"action\":\"list\",\"document_filename\":null}\nQuery: \"Tell me about readme.md\"\nJSON: {\"action\":\"summarize\",\"document_filename\":\"readme.md\"}\n-----------\n-----------\nFunction name: web-scraping\nFunction Description: Scrapes the content of a webpage or online resource from a provided URL.\nFunction parameters in JSON format:\n{\n    \"url\": {\n        \"type\": \"string\",\n        \"format\": \"uri\",\n        \"description\": \"A complete web address URL including protocol. Assumes https if not provided.\"\n    }\n}\nQuery: \"What is anythingllm.com about?\"\nJSON: {\"url\":\"https://anythingllm.com\"}\nQuery: \"Scrape https://example.com\"\nJSON: {\"url\":\"https://example.com\"}\n-----------\n-----------\nFunction name: web-browsing\nFunction Description: Searches for a given query using a search engine to get better results for the user query.\nFunction parameters in JSON format:\n{\n    \"query\": {\n        \"type\": \"string\",\n        \"description\": \"A search query.\"\n    }\n}\nQuery: \"Who won the world series today?\"\nJSON: {\"query\":\"Winner of today's world series\"}\nQuery: \"What is AnythingLLM?\"\nJSON: {\"query\":\"AnythingLLM\"}\nQuery: \"Current AAPL stock price\"\nJSON: {\"query\":\"AAPL stock price today\"}\n-----------\n\n\n      Now pick a function if there is an appropriate one to use given the last user message and the given conversation so far.",
      "role": "system"
    },
    {
      "content": "@agent could you please list the documents?  Remember to wrap the json in {name:\"\", arguments:\"\"}",
      "role": "user"
    }
  ]
}
2024-10-06 00:49:48  [INFO] [LM STUDIO SERVER] Running chat completion on conversation with 2 messages.
2024-10-06 00:49:49  [INFO] [LM STUDIO SERVER] Accumulating tokens ... (stream = false)
2024-10-06 00:49:49  [INFO]
[LM STUDIO SERVER] [llama-3.2-3b-instruct] Generated prediction: {
  "id": "chatcmpl-8vry2t7ywbq8nsosos1gjs",
  "object": "chat.completion",
  "created": 1728200988,
  "model": "llama-3.2-3b-instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "{name: \"document-summarizer\", arguments: {\"action\": \"list\", \"document_filename\": null}}"
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 1031,
    "completion_tokens": 22,
    "total_tokens": 1053
  },
  "system_fingerprint": "llama-3.2-3b-instruct"
}
2024-10-06 00:49:49  [INFO] [LM STUDIO SERVER] Client disconnected. Stopping generation..
2024-10-06 00:49:49  [INFO]
Received POST request to /v1/chat/completions with body: {
  "model": "qwen2.5-32b-instruct-abliterated-pass2",
  "temperature": 0,
  "messages": [
    {
      "content": "You are a program which picks the most optimal function and parameters to call.\n      DO NOT HAVE TO PICK A FUNCTION IF IT WILL NOT HELP ANSWER OR FULFILL THE USER'S QUERY.\n      When a function is selection, respond in JSON with no additional text.\n      When there is no relevant function to call - return with a regular chat text response.\n      Your task is to pick a **single** function that we will use to call, if any seem useful or relevant for the user query.\n\n      All JSON responses should have two keys.\n      'name': this is the name of the function name to call. eg: 'web-scraper', 'rag-memory', etc..\n      'arguments': this is an object with the function properties to invoke the function.\n      DO NOT INCLUDE ANY OTHER KEYS IN JSON RESPONSES.\n\n      Here are the available tools you can use an examples of a query and response so you can understand how each one works.\n      -----------\nFunction name: rag-memory\nFunction Description: Search against local documents for context that is relevant to the query or store a snippet of text into memory for retrieval later. Storing information should only be done when the user specifically requests for information to be remembered or saved to long-term memory. You should use this tool before search the internet for information. Do not use this tool unless you are explicity told to 'remember' or 'store' information.\nFunction parameters in JSON format:\n{\n    \"action\": {\n        \"type\": \"string\",\n        \"enum\": [\n            \"search\",\n            \"store\"\n        ],\n        \"description\": \"The action we want to take to search for existing similar context or storage of new context.\"\n    },\n    \"content\": {\n        \"type\": \"string\",\n        \"description\": \"The plain text to search our local documents with or to store in our vector database.\"\n    }\n}\nQuery: \"What is AnythingLLM?\"\nJSON: {\"action\":\"search\",\"content\":\"What is AnythingLLM?\"}\nQuery: \"What do you know about Plato's motives?\"\nJSON: {\"action\":\"search\",\"content\":\"What are the facts about Plato's motives?\"}\nQuery: \"Remember that you are a robot\"\nJSON: {\"action\":\"store\",\"content\":\"I am a robot, the user told me that i am.\"}\nQuery: \"Save that to memory please.\"\nJSON: {\"action\":\"store\",\"content\":\"<insert summary of conversation until now>\"}\n-----------\n-----------\nFunction name: document-summarizer\nFunction Description: Can get the list of files available to search with descriptions and can select a single file to open and summarize.\nFunction parameters in JSON format:\n{\n    \"action\": {\n        \"type\": \"string\",\n        \"enum\": [\n            \"list\",\n            \"summarize\"\n        ],\n        \"description\": \"The action to take. 'list' will return all files available with their filename and descriptions. 'summarize' will open and summarize the file by the a document name.\"\n    },\n    \"document_filename\": {\n        \"type\": \"string\",\n        \"x-nullable\": true,\n        \"description\": \"The file name of the document you want to get the full content of.\"\n    }\n}\nQuery: \"Summarize example.txt\"\nJSON: {\"action\":\"summarize\",\"document_filename\":\"example.txt\"}\nQuery: \"What files can you see?\"\nJSON: {\"action\":\"list\",\"document_filename\":null}\nQuery: \"Tell me about readme.md\"\nJSON: {\"action\":\"summarize\",\"document_filename\":\"readme.md\"}\n-----------\n-----------\nFunction name: web-scraping\nFunction Description: Scrapes the content of a webpage or online resource from a provided URL.\nFunction parameters in JSON format:\n{\n    \"url\": {\n        \"type\": \"string\",\n        \"format\": \"uri\",\n        \"description\": \"A complete web address URL including protocol. Assumes https if not provided.\"\n    }\n}\nQuery: \"What is anythingllm.com about?\"\nJSON: {\"url\":\"https://anythingllm.com\"}\nQuery: \"Scrape https://example.com\"\nJSON: {\"url\":\"https://example.com\"}\n-----------\n-----------\nFunction name: web-browsing\nFunction Description: Searches for a given query using a search engine to get better results for the user query.\nFunction parameters in JSON format:\n{\n    \"query\": {\n        \"type\": \"string\",\n        \"description\": \"A search query.\"\n    }\n}\nQuery: \"Who won the world series today?\"\nJSON: {\"query\":\"Winner of today's world series\"}\nQuery: \"What is AnythingLLM?\"\nJSON: {\"query\":\"AnythingLLM\"}\nQuery: \"Current AAPL stock price\"\nJSON: {\"query\":\"AAPL stock price today\"}\n-----------\n\n\n      Now pick a function if there is an appropriate one to use given the last user message and the given conversation so far.",
      "role": "system"
    },
    {
      "content": "@agent could you please list the documents?  Remember to wrap the json in {name:\"\", arguments:\"\"}",
      "role": "user"
    }
  ]
}
2024-10-06 00:49:49  [INFO] [LM STUDIO SERVER] Running chat completion on conversation with 2 messages.
2024-10-06 00:49:49  [INFO] [LM STUDIO SERVER] Accumulating tokens ... (stream = false)
2024-10-06 00:49:50  [INFO]
[LM STUDIO SERVER] [llama-3.2-3b-instruct] Generated prediction: {
  "id": "chatcmpl-bgyi3j16whf2yng6u69q59",
  "object": "chat.completion",
  "created": 1728200989,
  "model": "llama-3.2-3b-instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "{name: \"document-summarizer\", arguments: {\"action\": \"list\", \"document_filename\": null}}"
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 1031,
    "completion_tokens": 22,
    "total_tokens": 1053
  },
  "system_fingerprint": "llama-3.2-3b-instruct"
}
2024-10-06 00:49:50  [INFO] [LM STUDIO SERVER] Client disconnected. Stopping generation..
2024-10-06 00:49:50  [INFO]
Received POST request to /v1/chat/completions with body: {
  "model": "qwen2.5-32b-instruct-abliterated-pass2",
  "messages": [
    {
      "content": "You are a helpful ai assistant who can assist the user and use tools available to help answer the users prompts and questions. Tools will be handled by another assistant and you will simply receive their responses to help answer the user prompt - always try to answer the user's prompt the best you can with the context available to you and your general knowledge.",
      "role": "system"
    },
    {
      "content": "@agent could you please list the documents?  Remember to wrap the json in {name:\"\", arguments:\"\"}\n[{\"document_id\":\"4531af57-1bd5-4ea0-b3ff-68c8abf37a41\",\"filename\":\"test.txt\",\"description\":\"Unknown\"}]",
      "role": "user"
    }
  ]
}
2024-10-06 00:49:50  [INFO] [LM STUDIO SERVER] Running chat completion on conversation with 2 messages.
2024-10-06 00:49:50  [INFO] [LM STUDIO SERVER] Accumulating tokens ... (stream = false)
2024-10-06 00:49:51  [INFO]
[LM STUDIO SERVER] [llama-3.2-3b-instruct] Generated prediction: {
  "id": "chatcmpl-x0e91qcuaivjts3g9hz7",
  "object": "chat.completion",
  "created": 1728200990,
  "model": "llama-3.2-3b-instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "I don't have direct access to any documents. However, I can help you format the list of documents in JSON format.\n\nHere is the formatted response:\n\n```\n{\n  \"name\": \"\",\n  \"arguments\": \"\"\n}\n[\n  {\n    \"document_id\": \"4531af57-1bd5-4ea0-b3ff-68c8abf37a41\",\n    \"filename\": \"test.txt\",\n    \"description\": \"Unknown\"\n  }\n]\n```\n\nPlease let me know if you need any further assistance!"
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 145,
    "completion_tokens": 110,
    "total_tokens": 255
  },
  "system_fingerprint": "llama-3.2-3b-instruct"
}
2024-10-06 00:49:51  [INFO] [LM STUDIO SERVER] Client disconnected. Stopping generation..

Although it returns the data as:

"content": "@agent could you please list the documents?  Remember to wrap the json in {name:\"\", arguments:\"\"}\n[{\"document_id\":\"4531af57-1bd5-4ea0-b3ff-68c8abf37a41\",\"filename\":\"test.txt\",\"description\":\"Unknown\"}]"

Kind of appended to my message, which doesn't explicitly tell the model that that is the results from the query, which I think could also confuse it. Perhaps it should be returned in some results object.

phazei commented 2 weeks ago

A further query after it knows the correct format it continued to properly use it: (though it looks like some of it is submitted duplicated once?)

2024-10-06 00:53:17  [INFO]
Received POST request to /v1/chat/completions with body: {
  "model": "qwen2.5-32b-instruct-abliterated-pass2",
  "temperature": 0,
  "messages": [
    {
      "content": "You are a program which picks the most optimal function and parameters to call.\n      DO NOT HAVE TO PICK A FUNCTION IF IT WILL NOT HELP ANSWER OR FULFILL THE USER'S QUERY.\n      When a function is selection, respond in JSON with no additional text.\n      When there is no relevant function to call - return with a regular chat text response.\n      Your task is to pick a **single** function that we will use to call, if any seem useful or relevant for the user query.\n\n      All JSON responses should have two keys.\n      'name': this is the name of the function name to call. eg: 'web-scraper', 'rag-memory', etc..\n      'arguments': this is an object with the function properties to invoke the function.\n      DO NOT INCLUDE ANY OTHER KEYS IN JSON RESPONSES.\n\n      Here are the available tools you can use an examples of a query and response so you can understand how each one works.\n      -----------\nFunction name: rag-memory\nFunction Description: Search against local documents for context that is relevant to the query or store a snippet of text into memory for retrieval later. Storing information should only be done when the user specifically requests for information to be remembered or saved to long-term memory. You should use this tool before search the internet for information. Do not use this tool unless you are explicity told to 'remember' or 'store' information.\nFunction parameters in JSON format:\n{\n    \"action\": {\n        \"type\": \"string\",\n        \"enum\": [\n            \"search\",\n            \"store\"\n        ],\n        \"description\": \"The action we want to take to search for existing similar context or storage of new context.\"\n    },\n    \"content\": {\n        \"type\": \"string\",\n        \"description\": \"The plain text to search our local documents with or to store in our vector database.\"\n    }\n}\nQuery: \"What is AnythingLLM?\"\nJSON: {\"action\":\"search\",\"content\":\"What is AnythingLLM?\"}\nQuery: \"What do you know about Plato's motives?\"\nJSON: {\"action\":\"search\",\"content\":\"What are the facts about Plato's motives?\"}\nQuery: \"Remember that you are a robot\"\nJSON: {\"action\":\"store\",\"content\":\"I am a robot, the user told me that i am.\"}\nQuery: \"Save that to memory please.\"\nJSON: {\"action\":\"store\",\"content\":\"<insert summary of conversation until now>\"}\n-----------\n-----------\nFunction name: document-summarizer\nFunction Description: Can get the list of files available to search with descriptions and can select a single file to open and summarize.\nFunction parameters in JSON format:\n{\n    \"action\": {\n        \"type\": \"string\",\n        \"enum\": [\n            \"list\",\n            \"summarize\"\n        ],\n        \"description\": \"The action to take. 'list' will return all files available with their filename and descriptions. 'summarize' will open and summarize the file by the a document name.\"\n    },\n    \"document_filename\": {\n        \"type\": \"string\",\n        \"x-nullable\": true,\n        \"description\": \"The file name of the document you want to get the full content of.\"\n    }\n}\nQuery: \"Summarize example.txt\"\nJSON: {\"action\":\"summarize\",\"document_filename\":\"example.txt\"}\nQuery: \"What files can you see?\"\nJSON: {\"action\":\"list\",\"document_filename\":null}\nQuery: \"Tell me about readme.md\"\nJSON: {\"action\":\"summarize\",\"document_filename\":\"readme.md\"}\n-----------\n-----------\nFunction name: web-scraping\nFunction Description: Scrapes the content of a webpage or online resource from a provided URL.\nFunction parameters in JSON format:\n{\n    \"url\": {\n        \"type\": \"string\",\n        \"format\": \"uri\",\n        \"description\": \"A complete web address URL including protocol. Assumes https if not provided.\"\n    }\n}\nQuery: \"What is anythingllm.com about?\"\nJSON: {\"url\":\"https://anythingllm.com\"}\nQuery: \"Scrape https://example.com\"\nJSON: {\"url\":\"https://example.com\"}\n-----------\n-----------\nFunction name: web-browsing\nFunction Description: Searches for a given query using a search engine to get better results for the user query.\nFunction parameters in JSON format:\n{\n    \"query\": {\n        \"type\": \"string\",\n        \"description\": \"A search query.\"\n    }\n}\nQuery: \"Who won the world series today?\"\nJSON: {\"query\":\"Winner of today's world series\"}\nQuery: \"What is AnythingLLM?\"\nJSON: {\"query\":\"AnythingLLM\"}\nQuery: \"Current AAPL stock price\"\nJSON: {\"query\":\"AAPL stock price today\"}\n-----------\n\n\n      Now pick a function if there is an appropriate one to use given the last user message and the given conversation so far.",
      "role": "system"
    },
    {
      "content": "@agent could you please list the documents?  Remember to wrap the json in {name:\"\", arguments:\"\"}",
      "role": "user"
    },
    {
      "content": "I don't have direct access to any documents. However, I can help you format the list of documents in JSON format.\n\nHere is the formatted response:\n\n```\n{\n  \"name\": \"\",\n  \"arguments\": \"\"\n}\n[\n  {\n    \"document_id\": \"4531af57-1bd5-4ea0-b3ff-68c8abf37a41\",\n    \"filename\": \"test.txt\",\n    \"description\": \"Unknown\"\n  }\n]\n```\n\nPlease let me know if you need any further assistance!",
      "role": "assistant"
    },
    {
      "content": "what is in that file?",
      "role": "user"
    }
  ]
}
2024-10-06 00:53:17  [INFO] [LM STUDIO SERVER] Running chat completion on conversation with 4 messages.
2024-10-06 00:53:18  [INFO] [LM STUDIO SERVER] Accumulating tokens ... (stream = false)
2024-10-06 00:53:19  [INFO]
[LM STUDIO SERVER] [llama-3.2-3b-instruct] Generated prediction: {
  "id": "chatcmpl-q4p1kxpjue58ylpgwx12",
  "object": "chat.completion",
  "created": 1728201197,
  "model": "llama-3.2-3b-instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Since I don't have direct access to the document, I can try to summarize it for you. However, I don't have enough context about the content of the file.\n\nBut I can suggest using the `document-summarizer` function to get a summary of the file. Here's an example:\n\n```\n{\n  \"name\": \"document-summarizer\",\n  \"arguments\": {\n    \"action\": \"summarize\",\n    \"document_filename\": \"test.txt\"\n  }\n}\n```"
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 1157,
    "completion_tokens": 101,
    "total_tokens": 1258
  },
  "system_fingerprint": "llama-3.2-3b-instruct"
}
2024-10-06 00:53:19  [INFO] [LM STUDIO SERVER] Client disconnected. Stopping generation..
2024-10-06 00:53:20  [INFO]
Received POST request to /v1/chat/completions with body: {
  "model": "qwen2.5-32b-instruct-abliterated-pass2",
  "temperature": 0,
  "messages": [
    {
      "content": "You are a program which picks the most optimal function and parameters to call.\n      DO NOT HAVE TO PICK A FUNCTION IF IT WILL NOT HELP ANSWER OR FULFILL THE USER'S QUERY.\n      When a function is selection, respond in JSON with no additional text.\n      When there is no relevant function to call - return with a regular chat text response.\n      Your task is to pick a **single** function that we will use to call, if any seem useful or relevant for the user query.\n\n      All JSON responses should have two keys.\n      'name': this is the name of the function name to call. eg: 'web-scraper', 'rag-memory', etc..\n      'arguments': this is an object with the function properties to invoke the function.\n      DO NOT INCLUDE ANY OTHER KEYS IN JSON RESPONSES.\n\n      Here are the available tools you can use an examples of a query and response so you can understand how each one works.\n      -----------\nFunction name: rag-memory\nFunction Description: Search against local documents for context that is relevant to the query or store a snippet of text into memory for retrieval later. Storing information should only be done when the user specifically requests for information to be remembered or saved to long-term memory. You should use this tool before search the internet for information. Do not use this tool unless you are explicity told to 'remember' or 'store' information.\nFunction parameters in JSON format:\n{\n    \"action\": {\n        \"type\": \"string\",\n        \"enum\": [\n            \"search\",\n            \"store\"\n        ],\n        \"description\": \"The action we want to take to search for existing similar context or storage of new context.\"\n    },\n    \"content\": {\n        \"type\": \"string\",\n        \"description\": \"The plain text to search our local documents with or to store in our vector database.\"\n    }\n}\nQuery: \"What is AnythingLLM?\"\nJSON: {\"action\":\"search\",\"content\":\"What is AnythingLLM?\"}\nQuery: \"What do you know about Plato's motives?\"\nJSON: {\"action\":\"search\",\"content\":\"What are the facts about Plato's motives?\"}\nQuery: \"Remember that you are a robot\"\nJSON: {\"action\":\"store\",\"content\":\"I am a robot, the user told me that i am.\"}\nQuery: \"Save that to memory please.\"\nJSON: {\"action\":\"store\",\"content\":\"<insert summary of conversation until now>\"}\n-----------\n-----------\nFunction name: document-summarizer\nFunction Description: Can get the list of files available to search with descriptions and can select a single file to open and summarize.\nFunction parameters in JSON format:\n{\n    \"action\": {\n        \"type\": \"string\",\n        \"enum\": [\n            \"list\",\n            \"summarize\"\n        ],\n        \"description\": \"The action to take. 'list' will return all files available with their filename and descriptions. 'summarize' will open and summarize the file by the a document name.\"\n    },\n    \"document_filename\": {\n        \"type\": \"string\",\n        \"x-nullable\": true,\n        \"description\": \"The file name of the document you want to get the full content of.\"\n    }\n}\nQuery: \"Summarize example.txt\"\nJSON: {\"action\":\"summarize\",\"document_filename\":\"example.txt\"}\nQuery: \"What files can you see?\"\nJSON: {\"action\":\"list\",\"document_filename\":null}\nQuery: \"Tell me about readme.md\"\nJSON: {\"action\":\"summarize\",\"document_filename\":\"readme.md\"}\n-----------\n-----------\nFunction name: web-scraping\nFunction Description: Scrapes the content of a webpage or online resource from a provided URL.\nFunction parameters in JSON format:\n{\n    \"url\": {\n        \"type\": \"string\",\n        \"format\": \"uri\",\n        \"description\": \"A complete web address URL including protocol. Assumes https if not provided.\"\n    }\n}\nQuery: \"What is anythingllm.com about?\"\nJSON: {\"url\":\"https://anythingllm.com\"}\nQuery: \"Scrape https://example.com\"\nJSON: {\"url\":\"https://example.com\"}\n-----------\n-----------\nFunction name: web-browsing\nFunction Description: Searches for a given query using a search engine to get better results for the user query.\nFunction parameters in JSON format:\n{\n    \"query\": {\n        \"type\": \"string\",\n        \"description\": \"A search query.\"\n    }\n}\nQuery: \"Who won the world series today?\"\nJSON: {\"query\":\"Winner of today's world series\"}\nQuery: \"What is AnythingLLM?\"\nJSON: {\"query\":\"AnythingLLM\"}\nQuery: \"Current AAPL stock price\"\nJSON: {\"query\":\"AAPL stock price today\"}\n-----------\n\n\n      Now pick a function if there is an appropriate one to use given the last user message and the given conversation so far.",
      "role": "system"
    },
    {
      "content": "@agent could you please list the documents?  Remember to wrap the json in {name:\"\", arguments:\"\"}",
      "role": "user"
    },
    {
      "content": "I don't have direct access to any documents. However, I can help you format the list of documents in JSON format.\n\nHere is the formatted response:\n\n```\n{\n  \"name\": \"\",\n  \"arguments\": \"\"\n}\n[\n  {\n    \"document_id\": \"4531af57-1bd5-4ea0-b3ff-68c8abf37a41\",\n    \"filename\": \"test.txt\",\n    \"description\": \"Unknown\"\n  }\n]\n```\n\nPlease let me know if you need any further assistance!",
      "role": "assistant"
    },
    {
      "content": "what is in that file?",
      "role": "user"
    }
  ]
}
2024-10-06 00:53:20  [INFO] [LM STUDIO SERVER] Running chat completion on conversation with 4 messages.
2024-10-06 00:53:20  [INFO] [LM STUDIO SERVER] Accumulating tokens ... (stream = false)
2024-10-06 00:53:22  [INFO]
[LM STUDIO SERVER] [llama-3.2-3b-instruct] Generated prediction: {
  "id": "chatcmpl-ex3nhlvvnvrnxzokqf5hun",
  "object": "chat.completion",
  "created": 1728201200,
  "model": "llama-3.2-3b-instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Since I don't have direct access to the document, I can try to summarize it for you. However, I don't have enough context about the specific file \"test.txt\". Can you please provide more information or clarify which file you are referring to?\n\nIf you'd like, I can also use the `document-summarizer` function to get a summary of the document. Here's an example response:\n\n```\n{\n  \"name\": \"document-summarizer\",\n  \"arguments\": {\n    \"action\": \"list\",\n    \"document_filename\": null\n  }\n}\n```\n\nPlease let me know if you'd like me to proceed with summarizing a specific file!"
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 1157,
    "completion_tokens": 137,
    "total_tokens": 1294
  },
  "system_fingerprint": "llama-3.2-3b-instruct"
}
2024-10-06 00:53:22  [INFO] [LM STUDIO SERVER] Client disconnected. Stopping generation..
2024-10-06 00:53:22  [INFO]
Received POST request to /v1/chat/completions with body: {
  "model": "qwen2.5-32b-instruct-abliterated-pass2",
  "temperature": 0,
  "messages": [
    {
      "content": "You are a program which picks the most optimal function and parameters to call.\n      DO NOT HAVE TO PICK A FUNCTION IF IT WILL NOT HELP ANSWER OR FULFILL THE USER'S QUERY.\n      When a function is selection, respond in JSON with no additional text.\n      When there is no relevant function to call - return with a regular chat text response.\n      Your task is to pick a **single** function that we will use to call, if any seem useful or relevant for the user query.\n\n      All JSON responses should have two keys.\n      'name': this is the name of the function name to call. eg: 'web-scraper', 'rag-memory', etc..\n      'arguments': this is an object with the function properties to invoke the function.\n      DO NOT INCLUDE ANY OTHER KEYS IN JSON RESPONSES.\n\n      Here are the available tools you can use an examples of a query and response so you can understand how each one works.\n      -----------\nFunction name: rag-memory\nFunction Description: Search against local documents for context that is relevant to the query or store a snippet of text into memory for retrieval later. Storing information should only be done when the user specifically requests for information to be remembered or saved to long-term memory. You should use this tool before search the internet for information. Do not use this tool unless you are explicity told to 'remember' or 'store' information.\nFunction parameters in JSON format:\n{\n    \"action\": {\n        \"type\": \"string\",\n        \"enum\": [\n            \"search\",\n            \"store\"\n        ],\n        \"description\": \"The action we want to take to search for existing similar context or storage of new context.\"\n    },\n    \"content\": {\n        \"type\": \"string\",\n        \"description\": \"The plain text to search our local documents with or to store in our vector database.\"\n    }\n}\nQuery: \"What is AnythingLLM?\"\nJSON: {\"action\":\"search\",\"content\":\"What is AnythingLLM?\"}\nQuery: \"What do you know about Plato's motives?\"\nJSON: {\"action\":\"search\",\"content\":\"What are the facts about Plato's motives?\"}\nQuery: \"Remember that you are a robot\"\nJSON: {\"action\":\"store\",\"content\":\"I am a robot, the user told me that i am.\"}\nQuery: \"Save that to memory please.\"\nJSON: {\"action\":\"store\",\"content\":\"<insert summary of conversation until now>\"}\n-----------\n-----------\nFunction name: document-summarizer\nFunction Description: Can get the list of files available to search with descriptions and can select a single file to open and summarize.\nFunction parameters in JSON format:\n{\n    \"action\": {\n        \"type\": \"string\",\n        \"enum\": [\n            \"list\",\n            \"summarize\"\n        ],\n        \"description\": \"The action to take. 'list' will return all files available with their filename and descriptions. 'summarize' will open and summarize the file by the a document name.\"\n    },\n    \"document_filename\": {\n        \"type\": \"string\",\n        \"x-nullable\": true,\n        \"description\": \"The file name of the document you want to get the full content of.\"\n    }\n}\nQuery: \"Summarize example.txt\"\nJSON: {\"action\":\"summarize\",\"document_filename\":\"example.txt\"}\nQuery: \"What files can you see?\"\nJSON: {\"action\":\"list\",\"document_filename\":null}\nQuery: \"Tell me about readme.md\"\nJSON: {\"action\":\"summarize\",\"document_filename\":\"readme.md\"}\n-----------\n-----------\nFunction name: web-scraping\nFunction Description: Scrapes the content of a webpage or online resource from a provided URL.\nFunction parameters in JSON format:\n{\n    \"url\": {\n        \"type\": \"string\",\n        \"format\": \"uri\",\n        \"description\": \"A complete web address URL including protocol. Assumes https if not provided.\"\n    }\n}\nQuery: \"What is anythingllm.com about?\"\nJSON: {\"url\":\"https://anythingllm.com\"}\nQuery: \"Scrape https://example.com\"\nJSON: {\"url\":\"https://example.com\"}\n-----------\n-----------\nFunction name: web-browsing\nFunction Description: Searches for a given query using a search engine to get better results for the user query.\nFunction parameters in JSON format:\n{\n    \"query\": {\n        \"type\": \"string\",\n        \"description\": \"A search query.\"\n    }\n}\nQuery: \"Who won the world series today?\"\nJSON: {\"query\":\"Winner of today's world series\"}\nQuery: \"What is AnythingLLM?\"\nJSON: {\"query\":\"AnythingLLM\"}\nQuery: \"Current AAPL stock price\"\nJSON: {\"query\":\"AAPL stock price today\"}\n-----------\n\n\n      Now pick a function if there is an appropriate one to use given the last user message and the given conversation so far.",
      "role": "system"
    },
    {
      "content": "@agent could you please list the documents?  Remember to wrap the json in {name:\"\", arguments:\"\"}",
      "role": "user"
    },
    {
      "content": "I don't have direct access to any documents. However, I can help you format the list of documents in JSON format.\n\nHere is the formatted response:\n\n```\n{\n  \"name\": \"\",\n  \"arguments\": \"\"\n}\n[\n  {\n    \"document_id\": \"4531af57-1bd5-4ea0-b3ff-68c8abf37a41\",\n    \"filename\": \"test.txt\",\n    \"description\": \"Unknown\"\n  }\n]\n```\n\nPlease let me know if you need any further assistance!",
      "role": "assistant"
    },
    {
      "content": "what is in that file?",
      "role": "user"
    }
  ]
}
2024-10-06 00:53:22  [INFO] [LM STUDIO SERVER] Running chat completion on conversation with 4 messages.
2024-10-06 00:53:22  [INFO] [LM STUDIO SERVER] Accumulating tokens ... (stream = false)
2024-10-06 00:53:25  [INFO]
[LM STUDIO SERVER] [llama-3.2-3b-instruct] Generated prediction: {
  "id": "chatcmpl-99m4yeophl614t10x9fo5p",
  "object": "chat.completion",
  "created": 1728201202,
  "model": "llama-3.2-3b-instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Since I don't have direct access to the document, I can try to summarize it for you. However, I don't have enough context about the specific file \"test.txt\". Can you please provide more information or clarify which file you are referring to?\n\nIf you'd like, I can also use the `document-summarizer` function to get a summary of the document. Here's an example response:\n\n```\n{\n  \"name\": \"document-summarizer\",\n  \"arguments\": {\n    \"action\": \"list\",\n    \"document_filename\": null\n  }\n}\n```\n\nPlease let me know if you'd like me to proceed with summarizing a specific file!"
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 1157,
    "completion_tokens": 137,
    "total_tokens": 1294
  },
  "system_fingerprint": "llama-3.2-3b-instruct"
}
2024-10-06 00:53:25  [INFO] [LM STUDIO SERVER] Client disconnected. Stopping generation..
2024-10-06 00:53:25  [INFO]
Received POST request to /v1/chat/completions with body: {
  "model": "qwen2.5-32b-instruct-abliterated-pass2",
  "messages": [
    {
      "content": "You are a helpful ai assistant who can assist the user and use tools available to help answer the users prompts and questions. Tools will be handled by another assistant and you will simply receive their responses to help answer the user prompt - always try to answer the user's prompt the best you can with the context available to you and your general knowledge.",
      "role": "system"
    },
    {
      "content": "@agent could you please list the documents?  Remember to wrap the json in {name:\"\", arguments:\"\"}",
      "role": "user"
    },
    {
      "content": "I don't have direct access to any documents. However, I can help you format the list of documents in JSON format.\n\nHere is the formatted response:\n\n```\n{\n  \"name\": \"\",\n  \"arguments\": \"\"\n}\n[\n  {\n    \"document_id\": \"4531af57-1bd5-4ea0-b3ff-68c8abf37a41\",\n    \"filename\": \"test.txt\",\n    \"description\": \"Unknown\"\n  }\n]\n```\n\nPlease let me know if you need any further assistance!",
      "role": "assistant"
    },
    {
      "content": "what is in that file?\nMary had a little lamb, and then she ate it with milk\n[{\"document_id\":\"4531af57-1bd5-4ea0-b3ff-68c8abf37a41\",\"filename\":\"test.txt\",\"description\":\"Unknown\"}]",
      "role": "user"
    }
  ]
}
2024-10-06 00:53:25  [INFO] [LM STUDIO SERVER] Running chat completion on conversation with 4 messages.
2024-10-06 00:53:25  [INFO] [LM STUDIO SERVER] Accumulating tokens ... (stream = false)
2024-10-06 00:53:26  [INFO]
[LM STUDIO SERVER] [llama-3.2-3b-instruct] Generated prediction: {
  "id": "chatcmpl-5owt78kv5g71ssaxnhrs4f",
  "object": "chat.completion",
  "created": 1728201205,
  "model": "llama-3.2-3b-instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "It appears to be a text file containing a poem. The first line reads: \"Mary had a little lamb, and then she ate it with milk\". \n\nThere is also an embedded list of metadata about the document, which includes:\n\n- A unique identifier for the document (`document_id`)\n- The filename (`filename`) which in this case is \"test.txt\"\n- A description of the document (`description`), which is currently set to \"Unknown\".\n\nIt's unclear what the purpose or meaning of this poem is, but it seems to be a simple and playful example of text."
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 285,
    "completion_tokens": 117,
    "total_tokens": 402
  },
  "system_fingerprint": "llama-3.2-3b-instruct"
}
2024-10-06 00:53:26  [INFO] [LM STUDIO SERVER] Client disconnected. Stopping generation..

The content of the file is correctly Mary had a little lamb, and then she ate it with milk

timothycarambat commented 2 weeks ago

This is a common thing that especially plagues smaller param models. Here you saw better results from more explicitly steering of the prompt and output. That being said we do most of what you already outlined in UnTooled which its entire purpose is to wrap and make every model tool-callable, even those who natively cannot do that. Its not foolproof and often comes down to how the model generates the JSON object, which varies from model to model and its size/quant.

Its not a bug in the traditional sense of something wrong with the code. The model is less capable and needs to be guided more or less depending on how good of a model it is. For example on LLama 3.1 8B Q8 - no issues with such a call.

https://docs.anythingllm.com/agent-not-using-tools

Mintplex-Labs / anything-llm