sigoden / aichat

All-in-one AI CLI tool featuring Chat-REPL, Shell Assistant, RAG, AI tools & agents, with access to OpenAI, Claude, Gemini, Ollama, Groq, and more.
Apache License 2.0
3.96k stars 265 forks

Web Search #679

Closed sigoden closed 2 months ago

sigoden commented 3 months ago

Problem 1

Many agents require web search capability.

The implementation of web_search should be flexible. For example, users who prefer DuckDuckGo should be able to use search_duckduckgo as their web search tool, while users who favor Tavily could opt for search_tavily.

Currently, AIChat doesn't offer this capability, let alone flexibility.

Problem 2

Currently, tools are tied to specific roles (via the functions_filter field), and these roles are mutually exclusive.

We could create a dedicated web search role:

- name: web_search
  functions_filter: search_tavily

However, this web search role couldn't be combined with other roles or used within existing sessions.

Solution

Add the following configuration to the config.yaml file:

mapping_tools:
  web_search: 'search_tavily'
  code_interpreter: 'execute_py_code'

Currently, only these two mappers are supported. This restriction may be relaxed in the future, allowing users to add any mapper they need.
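A minimal sketch of how such a mapping could be resolved, assuming the mapping_tools section has been loaded into a plain dictionary. The function name resolve_tool is hypothetical and not part of AIChat; names without a mapper pass through unchanged so that concrete tools can still be referenced directly.

```python
# Hypothetical in-memory form of the mapping_tools section of config.yaml.
MAPPING_TOOLS = {
    "web_search": "search_tavily",
    "code_interpreter": "execute_py_code",
}


def resolve_tool(name: str, mapping: dict[str, str] = MAPPING_TOOLS) -> str:
    """Return the concrete tool behind a mapper name, or the name itself."""
    return mapping.get(name, name)


if __name__ == "__main__":
    print(resolve_tool("web_search"))  # mapper name, resolved via the table
    print(resolve_tool("save_file"))   # concrete tool name, returned as-is
```

A user who prefers DuckDuckGo would only change the value for web_search to search_duckduckgo; nothing that refers to web_search needs to change.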

The agent definition file (index.yaml) supports the following configuration:

common_tools:
  - web_search
  - code_interpreter

The only possible values for an agent's common_tools are web_search and code_interpreter.
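As an illustration only, an agent's common_tools list could be checked against the allowed values when index.yaml is loaded. The names ALLOWED_COMMON_TOOLS and validate_common_tools are assumptions made for this sketch, not AIChat identifiers.

```python
# Allowed values for common_tools, as stated in this proposal.
ALLOWED_COMMON_TOOLS = ("web_search", "code_interpreter")


def validate_common_tools(common_tools: list[str]) -> list[str]:
    """Return the list unchanged, or raise if it contains an unsupported value."""
    unknown = [name for name in common_tools if name not in ALLOWED_COMMON_TOOLS]
    if unknown:
        raise ValueError("unsupported common_tools: " + ", ".join(unknown))
    return list(common_tools)
```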

Then, make roles and sessions support a new configuration item: use_tools.

use_tools: web_search
use_tools: web_search,code_interpreter
use_tools: web_search,save_file
use_tools: all

You can change the value of use_tools at any time using .set.

.set use_tools web_search
.set use_tools web_search,save_file
.set use_tools all
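A rough sketch of how a use_tools value might be expanded into concrete tool names. It assumes that all means every declared tool, that mapper names are resolved through mapping_tools, and that unknown names are rejected; none of this is confirmed behavior, and parse_use_tools is a hypothetical name.

```python
def parse_use_tools(value: str, available: list[str], mapping: dict[str, str]) -> list[str]:
    """Expand a use_tools string into a list of concrete tool names.

    `available` lists every declared tool; `mapping` is the mapping_tools table.
    Both are hypothetical inputs for this sketch.
    """
    value = value.strip()
    if value == "all":
        return list(available)
    resolved: list[str] = []
    for raw in value.split(","):
        name = raw.strip()
        if not name:
            continue  # tolerate stray commas
        tool = mapping.get(name, name)  # mapper name -> concrete tool
        if tool not in available:
            raise ValueError(f"unknown tool: {name}")
        if tool not in resolved:
            resolved.append(tool)  # keep order, drop duplicates
    return resolved


if __name__ == "__main__":
    tools = ["search_tavily", "execute_py_code", "save_file"]
    mapping = {"web_search": "search_tavily", "code_interpreter": "execute_py_code"}
    print(parse_use_tools("web_search,save_file", tools, mapping))
```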

Additional Context

Some LLMs natively support web search, such as cohere:command-r*, ernie:*, and perplexity:*-online.

Should there be a way to use this feature?

Final

All solutions and suggestions are welcome.

sudomain commented 3 months ago

Thoughts on using Playwright for the search? It's what I see some other projects using.

sigoden commented 3 months ago

@sudomain That is outside the scope of this issue; you can create a search tool that uses Playwright.

einarpersson commented 3 months ago

I am very excited about the progress of this project, but I would argue that it is cleaner to keep web search as a separate tool, as it is today (it IS a tool), and instead reflect on these questions:

  1. How can agents and roles share tools easily?
  2. How can an ongoing session be 'supercharged' by loading a tool capability temporarily?
  3. Should there be an easy way to install tools from a git repo? Something like aichat --install-agent "<agent-repo-url>", which clones it into the corresponding folder.
  4. Should there be a recommended list of tools and agents, for example web_search? Should there even be a prompt at first configuration startup (choose to install tools and agents from a predefined list, and choose to enable them for the default chat)?

Web search feels very useful, so I understand the temptation to add it as a builtin feature. But I can imagine other tools that could be argued to be just as qualified to be builtins. Then you may end up with a confusing situation where some capabilities are builtins and others are not. And what if I want another kind of web_search that works a bit differently? If it were a tool, it would be easy to just fork the official tool and modify it.

What do you think?

Once again, I am very excited about this project and the pace you are keeping, but I want to keep the "core" from becoming too bloated. I would like to encourage the development of tools, including 'custom web search'. I would argue that there is more potential in improving the management of the agent/roles ecosystem (installing, updating, hacking, removing, sharing, combining).

sigoden commented 3 months ago

I have updated the description. Questions 1, 2, and 4 should now be answered.

Regarding questions 3 and 4, this is not possible: LLM functions require configuration, a build process, and user attention, and cannot be set up with a single command.

After analyzing numerous AI agent/workflow platforms, I found that only three tools are essential: web_search, code_interpreter, and draw (not used in the CLI).

einarpersson commented 3 months ago

Great! I'm not sure if my comment helped you or if you were already thinking along these lines:)

I think there is still some naming confusion with the current proposal. Now there are tools and mapping_tools? The word mapping_tools is not clear to me. What is mapping_tool vs common tool? Do we really need to expand the vocabulary?

In your example above, is save_file a custom (?) tool? Shouldn't the user be able to add this to an agent definition as well? I understand that it may complicate things, since the function declaration and implementation live elsewhere, but it feels... wrong... not to be able to reuse tools between agents. Is there no way to solve it? Can't it just be a lookup (see if there is a function definition / implementation in the global tools folder, and error during agent startup if the tools are not available)?
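The lookup idea could be sketched roughly as follows. This is only an illustration: it assumes a global tools folder with one subfolder per tool containing an index.json declaration, and the names check_agent_tools and MissingToolError are made up for the example.

```python
from pathlib import Path


class MissingToolError(Exception):
    """Raised at agent startup when a required tool has no declaration."""


def check_agent_tools(tools_dir: Path, required: list[str]) -> None:
    """Fail fast if any tool named by the agent is absent from the tools folder."""
    missing = [
        name for name in required
        if not (tools_dir / name / "index.json").is_file()
    ]
    if missing:
        raise MissingToolError("tools not available: " + ", ".join(missing))
```

Calling this once while loading an agent would surface a missing tool before the first prompt rather than in the middle of a conversation.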

I am also starting to feel that the boundary between role and agent is blurred. Couldn't a role just be an agent without any custom js/py/sh, i.e. with only an index.yaml? Do we need roles if we were able to easily define a "simple agent"? What if the notion of a role did not exist, only agents, and these could range from very simple (corresponding to a role today) to more complex (an agent with custom functions, RAG, etc.)? Then it would be easier to start from a simple agent (role) and add capabilities to it down the road.

config/aichat
config/aichat/config.yaml
config/aichat/agents/myagent1/index.yaml # including prompt etc, filling the purpose of a role today
config/aichat/agents/myagent2/index.yaml
config/aichat/agents/myagent2/functions.json
config/aichat/agents/myagent2/tools.js
config/aichat/agents/myagent2/rag.bin # gitignore if repo
config/aichat/agents/myagent2/sessions/foo.yaml # gitignore if repo
config/aichat/tools/calculator/index.json
config/aichat/tools/calculator/index.js
# or something...
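To make the "simple agent" idea concrete, here is a small sketch that classifies an agent folder from the layout above. It assumes that an agent with only an index.yaml plays the part of today's role and that the presence of functions.json marks a full agent; agent_kind is a hypothetical helper, not an AIChat function.

```python
from pathlib import Path


def agent_kind(agent_dir: Path) -> str:
    """Return "role" for an index.yaml-only agent, "agent" if it declares functions."""
    if not (agent_dir / "index.yaml").is_file():
        raise FileNotFoundError(f"{agent_dir} has no index.yaml")
    has_functions = (agent_dir / "functions.json").is_file()
    return "agent" if has_functions else "role"
```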

((Regarding 3, I understand that it might have been asking for too much, and that it raises many follow-up questions. I still think there could be a sweet spot somewhere, but I'll leave that :) ))

Note: I really hope you only view my comment as positive feedback. I understand that you have thought more about these things than I have already, but still there can be some value in early user feedback and naive proposals:)

sigoden commented 3 months ago

Tool Fields:

The approach for agents to reuse tools has not yet been determined and will be discussed in another issue.

The relationship between roles and agents is outside the scope of this issue.