sigoden / aichat

All-in-one AI CLI tool featuring Chat-REPL, Shell Assistant, RAG, AI tools & agents, with access to OpenAI, Claude, Gemini, Ollama, Groq, and more.
Apache License 2.0
3.73k stars · 248 forks

[Feature Request] Function calling #507

Closed gilcu3 closed 3 months ago

gilcu3 commented 3 months ago

Do you plan to support the function calling feature? Would you be open to accepting PRs in that regard?

Currently, functions such as DuckDuckGo search or Wolfram Alpha could greatly extend the model's capabilities.

sigoden commented 3 months ago

Function calling is for developers, not end users.

Currently, the only promising way of extending model functionality is GPTs. Technically, it is possible to implement a command-line version of GPTs through function calling, but at this stage there are neither users nor suppliers, so there is no chance.

gilcu3 commented 3 months ago

I am not sure what you mean by "only for developers". I can certainly use both tools I mentioned as a user, because the model directly decides which function to call during a standard conversation.

cboecking commented 3 months ago

It seems to me that the best use of function calling is to make the call via the CLI and pipe the results into aichat. One use case that I hope evolves is the ability to use aichat as a remote control over an ollama-type system with enhanced tool capabilities.

Is this in line with your comment @gilcu3 ?

gilcu3 commented 3 months ago

@cboecking That is one possibility, although the usage I was suggesting is much simpler. While interacting with the user, the bot might realize it needs to search for something online or solve a math formula; in such cases it would automatically use function calling to reply. In the background, aichat would call (via the CLI or a library) the corresponding functions specified by the AI.
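The loop described above can be sketched in a few lines. This is a minimal, self-contained simulation, not aichat's actual implementation: `ask_model` stands in for a real chat-completion call, and the `search` tool and message shapes are assumptions for illustration.

```python
import json

def ask_model(messages):
    # Stub for a chat-completion call: request a tool on the first turn,
    # answer once a tool result is present in the conversation.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_call": {"name": "search",
                              "arguments": {"query": "example query"}}}
    return {"content": "Answer based on the search results."}

def run_tool(name, arguments):
    # In practice this would shell out to a user-configured command.
    if name == "search":
        return json.dumps({"results": ["stubbed search result"]})
    raise ValueError(f"unknown tool: {name}")

def chat(user_input):
    """Drive the model, executing tool calls until it produces a final reply."""
    messages = [{"role": "user", "content": user_input}]
    while True:
        reply = ask_model(messages)
        if "tool_call" in reply:
            call = reply["tool_call"]
            output = run_tool(call["name"], call["arguments"])
            messages.append({"role": "tool", "name": call["name"],
                             "content": output})
        else:
            return reply["content"]
```

The key point is that the user only types a question; whether and when a function runs is decided by the model inside the loop.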

tkanarsky commented 3 months ago

Tools are something you pass to the base model in a structured form, e.g. for OpenAI: https://platform.openai.com/docs/api-reference/chat/create#chat-create-tools, and the API handles whether to return a chat message or request a tool invocation. It would make the most sense to allow the user to define tools and their interfaces in config.yaml, to specify whether the current model supports tool usage (which most OpenAI-compatible API wrappers support, e.g. Huggingface TGI: https://huggingface.co/docs/text-generation-inference/en/basic_tutorials/using_guidance#chat-completion-with-tools), and, if the completion requests a tool call, to invoke the tool after prompting the user.

example config.yaml, in pseudocode:

tools:
  list_directory:
    shell-command: "ls %input"
    interface: {"input": "Path of directory to list"}
  convert:
    shell-command: "convert %source %destination"
    interface: {"source": "Source image name", "destination": "Target image name"}
models:
  "openai:gpt-4o":
    features: [vision, tool-usage]

in a chat:

temp) Convert image in current dir to jpg

*Run tool `list_directory .`?* (y/N): y
Output of `list_directory .`: foo.tiff doc.txt

Thanks for providing the file names. foo.tiff appears to be the image you're interested in.

*Run tool `convert foo.tiff foo.jpg`?* (y/N): y
Output of `convert foo.tiff foo.jpg`: 

I have obtained the filename of the image in the current directory and converted it to jpg as per your request.

temp)
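For concreteness, the `interface` entries from the config sketch above could be translated into the `tools` array of an OpenAI-style chat completion request roughly like this. The translation function and the all-strings parameter typing are assumptions of this sketch, not an existing aichat API:

```python
def to_openai_tool(name, interface):
    """Build one entry for the `tools` array of a chat completion request,
    treating every declared argument as a required string parameter."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "parameters": {
                "type": "object",
                "properties": {
                    arg: {"type": "string", "description": desc}
                    for arg, desc in interface.items()
                },
                "required": list(interface),
            },
        },
    }

# Derived from the two tools in the pseudocode config above.
tools = [
    to_openai_tool("list_directory", {"input": "Path of directory to list"}),
    to_openai_tool("convert", {"source": "Source image name",
                               "destination": "Target image name"}),
]
```

The client would send `tools` with each request and, when the response contains a tool call, prompt the user before running the matching `shell-command`.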

and re: users vs developers, in many senses users of aichat are sort of developers too!

sigoden commented 3 months ago

#514 has implemented this feature, you can try it. Feedback and suggestions are welcome.

gilcu3 commented 3 months ago

> #514 has implemented this feature, you can try it. Feedback and suggestions are welcome.

The feature is working perfectly, thank you! Will you accept PRs to add more example tools to llm-functions? I tested it specifically with Wolfram Alpha, factoring a big number...

sigoden commented 3 months ago

I welcome you to submit a PR @gilcu3

sigoden commented 3 months ago

There are two function types:

  1. Dispatch Functions: Execute a script.
  2. Retrieve Functions: Generate JSON data for further LLM processing.

Now the question is how to distinguish them?

What are your suggestions?
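One possible way to handle the two types (purely an illustration of the distinction above, not how aichat actually resolves it) is to tag each tool declaration with its kind and branch on it after execution:

```python
import json
import subprocess

# Hypothetical tool table; the names, commands, and `kind` tag are assumptions.
TOOLS = {
    # Dispatch: run a script for its side effect; nothing returns to the model.
    "open_editor": {"kind": "dispatch", "command": ["true"]},
    # Retrieve: run a script whose stdout is JSON fed back for LLM processing.
    "get_weather": {"kind": "retrieve", "command": ["echo", '{"temp_c": 21}']},
}

def call_tool(name):
    spec = TOOLS[name]
    result = subprocess.run(spec["command"], capture_output=True, text=True)
    if spec["kind"] == "dispatch":
        # Execution itself was the goal; end the tool turn here.
        return None
    # Retrieve: parse the JSON and hand it back to the model.
    return json.loads(result.stdout)
```

A dispatch call ends the turn, while a retrieve call appends its JSON output to the conversation so the model can keep reasoning over it.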

gilcu3 commented 3 months ago

I would go for either of the first two options, leaning towards the second. For the moment I cannot imagine any other type of function. For retrieve functions, it would then make sense to keep their inputs and outputs as part of the session conversation.

sigoden commented 3 months ago

AIChat now fully supports the function calling feature.