karthink / gptel

A simple LLM client for Emacs

Add support for function calling handling #76

Open colonelpanic8 opened 1 year ago

colonelpanic8 commented 1 year ago

See https://platform.openai.com/docs/guides/gpt/function-calling

It would be cool to add support for this somehow. Some inversion-of-control style mechanism, where you could set up the system prompt to be able to call Emacs functions or something like that, might be interesting.

karthink commented 1 year ago

Maybe I'm lacking imagination -- could you provide an example of the kinds of things you could do in Emacs with this? (Keeping in mind that this requires sending OpenAI the full type signatures of all the available functions.)

colonelpanic8 commented 1 year ago

The main use case I have in mind is sort of cleaner manipulation of existing buffer content.

Right now, it can occasionally be kind of difficult to get ChatGPT's response to be something that is just a code completion, even if you prompt it with something like:

"You are a large language model and a careful programmer. Provide code that completes what is provided and only code as output without any additional text, prompt or note."

it still does not always give only the completion.

Furthermore, given the proper API, it could do pretty interesting things on larger segments of code without having to send back the entire text.

It might be possible to do something like provide it with editing commands that allow it to make edits to specific parts of a buffer or a selected region.

Some thinking and testing is needed around what exactly the API should look like here.

This is something that I think will become increasingly important as the context window is made larger and it can operate over larger and larger segments of code bases.

As the context window increases, we could even consider sending over entire projects and then having the edit api specify edits in specific files.

karthink commented 1 year ago

IIUC, you're describing the case of supplying ChatGPT the descriptions and type signatures of elisp functions, and then asking it to combine those (along with its general knowledge of elisp) to make edits to the buffer. I'm still unable to think of an actual use case in this context, since it already understands the elisp API primitives quite well. The fact that buffer editing elisp functions have to work by side-effect doesn't help either. Can you think of a more specific example like the weather report function in the documentation?

colonelpanic8 commented 1 year ago

IIUC, you're describing the case of supplying ChatGPT the descriptions and type signatures of elisp functions, and then asking it to combine those (along with its general knowledge of elisp) to make edits to the buffer.

Nope.

Did you read the text I wrote above? I don't think there would be any expectation that it directly call arbitrary elisp functions. You would probably give it a relatively limited API that would simply allow it to specify edits to existing code.

As an example, you might give ChatGPT functions like:

delete(start_line: int, start_character: int, end_line: int, end_character: int)
insert(start_line: int, start_character: int, text: string)

You would then handle these requests from ChatGPT internally with elisp code.
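To make that concrete, here is a rough elisp sketch of how such requests could be dispatched to buffer edits on the client side. The names are illustrative, not proposed gptel API, and lines/columns are assumed zero-indexed as in the signatures above:

(defun my/llm-pos (line char)
  "Return the buffer position for zero-indexed LINE and CHAR."
  (save-excursion
    (goto-char (point-min))
    (forward-line line)
    (move-to-column char)
    (point)))

(defun my/llm-apply-edit (name args)
  "Apply one edit request NAME with plist ARGS to the current buffer."
  (pcase name
    ("delete"
     (delete-region
      (my/llm-pos (plist-get args :start_line)
                  (plist-get args :start_character))
      (my/llm-pos (plist-get args :end_line)
                  (plist-get args :end_character))))
    ("insert"
     (save-excursion
       (goto-char (my/llm-pos (plist-get args :start_line)
                              (plist-get args :start_character)))
       (insert (plist-get args :text))))))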

The advantages would be that:

a) You could differentiate between cases where ChatGPT wants to simply respond to the user, perhaps to ask for more information or give some exposition, and cases where it wants to make edits to the code that was provided.

b) You could make responses that involve refactorings, or just changing parts of an existing function/class etc., much more efficient and ergonomic (no more going back and forth and selectively copying certain parts).

In the current status quo, you can SOMETIMES get ChatGPT to complete things if you prompt it just right, but often it will give extra exposition that you have to delete or otherwise modify. Also, working through gptel, you're never going to get it to make inline edits to change parts of a function.

With my suggestion, it seems quite plausible to me that you could have it perform complicated transformations of an input text.

colonelpanic8 commented 1 year ago

and then asking it to combine those (along with its general knowledge of elisp) to make edits to the buffer. I'm still unable to think of an actual use case in this context

You can't think of an actual use case?

As an example, here's a problem that GPT-4 can easily one-shot:

https://chat.openai.com/share/e3a05c2e-7dd4-4e97-bc3d-fd6decd49285

In an ideal world, gptel could handle making the necessary edits, and I think the functions API is a good way to do this.

If nothing else, a v1 of this could literally only allow complete replacements of the existing text.

It would still be useful for it to be able to specify: "here is the start and end of the existing text that this new text should replace", and the functions API seems like a natural way to exchange that information.
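As a sketch of that v1 (hypothetical names, handled client-side in elisp, where the model echoes back literal start and end snippets from the buffer):

(defun my/llm-replace-span (start-text end-text new-text)
  "Replace the buffer text from START-TEXT through END-TEXT with NEW-TEXT.
START-TEXT and END-TEXT are literal snippets the model echoes back
from the original buffer content."
  (save-excursion
    (goto-char (point-min))
    (when (search-forward start-text nil t)
      (let ((beg (match-beginning 0)))
        (when (search-forward end-text nil t)
          (delete-region beg (point))
          (goto-char beg)
          (insert new-text))))))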

PalaceChan commented 1 year ago

Hmmm, that API basically exposes another argument (like "model name" or "temperature") in which one can optionally convey an array of function declarations (with optional docstrings). When I saw this post, the thought I had was: maybe one wants to refactor a very large function or piece of code. Currently that requires getting the full thing rewritten and then doing an ediff (possible from the refactor transient), whereas the actual diff needed might be tiny, and perhaps this way GPT can convey it better via some function calls, and perhaps one could execute it straight from the reply to effect the change. But with @IvanMalison's latest reply, it seems one doesn't have to actually give it real elisp functions, but instead a small set of simple "made up" text-manipulation primitives, which one can handle in the callback with actual elisp? (Safer and potentially more promising.)

(Although TBF, in the above case I would, in a gptel buffer, just ask it to reply with only a diff of what to change, which I'd then try to apply using the diff-mode "apply" bindings, which might be very easy if one uses org-edit-special on GPT's diff-mode reply.)

Since gptel supports custom callbacks via the lower-level function gptel-request, that API can already be experimented with? The only (minor) missing piece would be a way to set that parameter in the curl call... to tinker with the idea, one can just defadvice-override the curl args function to additionally hardcode in a "functions" argument?
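For example, something like this might work as a quick hack (untested sketch; the internal name gptel--request-data is a guess and may differ across gptel versions):

(defvar my/gptel-functions
  [(:name "get_current_weather"
    :description "Get the current weather in a given location"
    :parameters (:type "object"
                 :properties (:location (:type "string"))
                 :required ["location"]))]
  "Hardcoded function declarations to send with every request.")

;; Assumes gptel builds the JSON payload as a plist in
;; gptel--request-data; adjust the advised symbol to your version.
(advice-add 'gptel--request-data :filter-return
            (lambda (data)
              (plist-put data :functions my/gptel-functions)))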

Then it'd be interesting to see some real use cases demoed via gptel-request where this API lets one do something much better than currently possible.

colonelpanic8 commented 1 year ago

@PalaceChan do you have API access? I haven't gotten it yet and I've been on the wait-list for months. I would totally start hacking on this if I did.

PalaceChan commented 1 year ago

@IvanMalison I am also on the wait-list, but that only applies to GPT-4, not to gpt-3.5-turbo-0613.

Run this command to see which models you have:

curl --location --silent --compressed --disable -D- -H"Authorization: Bearer $OPENAI_KEY" https://api.openai.com/v1/models

In my case I don't see any of the gpt-4 models :( but I do see gpt-3.5-turbo-0613.

Then try this command, for example:

curl --location --silent --compressed --disable -D- -d'{"model":"gpt-3.5-turbo-0613","messages":[{"role":"user","content":"what is the weather like in Boston?"}],"stream":true,"temperature":1.0}' -H"Content-Type: application/json" -H"Authorization: Bearer $OPENAI_KEY" https://api.openai.com/v1/chat/completions

(The model replied saying it cannot do that because it doesn't have weather-check access.)

Now try running this:

curl --location --silent --compressed --disable -D- -d'{"model":"gpt-3.5-turbo-0613","messages":[{"role":"user","content":"what is the weather like in Boston?"}],"functions": [{"name": "get_current_weather", "description": "Get the current weather in a given location", "parameters": { "type": "object", "properties": { "location": { "type": "string", "description": "The city and state, e.g. San Francisco, CA" }, "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]} }, "required": ["location"]}}],"stream":true,"temperature":1.0}' -H"Content-Type: application/json" -H"Authorization: Bearer $OPENAI_KEY" https://api.openai.com/v1/chat/completions

This time the response I got used the function API (trimming the fat out from my *Async Shell Command* buffer):

,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{"role":"assistant","content":null,"function_call":{"name":"get_current_weather","arguments":""}},"finish_reason":null}]}
,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":"{\n"}},"finish_reason":null}]}
,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":" "}},"finish_reason":null}]}
,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":" \""}},"finish_reason":null}]}
,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":"location"}},"finish_reason":null}]}
,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":"\":"}},"finish_reason":null}]}
,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":" \""}},"finish_reason":null}]}
,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":"Boston"}},"finish_reason":null}]}
,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":","}},"finish_reason":null}]}
,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":" MA"}},"finish_reason":null}]}
,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":"\"\n"}},"finish_reason":null}]}
,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{"function_call":{"arguments":"}"}},"finish_reason":null}]}
,"model":"gpt-3.5-turbo-0613","choices":[{"index":0,"delta":{},"finish_reason":"function_call"}]}

(A very, very minor side note unrelated to the functions API, but @karthink, I remember reading that -X POST in curl isn't recommended, since it is implied by sending data and is more brittle, as it could break redirect handling. The "Content-Type: application/json" header is also redundant if one uses the --json flag instead of -d; the flag has been around since curl 7.82.0, released March 2022.)

Here is a simple script I was using to explore the suggestion, following their example guide:

import json
import os

import requests

GPT_MODEL = "gpt-3.5-turbo-0613"
API_KEY = os.environ["OPENAI_KEY"]  # same env var as the curl examples above

functions = [
    {
        "name": "delete_text",
        "description": "Deletes text contained between start_line line and start_col column and ending on end_line line and end_col column. Both the line number range and the column range are inclusive and zero indexed",
        "parameters": {
            "type": "object",
            "properties": {
                "start_line": {
                    "type": "integer",
                    "description": "the line number where the region to delete starts (zero indexed).",
                },
                "start_col": {
                    "type": "integer",
                    "description": "the column number where the region to delete starts (zero indexed).",
                },
                "end_line": {
                    "type": "integer",
                    "description": "the line number where the region to delete ends (zero indexed).",
                },
                "end_col": {
                    "type": "integer",
                    "description": "the column number where the region to delete ends (zero indexed).",
                },
            },
            "required": ["start_line", "start_col", "end_line", "end_col"],
        },
    },
    {
        "name": "insert_text",
        "description": "Inserts text at line start_line and column start_col. Both parameters are zero indexed",
        "parameters": {
            "type": "object",
            "properties": {
                "start_line": {
                    "type": "integer",
                    "description": "the line number where the text should be inserted.",
                },
                "start_col": {
                    "type": "integer",
                    "description": "the column number where the text should be inserted.",
                },
                "text": {
                    "type": "string",
                    "description": "the literal text to be insert",
                },
            },
            "required": ["start_line", "start_col", "text"],
        },
    },
]

def send(msgs, *, functions=None, function_call=None, model=GPT_MODEL):
    headers = {"Content-Type": "application/json", "Authorization": f"Bearer {API_KEY}"}
    data = {"model": model, "messages": msgs}
    if functions is not None:
        data["functions"] = functions
    if function_call is not None:
        data["function_call"] = function_call
    try:
        response = requests.post("https://api.openai.com/v1/chat/completions", headers=headers, json=data)
        return response
    except Exception as e:
        print(f"Exception: {e}")

def print_msgs(msgs):
    for msg in msgs:
        if msg["role"] == "system":
            print("SYSTEM:\n", json.dumps(msg["content"], indent=2),"\n")
        elif msg["role"] == "user":
            print("USER:\n", json.dumps(msg["content"], indent=2),"\n")
        elif msg["role"] == "assistant" and msg.get("function_call"):
            print("ASSISTANT:\n", json.dumps(msg["function_call"], indent=2),"\n")
        elif msg["role"] == "assistant" and not msg.get("function_call"):
            print("ASSISTANT:\n", json.dumps(msg["content"], indent=2),"\n")
        elif msg["role"] == "function":
            print(f"FUNCTION msg['name']:\n", json.dumps(msg["content"], indent=2),"\n")

defun = r"""
(defun fun (args)
  "Prints `args` to the mini-buffer"
  (message "%s" args))
"""

msgs = []
msgs.append({"role": "system", "content": "You are a programmer's assistant, living in Emacs. Respond concisely."})
msgs.append({"role": "user", "content": f"Consider this function: {defun}. Refactor it so that it is interactive, sets its argument from the region (if it is active) or the current word at point otherwise. Remember to update its documentation as well."})
response = send(msgs, functions=functions)
if response is not None:
    msgs.append(response.json()["choices"][0]["message"])
print_msgs(msgs)

msgs.append({"role": "user", "content": f"Why did you only call delete_text?"})
response = send(msgs, functions=functions)
if response is not None:
    msgs.append(response.json()["choices"][0]["message"])

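(Per OpenAI's guide, after executing a requested call the client appends the result as a message with role "function", e.g. {"role": "function", "name": "delete_text", "content": "ok"}; the script above instead sends the follow-up as a plain user message.)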
The first time through I did not pass any functions parameter and I got this sensible interaction:

SYSTEM:
 "You are a programmer's assistant, living in Emacs. Respond concisely." 

USER:
 "Consider this function: \n(defun fun (args)\n  \"Prints `args` to the mini-buffer\"\n  (message \"%s\" args))\n. Refactor it so that it is interactive, sets its argument from the region (if it is active) or the current word at point otherwise. Remember to update its documentation as well." 

ASSISTANT:
 "Here's the refactored code:\n\n```emacs-lisp\n(defun fun (arg)\n  \"Prints `arg` to the mini-buffer\"\n  (interactive (list (if (region-active-p)\n                         (buffer-substring-no-properties (region-beginning) (region-end))\n                       (current-word))))\n  (message \"%s\" arg))\n```\n\nThe `interactive` declaration sets `arg` interactively based on whether a region is active or it uses the current word at point. The documentation has also been updated to reflect this change."

Then I passed it the functions parameter, but as you can tell from the script, I got kind of rubbish:

SYSTEM:
 "You are a programmer's assistant, living in Emacs. Respond concisely." 

USER:
 "Consider this function: \n(defun fun (args)\n  \"Prints `args` to the mini-buffer\"\n  (message \"%s\" args))\n. Refactor it so that it is interactive, sets its argument from the region (if it is active) or the current word at point otherwise. Remember to update its documentation as well." 

ASSISTANT:
 {
  "name": "delete_text",
  "arguments": "{\n\"start_line\": 0,\n\"start_col\": 0,\n\"end_line\": 5,\n\"end_col\": 0\n}"
} 

USER:
 "Why did you only call delete_text?" 

ASSISTANT:
 {
  "name": "insert_text",
  "arguments": "{\n\"start_line\": 0,\n\"start_col\": 0,\n\"text\": \";; \"\n}"
} 

Probably have to play around with this some more...

colonelpanic8 commented 1 year ago

@PalaceChan

ah very cool, nice work.

Perhaps we should start with an even simpler API that doesn't have two functions.

That said, the fact that this is a 3.5 variant doesn't give me tons of hope.

PalaceChan commented 1 year ago

@IvanMalison even though I'm personally still on the wait-list, I was able to get help running this with gpt-4-32k-0613. For the original example it still replied with the delete_text call only.

Then I changed the example to a buggy Python function:

def sort_and_filter(strings, comparator, keep, discard):
    kept_strings = []

    # Keep elements
    for s in strings:
        if re.search(keep, s):
            kept_strings.append(s)

    # Discard elements
    filtered_strings = []
    for i in range(100):
        if not re.search(discard, kept_strings[i]):
            filtered_strings.append(kept_strings[i])

    # Sort elements
    sorted_strings = sorted(filtered_strings, key=comparator)

    return sorted_strings

Running that example with the two-function API and gpt-4-32k-0613 and asking it to fix the bug got it to call delete_text with the right range to wipe out the middle segment, but it did nothing else, so boo 👎

I then changed the functions array to a single function, the same as delete_text except called replace_text, taking a fifth argument: the text to replace the region with.

This time it replied by calling replace_text with arguments start_line 13, start_col 11, end_line 15, end_col 47, and new_text "for s in kept_strings:\n if not re.search(discard, s):\nfiltered_strings.append(s)".

The region seems wrong, as it clips the for loop part in the middle, but it's slightly cooler to see it trying...

Nevertheless, this is still much less "useful" than when I simply asked the same thing in the chat buffer and asked it to format its reply in the form of a diff hunk; there one gets a lovely org src block in diff-mode containing only a correct bugfix hunk (a single-line diff that changes range(100) to range(len(kept_strings))).

isaacphi commented 5 months ago

I put together a proof of concept for this: https://github.com/karthink/gptel/pull/209