paul-gauthier / aider

aider is AI pair programming in your terminal
https://aider.chat/
Apache License 2.0

[Enhancement] Please integrate newest models #20

Closed lightningRalf closed 1 year ago

lightningRalf commented 1 year ago

Hello Paul,

Could you integrate the gpt-3.5 16k model and the new versions of gpt-4 and gpt-3.5-turbo? https://openai.com/blog/function-calling-and-other-api-updates

It will be interesting to test out GPT-4 with 8k context vs GPT-3.5 with 16k context.

And maybe you could also make use of the new functions API? Greetings

paul-gauthier commented 1 year ago

You should be able to try it with --model gpt-3.5-turbo-16k. I am about to do that myself.

And yes, I'll be investigating the functions too.

paul-gauthier commented 1 year ago

Sorry, you'll need to pull the latest build from GitHub for that to work. For now it just treats it like 3.5 with a bigger context window. So repo maps are disabled and it does not try to use a diff-based output format. I'll be experimenting to see if either of those restrictions can be relaxed with this new model.

tmm1 commented 1 year ago

I'm playing with the new 3.5 16k model. I asked the LLM to create a new file, but aider keeps complaining that "Malformed ORIGINAL/UPDATE blocks, retrying..."

EDIT: I see the underlying error is "is not one of". I guess the 3.5 update code path doesn't handle new files correctly.

paul-gauthier commented 1 year ago

Sorry to hear you're having troubles.

Are you able to paste some example output showing the problem? Often it's easiest to open up .aider.chat.history.md and pull the plain-text lines out of there.

paul-gauthier commented 1 year ago

Ah I see your reference to "is not one of" and yes, this looks like it is related to trying to create a new file. Aider doesn't currently support that for the 3.5 models.

A workaround is to just make an empty file and /add it to the chat first.
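For reference, the workaround looks roughly like this (the file path is just a placeholder, and the `>` line is an in-chat command):

```
$ touch src/newfile.py
$ aider --model gpt-3.5-turbo-16k
> /add src/newfile.py
```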

I will try and make this more automatic, similar to how aider handles file creation in gpt-4 chats.

tmm1 commented 1 year ago

Thanks for the workaround.

Could you also add support for this?

`Unsupported model: gpt-4-32k-0613`

Ideally these new model releases could be used directly without requiring stubs in the Python code.

tmm1 commented 1 year ago

I'm continuing to have "is not one of" issues. I asked for a modification for a specific file but the code block wasn't tagged in the parsable format.

Here's the updated content of `online/cmd/crawl.go`:

```go
package main
```

I wonder if it makes sense to integrate the new functions API instead, and offer the LLM functions to create and modify files, with JSON arguments that explicitly include a filename property.
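Something like the following sketch of that idea, using the openai 0.x Python client that was current at the time (the `write_file` function name and schema here are hypothetical, not anything aider actually ships):

```python
import json
import openai  # openai-python 0.x; reads OPENAI_API_KEY from the environment

# Hypothetical function schema: the model returns the filename as an explicit
# JSON argument instead of embedding it in a markdown code block.
write_file = {
    "name": "write_file",
    "description": "Create or overwrite a file with the given content",
    "parameters": {
        "type": "object",
        "properties": {
            "filename": {"type": "string", "description": "Path of the file to write"},
            "content": {"type": "string", "description": "Complete new file content"},
        },
        "required": ["filename", "content"],
    },
}

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-16k",
    messages=[{"role": "user", "content": "Create a new Go file with an empty main package"}],
    functions=[write_file],
    function_call={"name": "write_file"},  # force a structured reply via the function
)

args = json.loads(response.choices[0].message.function_call.arguments)
print(args["filename"], len(args["content"]))
```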

paul-gauthier commented 1 year ago

Ya, support for gpt-3.5 models is experimental. It looks like you might have access to gpt-4? If so, it is much more reliable about following directions for including code/edits in its responses.

I will certainly be investigating how to use the new functions capabilities in aider! Both for 3.5 and 4.

Also, I updated aider to be more flexible about openai's models. It should now support any model that your key has access to, including gpt-4-32k-0613.

tomrobinsond commented 1 year ago

It would be good to refactor the code to use LangChain so other models from different providers can more easily be integrated and experimented with in the future.

paul-gauthier commented 1 year ago

> I'm continuing to have "is not one of" issues. I asked for a modification for a specific file but the code block wasn't tagged in the parsable format.

Hi @tmm1, I just released version 0.7.0, which has more robust support for GPT-3.5. Aider is now much more permissive about recognizing which files to edit when 3.5 returns malformed instructions (as it often does). I did some benchmarking, and this release of aider performs better at code editing with 3.5 than the prior release did. Let me know if you have a chance to try it out and whether you see improvements.

> I wonder if it makes sense to integrate the new functions API instead, and offer the LLM functions to create and modify files, with JSON arguments that explicitly include a filename property.

I also roughed in a coding backend for 3.5 which uses the new functions API. Surprisingly, it benchmarks significantly worse than the current editing format that relies on markdown backtick-fenced code blocks. I'm going to continue to experiment here, and will hopefully share some quantitative learnings at some point.

paul-gauthier commented 1 year ago

Thanks for the feedback @tomrobinsond. I agree it would be nice for aider to support a larger variety of LLMs.

Other users have used the --openai-api-base argument to run aider against other models that can provide an OpenAI-compatible API. This looks like a relevant tool for serving many local models via a compatible API, though I haven't tried it myself yet:

https://github.com/go-skynet/LocalAI
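Invocation would be roughly along these lines (the URL, port, and model name are placeholders for whatever the compatible server actually exposes):

```
$ aider --openai-api-base http://localhost:8080/v1 --model gpt-3.5-turbo
```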

One thing to keep in mind is that it generally requires some model-specific tuning to get prompts and editing formats working well. For example, GPT-3.5 and GPT-4 use very different prompts and editing formats in aider right now. So I imagine adopting new LLMs will require a similar effort to tailor the prompting and edit formats.

The latest version of aider includes a major refactor that makes it easy to develop, benchmark and manage a collection of many "coding backends". This should make it easier to tailor aider to additional LLMs in the future.

evolu8 commented 1 year ago

Hi @paul-gauthier, did you look into LocalAI yet? It would be such a great direction to go. We're seeing open-source models such as WizardCoder rocket in performance. Being able to run locally would let data-sensitive work benefit, with much less data-breach risk to contend with (e.g. in medical domains).

mallefitzo commented 1 year ago

I have access to GPT-4 32k but it still limits my tokens to 8k. Can I select the model somewhere?

paul-gauthier commented 1 year ago

@mallefitzo thanks for reporting this problem. And congrats for having access to gpt-4-32k!

What output is telling you that aider is using gpt-4-32k but only treating it as an 8k context window?

Aider parses the context window size right out of the model name. So it should be assuming 32k.
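Roughly speaking, the idea is something like this (a simplified sketch, not aider's actual code):

```python
import re

def guess_context_window(model_name: str) -> int:
    """Guess the context window from a '-Nk' suffix in the model name."""
    match = re.search(r"-(\d+)k", model_name)
    if match:
        return int(match.group(1)) * 1024
    # assumed defaults: 8k for bare gpt-4, 4k otherwise
    return 8 * 1024 if model_name.startswith("gpt-4") else 4 * 1024

print(guess_context_window("gpt-4-32k-0613"))     # 32768
print(guess_context_window("gpt-3.5-turbo-16k"))  # 16384
print(guess_context_window("gpt-4"))              # 8192
```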

mallefitzo commented 1 year ago

> @mallefitzo thanks for reporting this problem. And congrats for having access to gpt-4-32k!
>
> What output is telling you that aider is using gpt-4-32k but only treating it as an 8k context window?
>
> Aider parses the context window size right out of the model name. So it should be assuming 32k.

@paul-gauthier this is the error I'm getting:

[screenshot: 2023-07-04 at 05:48:13]

But as you can see, I have access to 32k:

[screenshot: 2023-07-04 at 05:49:21]

paul-gauthier commented 1 year ago

Thanks for the screenshot @mallefitzo, super helpful.

Are you running aider with --model gpt-4-32k? What model does it report in the first few status lines?

```
$ aider --model gpt-4-32k
API key does not support gpt-4-32k, falling back to gpt-3.5-turbo-16k
Model: gpt-3.5-turbo-16k
Git repo: ../.git
Repo-map: disabled
Use /help to see in-chat commands.
```

Obviously my attempt to run with gpt-4-32k fails. But what status lines do you get?

mallefitzo commented 1 year ago

@paul-gauthier ahhh! It shows me this now (I assumed it would pick the highest model automatically).

[screenshot: 2023-07-04 at 06:36:06]

So I suppose this will enable the 32k tokens?

Thanks for your help!

paul-gauthier commented 1 year ago

Glad to hear you have it working now @mallefitzo

Aider defaults to plain old gpt-4. If that isn't available it falls back to gpt-3.5-turbo-16k.

I didn't want to automatically select gpt-4-32k because of the cost. I figured it would be safer to let folks choose it explicitly.

paul-gauthier commented 1 year ago

Also @mallefitzo you can use the /tokens command inside the chat to verify that aider thinks you have a 32k context window.

mallefitzo commented 1 year ago

Amazing, thank you!

[screenshot: 2023-07-04 at 07:05:28]

paul-gauthier commented 1 year ago

I'm going to close this issue since aider seems to be properly supporting all the recently released OpenAI models. Please feel free to reopen or start a new issue if you are having problems.

@tmm1 you were asking about supporting the new functions API. I did some benchmarking, and using functions actually seems to make code editing less reliable with both GPT-3.5 and GPT-4. I wrote up some notes here:

https://aider.chat/docs/benchmarks.html