karthink / gptel

A simple LLM client for Emacs
GNU General Public License v3.0

Check out my fork maybe #184

Closed joaotavora closed 3 months ago

joaotavora commented 5 months ago

Hi. I wanted to try out gptel after watching your excellent video presentation, but I don't have an OpenAI account, and local LLMs are slow (in my extremely newbie opinion).

So I signed up for a TogetherAI free trial and made it work with gptel rather easily. There's a commit in my fork adding support for this, https://github.com/karthink/gptel/commit/4feedb7fb6258f937b89a4346a683314f49e29f2.

And a couple more things you may find useful as I go through your Elisp. You can cherry-pick what you want, I'll try to make the commits self-contained and descriptive in their commit messages.

joaotavora commented 5 months ago

So I signed up for a TogetherAI free trial and made it work with gptel rather easily. There's a commit in my fork adding support for

Make that https://github.com/karthink/gptel/commit/8e2d4bfb78f0adcae44f9b45225cb167bb1e065b instead. That commit had a bug, so I force-pushed. Better to just visit the fork, as I might be force-pushing frequently.

karthink commented 5 months ago

Oh wow, thank you @joaotavora, I'll keep an eye on your fork! I appreciate getting feedback from an elisp expert. gptel grew rather organically so the code quality isn't great.

joaotavora commented 5 months ago

I appreciate getting feedback from an elisp expert.

I'm very flattered, but I make my share of blunders every day.

gptel grew rather organically so the code quality isn't great.

It grew very, very well. I had already been impressed by your popon.el library some years ago. But this one is just fantastic: it has the perfect philosophy of minimalism and integration with Emacs's principles (or at least the principles that I like in Emacs :-) )

I'm going over your video presentation to try to replicate those things.

The code quality isn't bad at all. Sure some things could probably be simpler, but it depends if they stand in the way of your plans for the future.

Maybe I would use defclass instead of defstruct. Then again, maybe not, and it's fine. Keep implementation details hidden with -- (as you do) and you give yourself room to reimplement without headaches. I've fixed that cl-find-method thing, I think.

joaotavora commented 5 months ago

(Where can I get your "dictionary" and "CLI commands" directives?)

karthink commented 5 months ago

(Where can I get your "dictionary" and "CLI commands" directives?)

I added these to my Emacs configuration -- the directives included with the package are intentionally generic and unopinionated.

Here they are:

(setq gptel-directives
        '((default . "To assist:  Be terse.  Do not offer unprompted advice or clarifications. Speak in specific,
 topic relevant terminology. Do NOT hedge or qualify. Do not waffle. Speak
 directly and be willing to make creative guesses. Explain your reasoning. If you
 don’t know, say you don’t know.

 Remain neutral on all topics. Be willing to reference less reputable sources for
 ideas.

 Never apologize.  Ask questions when unsure.")
          (programmer . "You are a careful programmer.  Provide code and only code as output without any additional text, prompt or note.")
          (cliwhiz . "You are a command line helper.  Generate command line commands that do what is requested, without any additional description or explanation.  Generate ONLY the command, I will edit it myself before running.")
          (emacser . "You are an Emacs maven.  Reply only with the most appropriate built-in Emacs command for the task I specify.  Do NOT generate any additional description or explanation.")
          (explain . "Explain what this code does to a novice programmer.")))

joaotavora commented 5 months ago

Fantastic! Thanks so much!

Another question (if you have time). When I want to pair-program with a model, I should probably first send the whole file for context right? But then when I want the model to, say, rewrite a specific function I'm having trouble with, will it know about the context I told it about earlier?

By the way, I'm pushing more commits into the fork. Not sure they're all great (there's this "model sanitization" thing, which I don't know is the correct approach).

karthink commented 5 months ago

But this one is just fantastic, it has the perfect philosophy of minimalism and integration with Emacs's principles (or at least the principles that I like in Emacs :-) )

I think all long-time Emacs users tend to converge to these preferences/principles :)

The code quality isn't bad at all. Sure some things could probably be simpler, but it depends if they stand in the way of your plans for the future.

Thank you. There is no grand plan for gptel beyond providing a low-friction, high-flexibility Emacs interface to LLMs. (More generally, I'm hoping we will have small/efficient free local LLMs in the near future -- I don't like sending data to OpenAI or the SaaS model!)

Fantastic! Thanks so much!

There are some large, detailed prompts collected by Greg Grubbs here that you might also be interested in. I haven't had a chance to try them out yet.

I should probably first send the whole file for context right? But then when I want the model to, say, rewrite a specific function I'm having trouble with, will it know about the context I told it about earlier?

Providing context is currently an issue with all LLM interaction. The (mostly) stateless API means every request should contain all the context. This means resending the file(s) with each request! IIUC this is not just a limitation of the REST API, but a general issue with LLMs' limited context windows. All mitigations of this problem center on identifying the best minimal subset of data that needs to be sent with each request:

In gptel (and most other Emacs LLM packages) the only way to do it right now is to include all context in a single buffer. This means running insert-file in a "chat" buffer and continuing to talk to the LLM in that buffer. There is an open issue (#176) to add the option to (automatically) include specific files/buffers with each request. This will probably live in a "Set Context" submenu in gptel-menu. (And the header-line will warn the user that they are sending entire files with every query.)

Doing something cleverer and identifying the right subset of a code project to send (by scanning the etags file, or with treesitter queries etc) is pretty much out of scope for gptel right now.
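The manual workflow described above can be sketched as a small helper. This is a hypothetical illustration, not part of gptel; the function name `my/gptel-add-file-context` is made up, and it assumes the current buffer is a gptel chat buffer:

```elisp
;; Hypothetical helper illustrating the "include the file in the chat
;; buffer" workflow: the file's text becomes part of the buffer and is
;; therefore re-sent as context with every subsequent request.
(defun my/gptel-add-file-context (file)
  "Insert FILE at the end of the current chat buffer as LLM context."
  (interactive "fFile to include as context: ")
  (goto-char (point-max))
  (insert "\nHere is " (file-name-nondirectory file) " for context:\n\n")
  ;; `insert-file-contents' inserts the file's text at point.
  (insert-file-contents file))
```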

By the way, I'm pushing more commits into the fork. Not sure they're all great (there's this "model sanitization" thing, that I don't know if is the correct approach).

I checked out a couple of them so far -- the fixes are much appreciated! Thanks for making them mostly independent, I'll apply them carefully when I have time.

karthink commented 5 months ago

(One exception to the stateless APIs: The Ollama API returns a vector (of numbers) along with each response that represents some kind of embedding (state) of the conversation so far. Including this vector with the next query has the effect of continuing the conversation. The model is still "stateless", but essentially a growing context vector is sent back and forth with each request. This is less data than sending the entire conversation with each request, but it also appears to be more lossy.)
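A rough sketch of threading that context vector through requests, assuming the /api/generate endpoint's "context" field as described above (the model name and function names are illustrative, not gptel's actual implementation):

```elisp
;; Illustrative sketch: save the opaque "context" array Ollama returns
;; with each response, and echo it back in the next request so the
;; conversation continues without resending the full transcript.
(require 'json)

(defvar my/ollama-context nil
  "Opaque context vector from the last Ollama response.")

(defun my/ollama-request-payload (prompt)
  "Build a JSON payload for PROMPT, threading the saved context."
  (json-encode
   `(("model" . "mistral")              ; model name is an assumption
     ("prompt" . ,prompt)
     ,@(when my/ollama-context
         `(("context" . ,my/ollama-context))))))

(defun my/ollama-handle-response (response)
  "Save the context vector from a parsed RESPONSE alist."
  (setq my/ollama-context (alist-get 'context response)))
```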

EugeneNeuron commented 3 months ago

There are clever approaches by app developers where only the project structure is sent, along with relevant functions and function signatures that are identified from the query context. (See aider)

Thanks for the reference to aider! I recently thought about how to provide only relevant context to the LLM and leaned toward references of the thing at point, imported modules in the current file, etc. that lsp provides, but the aider devs already did a great job fetching the most relevant portions of the repo map with tree-sitter and a graph ranking algorithm. Though I'm not sure if tree-sitter provides up-to-date information about symbols changed during the current session. But that's another question. Thanks!

karthink commented 3 months ago

@joaotavora I added all your commits to gptel except for this one, which fixes how I use a generic function for parsing the response.

My weird code here is intentional: the dispatch can occur thousands of times when parsing a long enough response, and it measurably reduced Emacs' responsiveness in this situation when I did it the straightforward way.

All the argument types of the parser function are known ahead of time, when the request is sent, so I find the function that will be called ahead of time to avoid the repeated dispatch. There's probably a better way to do it but I couldn't figure it out -- I'm open to suggestions.
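The pattern being described can be sketched without cl-generic at all (all names here are hypothetical, not gptel's real code): resolve the concrete parser once, when the request is created, then call the plain function inside the hot parsing loop.

```elisp
;; Hypothetical sketch of hoisting dispatch out of a hot loop.
;; Instead of going through generic dispatch for every chunk of the
;; response, pick the concrete parser once per request and funcall it.
(defun my/resolve-parser (backend)
  "Return the parsing function for BACKEND, chosen once per request."
  (pcase backend
    ('openai #'my/parse-openai-chunk)
    ('ollama #'my/parse-ollama-chunk)
    (_       #'my/parse-generic-chunk)))

(defun my/parse-stream (backend chunks)
  "Parse CHUNKS with a parser resolved once, not per chunk."
  (let ((parser (my/resolve-parser backend)))
    (mapcar parser chunks)))   ; no generic dispatch inside the loop
```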

Thanks a lot for your commits and suggestions!

joaotavora commented 3 months ago

All the argument types of the parser function are known ahead of time, when the request is sent, so I find the function that will be called ahead of time to avoid the repeated dispatch. There's probably a better way to do it but I couldn't figure it out -- I'm open to suggestions.

Sure. I'm afraid I'm not using gptel anymore, but if you provide some kind of reproducible benchmark showing that slowdown with the generic function, it would probably help. Though I'm not hacking Emacs anymore, I think @monnier probably has some insight into why cl-generic's dispatch is so slow for you. My naive suggestion would be to not use a generic there and do the dispatch by hand.

monnier commented 3 months ago

IME (mostly with cl-print) cl-generic's dispatch is costly because, for every argument on which it has to dispatch (i.e. where not all methods use the t specializer), it does at least one &rest/apply pair, which allocates a list (and is less optimized than funcall). If you do enough work within the dispatched method, this is OK, but otherwise, in a tight loop, the allocation causes GC time to dominate. [And if your method uses cl-call-next-method, it's worse.]
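A rough way to observe this allocation cost is with benchmark-run, which reports elapsed time, GC count, and GC time. This is only a sketch; absolute numbers depend on the Emacs build, and the function names are made up:

```elisp
;; Micro-benchmark sketch: dispatching a cl-generic in a tight loop
;; versus calling a plain function.  The generic's &rest/apply
;; machinery allocates on every call, so its GC figures grow.
(require 'cl-generic)
(require 'benchmark)

(cl-defgeneric my/touch (x))
(cl-defmethod my/touch ((x integer)) (1+ x))

(defun my/touch-plain (x) (1+ x))

;; `benchmark-run' returns (ELAPSED-TIME GC-COUNT GC-ELAPSED-TIME).
(benchmark-run 1000000 (my/touch 1))        ; generic dispatch per call
(benchmark-run 1000000 (my/touch-plain 1))  ; plain funcall
```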