haplo opened this issue 7 months ago
Hi @haplo, I like your idea. Have you signed FSF papers?
> I like your idea.
Great! I will try to find some time to work on it and send a PR.
> Have you signed FSF papers?
Thanks for reminding me; I have them ready to print, sign, scan, and send, but keep postponing it. Will do it tomorrow.
This is not very focused, and it could be considered a frame challenge, but it is meant to be constructive.
I have this same desire to have a different model for code operations than for more general tasks, so when I saw this I immediately liked it and have been waiting with bated breath ever since. But as I waited I started thinking, "Maybe I need two code models: one for pure generation and one (probably with a name ending in "instruct") for code review, commit message generation, and for just asking advice".
And the more I thought about it, the more possible use cases I started thinking of. I'm sure we don't want to end up defining a dozen different specialist provider types in ellama, but what is the alternative?
- Maybe require clear evidence before considering new provider types?
- Maybe tell users to define their local selection in their `ellama-providers` list and then install hooks to swap the currently selected provider(s) as they move in and out of various modes? Is that more Emacsy? Right now this is my favorite notion, and if I get some time in the next few days I'll try it locally and report back.
- Something obvious that I'm overlooking?
Aside: Not really pertinent to this discussion, but I've also been wondering about the resource demands of all those models if (like me) you expect to run them locally. Or even on a server on a private network, which is what I'm considering at work. Maybe we're better off trying to have a single "as big as we can afford" generalist model rather than swapping out a lot of smaller specialized ones? If you go that route you'll need to consider the prompting in a little more depth, but it is an option.
For code completion it would be much better to use another API with FIM support: https://github.com/ahyatt/llm/issues/45.
@dwmckee about your thoughts - I will do my best to make it useful. I think multiple provider options are a good solution. If one is not set, we can fall back to the default provider. We could also provide something like "for these commands, use this provider", but I don't know yet how the interface should look; maybe a custom variable with an alist, or something like that.
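One possible shape for that alist interface, as a rough sketch (the variable `my-ellama-command-providers` and helper `my-ellama-provider-for` are made-up names for illustration, not part of ellama):

```elisp
;; Sketch only: a hypothetical alist mapping ellama command symbols
;; to dedicated providers; all names here are illustrative.
(defvar my-ellama-command-providers
  `((ellama-code-complete . ,(make-llm-ollama :chat-model "codegemma"))
    (ellama-code-review   . ,(make-llm-ollama :chat-model "codegemma")))
  "Alist mapping ellama command symbols to providers.")

(defun my-ellama-provider-for (command)
  "Return the provider configured for COMMAND.
Falls back to the default `ellama-provider' when COMMAND has no entry."
  (or (alist-get command my-ellama-command-providers)
      ellama-provider))
```

A command could then look itself up in the alist and fall back to `ellama-provider` when no entry matches, which keeps backwards compatibility for users who configure nothing.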
I said I'd try switching providers on mode-change events, and I have gotten around to a bare-minimum example, which changes my "chat" provider in programming buffers. Perhaps it's a model for temporary relief while waiting for this feature.
Please excuse my naive elisp. Or offer suggestions if you want to teach me something.
```elisp
;; Define the providers we like
(setq-default ellama-providers
              (list
               (cons "chat-semantic" (make-llm-ollama
                                      :chat-model "gemma2"
                                      :embedding-model "nomic-embed-text"))
               (cons "chat-literal" (make-llm-ollama
                                     :chat-model "gemma2"
                                     :embedding-model "gemma2"))
               (cons "code-completion" (make-llm-ollama
                                        :chat-model "codegemma"
                                        :embedding-model "codegemma"))
               (cons "translation" (make-llm-ollama
                                    :chat-model "aya"
                                    :embedding-model "aya"))))

;; Now set specific providers from the list by name
(setq-default ellama-provider
              (alist-get "chat-semantic" ellama-providers nil nil 'string-equal))
;; Code completion should go in here once supported
(setq-default ellama-naming-provider
              (alist-get "chat-literal" ellama-providers nil nil 'string-equal))
(setq-default ellama-translation-provider
              (alist-get "translation" ellama-providers nil nil 'string-equal))

;; Convenience functions for mode change events
(defun my-select-code-completion ()
  "Switch `ellama-provider' to the code-completion model."
  (setq ellama-provider
        (alist-get "code-completion" ellama-providers nil nil 'string-equal)))

(defun my-select-chat ()
  "Switch `ellama-provider' to the general chat model."
  (setq ellama-provider
        (alist-get "chat-semantic" ellama-providers nil nil 'string-equal)))

;; This ought to use :hook, but start by getting it working at all.
;; Pass the functions directly so the hooks actually call them.
(add-hook 'prog-mode-hook #'my-select-code-completion)
(add-hook 'text-mode-hook #'my-select-chat)
```
(I've actually put all this inside a use-package `:init` block, but it should translate to un-packaged setups as well.)
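For reference, the `:hook` form alluded to in the comment above might look something like this (a sketch; it assumes the `my-select-*` helpers are defined before the hooks fire, and note that `:hook` also defers loading of the package):

```elisp
;; Sketch: fold the add-hook calls into the use-package declaration,
;; reusing the my-select-* helpers defined earlier.
(use-package ellama
  :hook ((prog-mode . my-select-code-completion)
         (text-mode . my-select-chat)))
```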
At this point all I've tested is that it changes the provider as I move from programming buffers to text ones and back. Because I don't know how other common buffers are categorized (org-mode? magit-commit? your favorite?), I'm sure it's incomplete.
Will write more after I've worked with it for a while.
I usually want to use different models for coding and chat/summarize, but I don't know a better way than switching providers. I'm pretty new to ellama so I might be missing some key piece of the workflow.
I would like to have an `ellama-code-provider` configuration option that, when set, will be used by all the `ellama-code-` commands. To keep backwards compatibility it can default to `nil`, in which case the commands will keep using `ellama-provider`.
Even better might be to have a mapping between Emacs mode and provider, so the user can specify different models for different programming languages. This makes sense as some models are highly specialized, e.g. for SQL or Python. I would rather keep it simple for now, though.
If nobody else wants to work on this I can give it a try, I would just like to know this is the right approach.