CodingTrain / Bizarro-Devin


Testing with Ollama #49

Closed shiffman closed 6 months ago

shiffman commented 6 months ago

The consumeStream() method doesn't work with ollama due to a slightly different streaming format, so I've adapted the code. I am not having the [INST] issue with ollama; however, running 70b-chat on my M1 laptop seems to tax it quite a bit (longer latency than Replicate streaming, and the fan going crazy).
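For reference, here's a minimal sketch of the adaptation, assuming Ollama's documented streaming format: the `/api/generate` endpoint with `stream: true` emits newline-delimited JSON, one object per chunk, e.g. `{"response":"token","done":false}`. The function name, model tag, and `onToken` callback are illustrative, not the repo's actual API.

```js
// Sketch: consume Ollama's NDJSON stream (Node 18+, built-in fetch).
async function consumeOllamaStream(prompt, onToken) {
  const res = await fetch('http://localhost:11434/api/generate', {
    method: 'POST',
    body: JSON.stringify({ model: 'llama2:70b-chat', prompt, stream: true }),
  });
  const decoder = new TextDecoder();
  let buffer = '';
  for await (const chunk of res.body) {
    buffer += decoder.decode(chunk, { stream: true });
    // Each complete line is one JSON object; keep any partial line buffered.
    const lines = buffer.split('\n');
    buffer = lines.pop();
    for (const line of lines) {
      if (!line.trim()) continue;
      const data = JSON.parse(line);
      if (data.response) onToken(data.response);
      if (data.done) return;
    }
  }
}
```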

dipamsen commented 6 months ago

In ollama, message history collection is not implemented for streaming. Without it, the message history won't be populated and the LLM won't have proper history context.
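One way to add it is to accumulate the streamed tokens and push the full reply into the history once the stream finishes. A hedged sketch, assuming a chat-style `messages` array in Ollama's `{ role, content }` format; `consumeOllamaChatStream` is a hypothetical stand-in for the streaming consumer:

```js
// Sketch: collect message history while streaming.
const messages = [];

async function promptWithHistory(userText, onToken) {
  messages.push({ role: 'user', content: userText });
  let assistantText = '';
  await consumeOllamaChatStream(messages, (token) => {
    assistantText += token; // accumulate tokens as they arrive
    onToken(token);
  });
  // Store the complete reply so the next request carries history context.
  messages.push({ role: 'assistant', content: assistantText });
}
```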

dipamsen commented 6 months ago

I have unified the API response types from ollama and replicate so that the consumer does not have to differentiate between them, and added message history collection for ollama (untested).
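A unified type might look something like the sketch below. The field names follow Ollama's NDJSON chunks and the Replicate client's server-sent events as documented; the normalized `{ text, done }` shape and the function name are illustrative, not necessarily what the PR uses.

```js
// Sketch: normalize both back-ends into one { text, done } shape.
function normalizeChunk(provider, raw) {
  if (provider === 'ollama') {
    // Ollama chunk: { response, done } (or message.content for /api/chat)
    return { text: raw.response ?? raw.message?.content ?? '', done: !!raw.done };
  }
  // Replicate SSE event: { event: 'output' | 'done', data }
  return { text: raw.event === 'output' ? raw.data : '', done: raw.event === 'done' };
}
```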

Unrelated to this, I have also added a new manualPrompt command to prompt the AI by typing a message (as an alternative to voice input, for testing purposes).
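Something along these lines, as a minimal sketch: a typed prompt read from stdin and routed into the same pipeline the voice input uses. `sendPrompt()` is a hypothetical stand-in for that pipeline.

```js
// Sketch: manualPrompt command using Node's promise-based readline.
import readline from 'node:readline/promises';

async function manualPrompt() {
  const rl = readline.createInterface({ input: process.stdin, output: process.stdout });
  const text = await rl.question('prompt> ');
  rl.close();
  await sendPrompt(text); // route to the same path as transcribed voice input
}
```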

shiffman commented 6 months ago

I'm going to merge this to keep ollama on track as an alternative back-end. If anyone wants to try hooking this up to either GPT-4 or Gemini, I'm happy to provide API keys to see how these models perform in comparison to llama!