ozdemir08 / youtube-video-summarizer


✨ feat(settings): add local llm support #12

Open rshmhrj opened 5 months ago

rshmhrj commented 5 months ago

@ozdemir08 took a first stab at getting a local LLM working

TODO: still need to test the 'request' call and response parsing from Ollama -- right now my computer is extremely slow and hangs when I try to call it locally
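
For reference, a minimal sketch of what that request and parsing might look like against Ollama's local HTTP API (assuming the default port 11434, the llama2 model, and a non-streaming call; none of this is final):

```ts
// Sketch only: raw request to a locally running Ollama server (`ollama serve`)
// on the default port 11434, with the reply parsed from a single JSON object.
async function summarizeWithOllama(transcript: string): Promise<string> {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "llama2",
      prompt: `Summarize the following video transcript:\n\n${transcript}`,
      stream: false, // ask for one JSON object instead of a token stream
    }),
  });
  const data = await res.json();
  // With stream: false, the generated text comes back in the `response` field.
  return data.response;
}
```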

Local LLM not toggled:

[screenshot]

Local LLM toggled on:

[screenshot]

Setup steps:

Here's the current result in Obsidian: [screenshot]

Here's the result with the Local LLM off, using gpt-3.5-turbo (much nicer, IMO): [screenshot]

ozdemir08 commented 5 months ago

Hi Rishi, thanks for looking into this and already making progress. I have a few pieces of high-level feedback:

Running a local LLM

First and foremost, I believe we should leave the logic of running a local LLM to an external library like ollama-js, as opposed to the manual steps you suggest below:

- ollama pull llama2
- tested that the LLM works via ollama run llama2
- if the local server is not running, start it with ollama serve
- curled a test request: ...

The ollama-js library has presumably already solved many of these problems, including running the model locally, downloading models, and managing their lifecycle, so it's better to focus on the actual problem instead of introducing a new technical problem and solving it ourselves.
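
To illustrate, a rough sketch of what the summarization call could look like on top of ollama-js (the model name and prompt wording are placeholders, and the exact API surface should be checked against the library's docs):

```ts
import ollama from "ollama";

// Sketch only: summarization delegated to ollama-js instead of hand-rolled HTTP calls.
// The model name and prompt wording are placeholders, not final choices.
async function summarize(transcript: string): Promise<string> {
  const response = await ollama.chat({
    model: "llama2",
    messages: [
      { role: "user", content: `Summarize this video transcript:\n\n${transcript}` },
    ],
  });
  return response.message.content;
}
```

The library also exposes model pulling, which could cover the "Download model" flow discussed under the settings section below.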

Settings page

I can see us adding support for Gemini or another LLM soon. So I think we should not have a toggle that says "Enable local LLM", but instead a dropdown that allows selecting between OpenAI - ChatGPT and Meta - Llama.

The settings page will have model-agnostic settings like summary size, date, etc. and model-specific settings like the OpenAI key, ChatGPT model, etc. For the Llama case, I envision the settings like this: there will be a dropdown for models, and next to it a button that says "Download model". As models are large, users should download them intentionally.
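
A rough sketch of how that could look with Obsidian's settings API (the field names provider and localModel, and the downloadModel helper, are placeholders rather than final names):

```ts
import { App, PluginSettingTab, Setting } from "obsidian";

// Sketch only: a provider dropdown plus, for the Llama case, a model dropdown
// with an explicit "Download model" button. Settings field names and the
// downloadModel helper are placeholders.
class SummarySettingTab extends PluginSettingTab {
  plugin: any;

  constructor(app: App, plugin: any) {
    super(app, plugin);
    this.plugin = plugin;
  }

  display(): void {
    const { containerEl } = this;
    containerEl.empty();

    new Setting(containerEl)
      .setName("AI provider")
      .addDropdown((dropdown) =>
        dropdown
          .addOption("openai", "OpenAI - ChatGPT")
          .addOption("llama", "Meta - Llama")
          .setValue(this.plugin.settings.provider)
          .onChange(async (value) => {
            this.plugin.settings.provider = value;
            await this.plugin.saveSettings();
            this.display(); // re-render so provider-specific settings appear
          })
      );

    // Llama-specific settings: models are large, so they are only fetched
    // when the user explicitly clicks "Download model".
    if (this.plugin.settings.provider === "llama") {
      new Setting(containerEl)
        .setName("Model")
        .addDropdown((dropdown) =>
          dropdown
            .addOption("llama2", "llama2")
            .setValue(this.plugin.settings.localModel)
            .onChange(async (value) => {
              this.plugin.settings.localModel = value;
              await this.plugin.saveSettings();
            })
        )
        .addButton((button) =>
          button
            .setButtonText("Download model")
            .onClick(() => this.plugin.downloadModel(this.plugin.settings.localModel))
        );
    }
  }
}
```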

Another thing is that I have an M1 MacBook Air, which is a relatively new laptop, and when I use Llama it often freezes. That's why I am concerned it may not be as effective on many laptops. Mobile phones may have similar issues; even if they run smoothly, they may heat up, drain the battery, etc. That's why I would still keep this as an experimental feature and not enable it by default.

Implementation

For similar reasons, it's important to write the code so that the AIClient interface has multiple implementations (in this case only two), and so that we do not rely on a boolean like "enable local LLM" but instead use an enum and initialize the interface with the right implementation in YoutubeVideoSummaryModal.
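
Roughly what I have in mind, as a sketch (OpenAiClient and OllamaClient are illustrative names, not prescriptions):

```ts
// Sketch only: one interface, one implementation per provider, and selection via
// an enum rather than a boolean flag. Class and field names are illustrative.
enum AiProvider {
  OpenAi = "openai",
  Llama = "llama",
}

interface AIClient {
  summarize(transcript: string): Promise<string>;
}

class OpenAiClient implements AIClient {
  constructor(private apiKey: string, private model: string) {}

  async summarize(transcript: string): Promise<string> {
    // the existing OpenAI call would live here
    throw new Error("not implemented in this sketch");
  }
}

class OllamaClient implements AIClient {
  constructor(private model: string) {}

  async summarize(transcript: string): Promise<string> {
    // the ollama-js call would live here
    throw new Error("not implemented in this sketch");
  }
}

// YoutubeVideoSummaryModal would call this and only ever talk to AIClient.
function createAiClient(settings: {
  provider: AiProvider;
  openAiKey: string;
  openAiModel: string;
  localModel: string;
}): AIClient {
  switch (settings.provider) {
    case AiProvider.OpenAi:
      return new OpenAiClient(settings.openAiKey, settings.openAiModel);
    case AiProvider.Llama:
      return new OllamaClient(settings.localModel);
  }
}
```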

Testing

Initially, I did not implement any tests, as the plugin was small and I prioritized speed and shipping over technical excellence. As the plugin code grows, we will need to start adding tests. No need to do anything now, but this is something for us to consider later.
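
Just as a pointer for later, a test for the client selection above could look something like this (assuming we adopt a runner like Jest; the names mirror the illustrative sketch above and are not part of the current code):

```ts
// Sketch only: Jest-style tests for the illustrative createAiClient factory above.
describe("createAiClient", () => {
  it("returns the OpenAI client when the OpenAI provider is selected", () => {
    const client = createAiClient({
      provider: AiProvider.OpenAi,
      openAiKey: "test-key",
      openAiModel: "gpt-3.5-turbo",
      localModel: "llama2",
    });
    expect(client).toBeInstanceOf(OpenAiClient);
  });

  it("returns the Ollama client when the Llama provider is selected", () => {
    const client = createAiClient({
      provider: AiProvider.Llama,
      openAiKey: "",
      openAiModel: "",
      localModel: "llama2",
    });
    expect(client).toBeInstanceOf(OllamaClient);
  });
});
```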

Your contributions

You are the first contributor to the plugin, and here you are already working on your second PR. I must say that this is amazing to see. If you would like, I can add you as a core contributor/owner of the plugin. If not, I can add your name to the readme and give you credit. Let me know which one you would prefer :)