R-udren opened 2 weeks ago
Are you thinking of something like sending a request to an ollama localhost endpoint?
Absolutely. litellm is also an option. From what I've seen of the code, this isn't a hard task.
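For context, something like this minimal sketch could work, assuming ollama's default port 11434 and its `/api/chat` endpoint with streaming disabled:

```ts
// Minimal sketch of calling a local ollama endpoint. Assumes ollama is
// running on the default port 11434 and the model tag is already pulled.
async function askLocalModel(content: string): Promise<string> {
  const res = await fetch('http://localhost:11434/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model: 'llama3.1:8b',                  // any locally pulled model tag
      messages: [{ role: 'user', content }],
      stream: false,                          // one JSON object, not a stream
    }),
  });
  const data = await res.json();
  return data.message.content;                // non-streaming response shape
}
```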
Yes, it shouldn't be that hard. I'll try it out and see which model can handle the job with reasonable latency.
I've tried using llama 8B locally. It's doable, but the latency can easily go up to 10s+ for a simple task; I'm not sure that makes for a pleasant UX. I'll still work on adding an option to input a port and model name for local models.
I will try it on my hardware.
I've just merged PR #79 with local model settings. I've tested it with ollama; I'm not sure how many settings are needed for different endpoints.
I found a few issues with the extension:
**Manifest File:**
In `youtube-addiction-rehab-chrome-extension\chrome-extension\manifest.js` on line 24, replace `http://localhost*` with `http://localhost/*`. The current value is not a valid match pattern (the host must be followed by a path component), so Chrome raises an error.
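A sketch of the corrected entry, assuming the pattern sits under the standard MV3 `host_permissions` key (the surrounding structure of `manifest.js` may differ):

```ts
// Sketch of chrome-extension/manifest.js with the fixed pattern.
// Match patterns take the form scheme://host/path, so the '/*' path
// component is required; ports are ignored when matching, so this
// also covers http://localhost:11434.
const manifest = {
  manifest_version: 3,
  // ...other keys unchanged...
  host_permissions: ['http://localhost/*'],
};

export default manifest;
```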
**Local Models Dropdown:**
The extension should list all available local models in the "Choose a model" dropdown. Local models can be listed via `http://localhost:11434/api/tags`.
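A minimal sketch of populating the dropdown from that endpoint; the `{ models: [{ name, ... }] }` response shape follows ollama's documented `/api/tags` API, and error handling is kept minimal:

```ts
// Fetch the names of locally available ollama models for the dropdown.
async function listLocalModels(baseUrl = 'http://localhost:11434'): Promise<string[]> {
  const res = await fetch(`${baseUrl}/api/tags`);
  if (!res.ok) throw new Error(`ollama returned ${res.status}`);
  const data = (await res.json()) as { models: { name: string }[] };
  return data.models.map((m) => m.name); // e.g. ["llama3.1:8b", "tinyllama:latest"]
}
```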
**Popup Error Message:**
Sometimes the following error message appears in the popup while using a local model:
> ⬅️ Please set up your API key in the settings.
**Model Performance:**
The model typically takes about 2-5 seconds to respond. When using `tinyllama` instead of `llama3.1:8b`, `tinyllama` generates Python code for no reason 🤣.
**System Prompt:**
Properties like `id` and `reason` should be enclosed in double quotes, but in the system prompt they aren't.
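For illustration, unquoted keys are valid in a JavaScript object literal but not in JSON, so a strict parser rejects the shape the prompt currently shows (the field values here are made up):

```ts
// Quoted keys parse; unquoted keys throw a SyntaxError.
JSON.parse('{"id": "abc123", "reason": "clickbait"}'); // ok
try {
  JSON.parse('{id: "abc123", reason: "clickbait"}');   // unquoted keys
} catch (e) {
  console.error('Model output was not valid JSON', e);
}
```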
**My Suggestions:**
- If the extension cannot parse the JSON response, it should retry the request up to three times. If it still fails to receive a valid response, it should fall back to the default YouTube recommendations (a sketch follows below).
- The `"reason"` field may not be necessary, as it can increase model response time.
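A minimal sketch of that retry-and-fallback flow; `requestModel` and `showDefaultRecommendations` are hypothetical stand-ins for the extension's real functions:

```ts
// Hypothetical stand-ins for the extension's actual functions.
declare function requestModel(prompt: string): Promise<string>;
declare function showDefaultRecommendations(): void;

// Retry the model up to three times on invalid JSON, then fall back
// to the default YouTube recommendations.
async function classifyWithRetry(prompt: string, maxAttempts = 3): Promise<unknown | null> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const raw = await requestModel(prompt);
      return JSON.parse(raw); // throws SyntaxError on malformed output
    } catch (err) {
      console.warn(`Attempt ${attempt}/${maxAttempts} returned invalid JSON`, err);
    }
  }
  showDefaultRecommendations();
  return null;
}
```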
Thank you for your contribution. We will check and reply to you as soon as possible.