chandeldivyam opened 2 months ago
Thanks for letting me know about that, I understand the issue. It does indeed use a relatively large amount of RAM.
That's actually a feature, not a bug ;). I added a caching mechanism for the model because loading it repeatedly is heavy for low-end computers. It also makes batch transcription (when adding multiple files) faster.
Maybe we can add an option in the settings that releases the model from the cache once transcription is finished. But then it would also be released during batch transcription, so we need to figure out how to keep it cached in that case. Maybe we can just add a Tauri command that releases the cache and call it from the frontend whenever/wherever we want.
Yes, I think adding the Tauri command option makes a lot of sense; that way the user can offload the model from VRAM as and when needed.
Also, I was stuck on one thing: I installed Ollama on the computer, and after that the Vibe app started using the CPU for transcription. Not sure why that would be? @thewh1teagle
> Yes, I think adding the Tauri command option makes a lot of sense; that way the user can offload the model from VRAM as and when needed.
I thought about the feature design. We could add a toggle option in the settings to enable or disable it, just before the logs option.
Then create another Tauri command that releases the model_context_state (I think just by setting it to None):
https://github.com/thewh1teagle/vibe/blob/main/desktop/src-tauri/src/cmd/mod.rs#L373
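For reference, a minimal sketch of what such a command could look like, assuming the model is kept in a Tauri-managed `Mutex<Option<...>>` state. The names `ModelContext`, `ModelContextState`, and `release_model_context` below are placeholders, not vibe's actual identifiers:

```rust
use std::sync::Mutex;
use tauri::State;

// Placeholder for whatever vibe actually caches (loaded whisper model, GPU buffers, etc.).
pub struct ModelContext { /* ... */ }

// Assumed shape of the managed state: an Option so the cached model can be dropped on demand.
pub struct ModelContextState(pub Mutex<Option<ModelContext>>);

#[tauri::command]
pub fn release_model_context(state: State<'_, ModelContextState>) -> Result<(), String> {
    // Setting the Option back to None drops the cached model and frees its memory.
    *state.0.lock().map_err(|e| e.to_string())? = None;
    Ok(())
}
```

The command would also need to be registered in the Tauri builder's invoke_handler so the frontend can call it.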
And eventually we can call it here, if the option is enabled in the UI (only in the home viewModel, not the batch viewModel):
https://github.com/thewh1teagle/vibe/blob/main/desktop/src/pages/home/viewModel.ts#L223
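On the frontend side it could look roughly like the sketch below; the command name, the settings flag, and the import path are assumptions (the import path in particular depends on the Tauri version in use):

```typescript
// Tauri v2 import path; on Tauri v1 it would be '@tauri-apps/api/tauri'.
import { invoke } from '@tauri-apps/api/core'

// Hypothetical helper called at the end of the home viewModel's transcribe flow
// (and intentionally not from the batch viewModel, so batch transcription keeps the cache).
async function maybeReleaseModel(releaseModelAfterTranscribe: boolean) {
  if (releaseModelAfterTranscribe) {
    await invoke('release_model_context')
  }
}
```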
Let me know if you want to add it :) And feel free to ask if you have any questions regarding building.
> Also, I was stuck on one thing: I installed Ollama on the computer, and after that the Vibe app started using the CPU for transcription. Not sure why that would be?
Weird. Which OS? Generally it uses the GPU. Maybe try the latest beta release; it includes some fixes.
Describe the feature
Thanks for building this awesome product.
I generally like to transcribe using bigger models. Right now, once I have finished transcribing, I have to close the app every time because I might want to use the VRAM for Ollama.
Is there a way we can release the model from VRAM? I would love to contribute if you could outline how to do it.