chandeldivyam opened 2 months ago
Thanks for letting me know about that, I understand the issue. It does indeed use a relatively large amount of RAM.
That's actually a feature, not a bug ;). I added a caching mechanism for the model because loading it repeatedly is heavy for low-end computers. It also makes batch transcription (when adding multiple files) faster.
Maybe we can add an option in the settings that releases the model from the cache once transcription is finished. But then it would also be released during batch transcription, so we need to figure out how to keep it cached in that case. Maybe we can just add a Tauri command that releases the cache and call it from the frontend whenever/wherever we want.
Yes, I think adding the Tauri command option makes a lot of sense; that way the user can offload the model from VRAM as and when needed.
Also, I was stuck on one thing: I installed Ollama on the computer, and after that the Vibe app started using the CPU for transcription. Not sure why that would be? @thewh1teagle
> Yes, I think adding the Tauri command option makes a lot of sense; that way the user can offload the model from VRAM as and when needed.
I thought about the feature design. We could add a toggle option in the settings to enable or disable it, just before the logs option.
Then create another Tauri command that releases the model_context_state (I think just by setting it to None):
https://github.com/thewh1teagle/vibe/blob/main/desktop/src-tauri/src/cmd/mod.rs#L373
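For reference, a minimal sketch of what such a command could look like, assuming the model is kept in a Tauri-managed `Mutex<Option<...>>` state. The names `ModelContext`, `ModelContextState`, and `release_model_context` below are placeholders, not vibe's actual identifiers:

```rust
use std::sync::Mutex;
use tauri::State;

// Placeholder for whatever vibe actually caches (loaded whisper model, GPU buffers, etc.).
pub struct ModelContext { /* ... */ }

// Assumed shape of the managed state: an Option so the cached model can be dropped on demand.
pub struct ModelContextState(pub Mutex<Option<ModelContext>>);

#[tauri::command]
pub fn release_model_context(state: State<'_, ModelContextState>) -> Result<(), String> {
    // Setting the Option back to None drops the cached model and frees its memory.
    *state.0.lock().map_err(|e| e.to_string())? = None;
    Ok(())
}
```

The command would also need to be registered in the Tauri builder's invoke_handler so the frontend can call it.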
And eventually we can call it here, if the option is enabled in the UI (only in the home viewModel, not the batch viewModel):
https://github.com/thewh1teagle/vibe/blob/main/desktop/src/pages/home/viewModel.ts#L223
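On the frontend side it could look roughly like the sketch below; the command name, the settings flag, and the import path are assumptions (the import path in particular depends on the Tauri version in use):

```typescript
// Tauri v2 import path; on Tauri v1 it would be '@tauri-apps/api/tauri'.
import { invoke } from '@tauri-apps/api/core'

// Hypothetical helper called at the end of the home viewModel's transcribe flow
// (and intentionally not from the batch viewModel, so batch transcription keeps the cache).
async function maybeReleaseModel(releaseModelAfterTranscribe: boolean) {
  if (releaseModelAfterTranscribe) {
    await invoke('release_model_context')
  }
}
```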
Let me know if you want to add it :) And feel free to ask if you have any questions regarding building.
> Also, I was stuck on one thing: I installed Ollama on the computer, and after that the Vibe app started using the CPU for transcription. Not sure why that would be?
Weird. Which OS? Generally it uses the GPU. Maybe try the latest beta release; it includes some fixes.
Describe the feature
Thanks for building this awesome product.
I generally like to transcribe using bigger models. Right now, once I have finished transcribing, I have to close the app every time because I might want to use the VRAM for Ollama.
Is there a way we can release the model from VRAM? I would love to contribute if you could outline how to do it.