i have 6GB of VRAM left, so i should be able to run a small LLM? but the GPU is basically always at 100% usage because it's constantly churning out new clips, so i don't know if it can handle another LLM. i definitely CANNOT afford to spin up another GPU
the LLM itself takes up a lot of VRAM, which has nothing to do with it constantly generating audio. it uses that much VRAM regardless of whether it is doing anything or not
Any interest in using caching?
Originally posted by @kennethnym in https://github.com/kennethnym/infinifi/issues/11#issuecomment-2285008188