LostRuins / koboldcpp

A simple one-file way to run various GGML and GGUF models with KoboldAI's UI
https://github.com/lostruins/koboldcpp
GNU Affero General Public License v3.0

Possible bug in the koboldcpp colab (Vision clip issue) #797

Open · yoshuzx opened this issue 2 months ago

yoshuzx commented 2 months ago

There may be a bug in the koboldcpp colab. I tried using a vision model (LLaVA 7B), but when I load a single image, processing is very slow. I noticed that the CLIP projector is being loaded on the CPU instead of CUDA by default. Is there a way to change or fix this? Thank you in advance.
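For reference, a minimal sketch of how a Colab cell could launch koboldcpp while requesting CUDA offload for both the model and the mmproj (CLIP) projector. The flag names follow the koboldcpp CLI, and the file paths are placeholders rather than values taken from this issue:

```python
# Minimal sketch of launching koboldcpp with CUDA offload from a Colab cell.
# Model and mmproj paths below are placeholders; adjust to the actual files.
import subprocess

cmd = [
    "python", "koboldcpp.py",
    "--model", "llava-v1.5-7b.Q4_K_M.gguf",    # placeholder model path
    "--mmproj", "llava-v1.5-7b-mmproj.gguf",   # vision (CLIP) projector
    "--usecublas",                              # request the CUDA backend
    "--gpulayers", "99",                        # offload all layers to the GPU
    "--port", "5001",
]
# Watch the startup log: if the projector is actually offloaded, the
# clip/mmproj loader should report a CUDA buffer rather than a CPU buffer.
subprocess.Popen(cmd)
```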

LostRuins commented 2 months ago

I can confirm there is a speed regression between 1.61.2 and 1.62.

LostRuins commented 2 months ago

I have narrowed it down to the backend changes between 073a279 (18 Mar) and 8131616 (20 Mar); it seems CUDA is somehow being initialized in a different order. Unfortunately I ran out of Colab usage, so I will have to resume testing another day.

If anyone is able to, please try running the current experimental branch and see if it still has the issue. I've made a few fixes but cannot test them myself. A rough timing check is sketched below.
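For anyone testing, a rough way to time image processing against a running instance. This assumes the standard `/api/v1/generate` endpoint and an `images` list of base64 strings; verify both against your version's API before relying on it:

```python
# Rough timing check for multimodal processing against a local koboldcpp
# instance on port 5001. Compare elapsed times across builds to spot the regression.
import base64
import time
import requests

with open("test.jpg", "rb") as f:            # any small test image
    img_b64 = base64.b64encode(f.read()).decode()

payload = {
    "prompt": "Describe this image.",
    "max_length": 32,
    "images": [img_b64],                      # base64-encoded image(s)
}

start = time.time()
r = requests.post("http://127.0.0.1:5001/api/v1/generate", json=payload, timeout=600)
print(f"elapsed: {time.time() - start:.1f}s")
print(r.json())
```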

LostRuins commented 2 months ago

Hi, this should be fixed in the latest version, please try it.
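Before retesting, one way to confirm which build is actually running is to query the version endpoint; this assumes the `/api/extra/version` route exposed by recent koboldcpp builds:

```python
# Quick check of the running koboldcpp build before retesting,
# assuming the /api/extra/version endpoint is available.
import requests

info = requests.get("http://127.0.0.1:5001/api/extra/version", timeout=10).json()
print(info)  # expected to include the backend name and version string
```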