Korner83 opened 1 year ago
Yes, our model should fit in 24 GB of VRAM for Vicuna-13B and 12 GB for Vicuna-7B. You can set low_resource to False to fully utilize your GPU.
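For reference, low_resource is a model-config entry rather than a command-line flag. A minimal sketch of the relevant section, assuming the standard eval config layout (the exact path and neighbouring keys may differ between versions):

```yaml
# eval_configs/minigpt4_eval.yaml (path assumed; check your checkout)
model:
  arch: mini_gpt4
  vit_precision: "fp16"   # fp16 is the default for the ViT; fp32 would only use more memory
  low_resource: False     # False = keep the full model on the GPU (~24 GB for Vicuna-13B)
```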
Thanks for the reply!
I have tried it, but it looks like it runs out of memory. Do you have any suggestions on how to change max_split_size_mb? Where can I find that parameter? Shouldn't I change vit_precision: "fp16"?
This is the error I get:
Load BLIP2-LLM Checkpoint: c:/Users/polga/MiniGPT-4/model/pretrained_minigpt4.pth
Traceback (most recent call last):
File "c:\Users\polga\MiniGPT-4\demo.py", line 60, in
Same issue with my 4090, and the chat result is really bad; I don't know why.
If I use the low_resource = True parameter it works totally fine in my case; just keep the beam search number at 1, otherwise it might give you a wrong reply because higher values need more VRAM. I just wanted higher GPU utilization, because currently it uses only about 40% of my GPU, which seems strange.
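The extra VRAM cost of beam search is easy to reproduce in isolation. Below is a self-contained sketch (not MiniGPT-4 code) that uses gpt2 as a small stand-in model to compare peak memory for num_beams=1 versus 5; it assumes transformers is installed and a CUDA GPU is available:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# gpt2 is only a stand-in to show the trend: beam search keeps several candidate
# sequences (and their caches) alive at once, so peak memory grows with num_beams.
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").cuda()
inputs = tok("Describe this image:", return_tensors="pt").to("cuda")

for beams in (1, 5):
    torch.cuda.reset_peak_memory_stats()
    model.generate(**inputs, max_new_tokens=64, num_beams=beams,
                   pad_token_id=tok.eos_token_id)
    peak = torch.cuda.max_memory_allocated() / 1024 ** 2
    print(f"num_beams={beams}: peak allocated {peak:.0f} MiB")
```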
Me too. Mine is a 3090, and it is really slow. Hope it'll be solved soon.
Thanks for the great project, I really like it.
Is it just me, or is it normal to have only around 35-45% GPU utilization while it's generating the reply? All my models are running from M.2 SSD drives, and memory should be fine with 64 GB of RAM. Usually VRAM usage stays below 20 GB, so I was wondering if there is a way to use the full potential of my card.
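One way to sanity-check those numbers from inside the demo process is PyTorch's built-in memory counters (standard torch.cuda calls; run them after the model has loaded):

```python
import torch

# How much GPU 0 memory this process has actually allocated and reserved,
# versus the card's total capacity.
gib = 1024 ** 3
print(torch.cuda.get_device_name(0))
print(f"allocated: {torch.cuda.memory_allocated(0) / gib:.1f} GiB")
print(f"reserved:  {torch.cuda.memory_reserved(0) / gib:.1f} GiB")
print(f"total:     {torch.cuda.get_device_properties(0).total_memory / gib:.1f} GiB")
```

If the reserved number stays well below the card's capacity, the unused headroom is on the GPU itself rather than on the SSD or in system RAM.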