Hi there, you won't be able to use the 70B model in Colab because of memory and disk size constraints, unless you're using an enterprise version of Colab with significant VRAM.
Could you share what machine type you're using on Colab?
Hello @rishsriv,
I'm using a personal workstation.
The workstation runs Ubuntu with two 24 GB NVIDIA GPUs (48 GB of VRAM total) and 192 GB of physical memory.
Thanks.
Hi there, you will have to add a load_in_4bit=True parameter to load this in 48 GB of VRAM:
from transformers import AutoModelForCausalLM

# Pass the model id rather than the full Hugging Face URL, and note that
# load_in_4bit must be the Python boolean True.
model = AutoModelForCausalLM.from_pretrained(
    "defog/sqlcoder-70b-alpha",
    device_map="auto",
    load_in_4bit=True,
)
With that, you should be able to run this in the VRAM that you have! It will still be slower than running it in fp16, but much faster than running on the CPU (which, I suspect, is what was happening earlier).
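(Note: on newer versions of transformers, the bare load_in_4bit argument is deprecated in favor of a BitsAndBytesConfig object. A minimal equivalent sketch, assuming a recent transformers install with the bitsandbytes package available:

from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit quantization config; requires the bitsandbytes package.
quant_config = BitsAndBytesConfig(load_in_4bit=True)

model = AutoModelForCausalLM.from_pretrained(
    "defog/sqlcoder-70b-alpha",
    device_map="auto",
    quantization_config=quant_config,
)

Either form should produce the same 4-bit loading behavior; which one works depends on the installed transformers version.)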
Hello,
I just downloaded the defog/sqlcoder-70b-alpha model and gave it the same prompt from your Colab example, but I'm not sure why no result has appeared after 10 minutes.
Thanks.
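(One way to check whether the model is generating at all, rather than hanging or swapping, is to stream tokens to stdout as they are produced. A minimal sketch using the standard transformers TextStreamer; the prompt string below is a placeholder for the actual sqlcoder prompt template:

from transformers import AutoTokenizer, TextStreamer

tokenizer = AutoTokenizer.from_pretrained("defog/sqlcoder-70b-alpha")
streamer = TextStreamer(tokenizer, skip_prompt=True)

# Placeholder prompt; substitute the full sqlcoder prompt template here.
inputs = tokenizer("-- placeholder prompt", return_tensors="pt").to(model.device)

# Tokens print as they are generated, so a stalled run is visible immediately.
outputs = model.generate(**inputs, max_new_tokens=256, streamer=streamer)

If no tokens appear at all after a minute or two, the model is likely not fitting in VRAM and is being offloaded to CPU or disk.)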