Hello,
From the very beginning of chapter 6, trying to run the Jupyter notebook locally on my GPU with 8 GB of VRAM:
# Load model and tokenizer
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    device_map="cuda",
    torch_dtype="auto",
    trust_remote_code=True,
)
results in: OutOfMemoryError: CUDA out of memory. Tried to allocate...
Any workaround would be very welcome, for instance a smaller model that gives roughly similar results?
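In case it helps frame an answer: would 4-bit quantization be a reasonable workaround here? Phi-3-mini is about 3.8B parameters, so the fp16 weights alone are roughly 7.6 GB, which seems to explain the OOM on an 8 GB card. A minimal sketch of what I have in mind, assuming the bitsandbytes package is installed (the config values are my guesses, not from the book):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Quantize weights to 4-bit so the model fits in 8 GB of VRAM,
# leaving headroom for activations and the KV cache
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    device_map="auto",  # let accelerate place layers, offloading to CPU if needed
    quantization_config=bnb_config,
    trust_remote_code=True,
)

Would this be expected to stay close to the book's outputs, or is a different (smaller) model the better route?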
Thanks.