jlonge4 / local_llama

This repo showcases how to run a model locally and offline, free of OpenAI dependencies.
Apache License 2.0

How can I run local-llama with Multi-GPU #6

Open Calmepro777 opened 1 year ago

Calmepro777 commented 1 year ago

Hello,

This is wonderful work!

I would appreciate any guidance on how to run local_llama with multiple GPUs.

Thank you.
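
In case it helps frame the question, this is roughly what I had in mind for splitting a model across two GPUs, assuming the model is loaded through llama-cpp-python; I may be wrong about how this repo actually loads it, and the model path below is just a placeholder:

```python
# Sketch: splitting a GGUF/GGML model across two GPUs with llama-cpp-python.
# Assumes a CUDA-enabled build of llama-cpp-python; not necessarily how this
# repo loads its models.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-2-13b.Q4_K_M.gguf",  # hypothetical model file
    n_gpu_layers=-1,          # offload every layer to the GPUs
    tensor_split=[0.5, 0.5],  # fraction of the weights placed on GPU 0 and GPU 1
    main_gpu=0,               # GPU that holds the scratch/temporary buffers
    n_ctx=2048,
)

print(llm("Hello", max_tokens=8)["choices"][0]["text"])
```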

Calmepro777 commented 1 year ago

I noticed that my GPU utilization and VRAM usage are very low, about 2% and ~2 GiB respectively. Any hints on resolving this? Is there a specific hyper-parameter I should set to enable the GPU?

Thank you.
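
For reference, this is the kind of setting I was wondering about. The sketch below assumes the model is served through llama-cpp-python with a CUDA/cuBLAS build, which may not match this repo exactly; if no layers are offloaded (the default is 0) or the package was installed as a CPU-only wheel, utilization staying near 2% would be expected:

```python
# Sketch: forcing GPU offload and verifying it at load time.
# Assumes llama-cpp-python compiled with CUDA/cuBLAS support; a CPU-only build
# ignores GPU offload, which would match the ~2% utilization reported above.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-2-7b.Q4_K_M.gguf",  # hypothetical model file
    n_gpu_layers=-1,  # -1 offloads all layers in recent versions; older builds expect a large count, e.g. 35
    verbose=True,     # llama.cpp logs how many layers were offloaded when the model loads
)

print(llm("Q: What is 2 + 2? A:", max_tokens=8)["choices"][0]["text"])
```

If llama-cpp-python is indeed in use, reinstalling it with its CUDA build flags enabled (something like `CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install --force-reinstall --no-cache-dir llama-cpp-python`, depending on the version) is usually the prerequisite for any offload to take effect.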