simonw / llm-gpt4all

Plugin for LLM adding support for the GPT4All collection of models
Apache License 2.0

Allow setting the device for the model (e.g. gpu) #32

Open scotscotmcc opened 7 months ago

scotscotmcc commented 7 months ago

In gpt4all, when instantiating your model, you can pass a device= parameter, as laid out in the Python bindings README: https://github.com/nomic-ai/gpt4all/tree/6f38fde80b2a604fa4678779547921e9be48b092/gpt4all-bindings/python. Specifically, it shows:

from gpt4all import GPT4All
model = GPT4All("orca-mini-3b-gguf2-q4_0.gguf", device='gpu') # device='amd', device='intel'
output = model.generate("The capital of France is ", max_tokens=3)
print(output)

I don't see anything in llm-gpt4all that passes this along. Being able to set the device would be helpful.

As a short test case, I directly edited llm_gpt4all.py to force this in (line 166, just adding device='gpu'), and it seemed to work: the same prompt I had been running finished in roughly a third of the time, and my GPU usage climbed to 97%. I am running on a Linux system with an Nvidia RTX 3070, if that matters.
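For reference, here is a minimal sketch of the kind of one-line change I mean. The surrounding names are illustrative (model_filename stands in for however the plugin resolves the model file), since the actual call site in llm_gpt4all.py may look different:

from gpt4all import GPT4All

# Around line 166 of llm_gpt4all.py, where the plugin instantiates the
# model; device='gpu' is the hard-coded addition I tested with.
gpt_model = GPT4All(model_filename, device='gpu')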

I'm not sure what the best way to pass this parameter in would be. At first I thought of adding it to the options on the CLI (something like llm --options device gpu blahblahblah), but it looks like options are focused on the things that can be tweaked in calls to gpt4all's generate method, not on instantiating the object in the first place. I'm not sure there is anywhere better, though. I haven't looked at how llm passes things to llm_gpt4all either, so options may already be the right way to get it from there to here.
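For what it's worth, here is a hedged sketch of how the option could in principle be threaded through to the constructor anyway. This assumes the plugin's model class follows llm's usual plugin pattern of an inner Options class (pydantic fields) plus an execute() method; the class name Gpt4AllModel and the model_filename variable below are illustrative, not the actual names in llm_gpt4all.py:

from typing import Optional

import llm
from gpt4all import GPT4All
from pydantic import Field


class Gpt4AllModel(llm.Model):  # illustrative name
    class Options(llm.Options):
        # Would surface on the CLI as: llm -o device gpu "prompt..."
        device: Optional[str] = Field(
            description="Device to load the model on, e.g. 'cpu', 'gpu', 'amd', 'intel'",
            default=None,
        )

    def execute(self, prompt, stream, response, conversation):
        # Only pass device= through when the user set it, so the
        # default CPU behaviour is unchanged.
        kwargs = {}
        if prompt.options.device:
            kwargs["device"] = prompt.options.device
        gpt_model = GPT4All(model_filename, **kwargs)
        ...

The wrinkle is exactly the one above: options normally map onto generate-time parameters, so a construction-time option like this might also mean caching the GPT4All instance rather than rebuilding it on every prompt.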